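\d is less efficient than [0-9]

Yesterday I commented on an answer in which someone had used [0123456789] in a regular expression instead of [0-9] or \d. I said that a range or the digit specifier was probably more efficient than a character set.

Today I decided to test this and found, to my surprise, that (at least in the C# regex engine) \d appears to be less efficient than either of the other two, which do not seem to differ much from each other. Here is my test output over 10000 random strings of 1000 random characters each, 5077 of which actually contain a digit:

Regular expression \d           took 00:00:00.2141226 result: 5077/10000
Regular expression [0-9]        took 00:00:00.1357972 result: 5077/10000  63.42 % of first
Regular expression [0123456789] took 00:00:00.1388997 result: 5077/10000  64.87 % of first

This surprised me for two reasons:

1. I would have expected the range to be implemented much more efficiently than the set.
2. I don't understand why \d is worse than [0-9]. Is there more to \d than simply being shorthand for [0-9]?

Here is the test code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using System.Text.RegularExpressions;

namespace SO_RegexPerformance
{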
    class Program
    {
        static void Main(string[] args)
        {
            var rand = new Random(1234);
            var strings = new List<string>();

            //10K random strings
            for (var i = 0; i < 10000; i++)
            {
                //Generate random string
                var sb = new StringBuilder();
                for (var c = 0; c < 1000; c++)
                {
                    //Add a-z randomly
                    sb.Append((char)('a' + rand.Next(26)));
                }

                //In roughly 50% of them, put a digit
                if (rand.Next(2) == 0)
                {
                    //Replace one character with a digit, 0-9
                    sb[rand.Next(sb.Length)] = (char)('0' + rand.Next(10));
                }

                strings.Add(sb.ToString());
            }

            var baseTime = testPerfomance(strings, @"\d");
            Console.WriteLine();
            var testTime = testPerfomance(strings, "[0-9]");
            Console.WriteLine(" {0:P2} of first", testTime.TotalMilliseconds / baseTime.TotalMilliseconds);
            testTime = testPerfomance(strings, "[0123456789]");
            Console.WriteLine(" {0:P2} of first", testTime.TotalMilliseconds / baseTime.TotalMilliseconds);
        }
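Main calls testPerfomance, but the helper's definition does not appear in the listing here. The following is only a sketch of what such a helper might look like, assuming it builds the regex once, times a pass over all the strings with Stopwatch, and prints a line in the same shape as the output shown in this thread. The class name RegexTimingSketch and the method body are assumptions, not the poster's code.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class RegexTimingSketch
{
    // Assumed reconstruction of the helper that Main calls; the real
    // definition is not part of the listing on this page.
    public static TimeSpan testPerfomance(List<string> strings, string regex)
    {
        var rex = new Regex(regex);      // build the regex once, outside the timed loop
        var successes = 0;

        var sw = Stopwatch.StartNew();   // time only the matching pass
        foreach (var str in strings)
        {
            if (rex.IsMatch(str))
            {
                successes++;
            }
        }
        sw.Stop();

        // Console.Write (no newline) so the caller can append "... of first".
        Console.Write("Regex {0,-12} took {1} result: {2}/{3}",
                      regex, sw.Elapsed, successes, strings.Count);
        return sw.Elapsed;
    }
}
```

Constructing the Regex outside the timed region keeps pattern parsing out of the measurement, so the comparison reflects matching cost only. A single Stopwatch pass like this is still sensitive to JIT warm-up and run order, so the percentages are best read as rough indications.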
3 Answers

HUWWW
var rex = new Regex(regex, RegexOptions.ECMAScript);

Regex \d           took 00:00:00.1355787 result: 5077/10000
Regex [0-9]        took 00:00:00.1360403 result: 5077/10000 100.34 % of first
Regex [0123456789] took 00:00:00.1362112 result: 5077/10000 100.47 % of first
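One way to read these numbers, which the answer text here does not spell out: in .NET, \d by default matches any Unicode decimal digit (category Nd), not just 0-9, and RegexOptions.ECMAScript restricts it to [0-9]. That would fit the gap disappearing once the option is set, though the timing explanation itself is an inference. The small sketch below only demonstrates the matching difference; the class and method names (DigitClassSketch, Probe, Demo) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class DigitClassSketch
{
    // Returns { default \d, [0-9], \d under ECMAScript } for the given input.
    public static bool[] Probe(string input)
    {
        return new[]
        {
            Regex.IsMatch(input, @"\d"),
            Regex.IsMatch(input, "[0-9]"),
            Regex.IsMatch(input, @"\d", RegexOptions.ECMAScript),
        };
    }

    public static void Demo()
    {
        // U+0665 ARABIC-INDIC DIGIT FIVE: a decimal digit outside ASCII.
        Console.WriteLine(string.Join(", ", Probe("\u0665")));  // True, False, False
        Console.WriteLine(string.Join(", ", Probe("5")));       // True, True, True
    }
}
```

So \d and [0-9] are not interchangeable in .NET: if only ASCII digits should match, [0-9] or the ECMAScript option states that intent explicitly.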