
Maximum size/length of regular expression in "modern" web browsers?

What's the maximum size of a regular expression in modern browsers (i.e. Firefox 3+, Safari 4+, IE 7+)? Assume a simple regular expression such as "foo|bar|baz|woot|...".

You can use this code to test in IE8, Firefox with Firebug, or Chrome.

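var regex = "";
var maximum = 100;   // loop bound; each step adds 10 chars
var showAfter = 95;  // only log the last few iterations
for (var i = 1; i < maximum; i++) {
    regex += "aaaaaaaaaa";
    if (i > showAfter) {
        console.log(10 * i + " chars");
        console.log(RegExp(regex)); // throws if the pattern is too long
    }
}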

When you get an error, you have found the limit. Raise maximum and showAfter to probe longer patterns.
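Stepping toward the limit by hand is slow, so here is a minimal sketch of a bisection search over pattern length. The names compilesAtLength and findMaxRegexLength are hypothetical, not part of any browser API. The sketch assumes that failures are monotonic (once a length fails, every longer length fails too) and that the starting lower bound compiles.

```javascript
// Hypothetical helper: true if a pattern of `length` literal "a" chars compiles.
function compilesAtLength(length) {
    try {
        new RegExp(new Array(length + 1).join("a"));
        return true;
    } catch (e) {
        return false; // the engine rejected the pattern (or ran out of space)
    }
}

// Binary search for the largest length in [low, high] that compiles.
// Assumes compiles(low) is true and that failures are monotonic.
// `compiles` is optional and defaults to compilesAtLength.
function findMaxRegexLength(low, high, compiles) {
    compiles = compiles || compilesAtLength;
    while (low < high) {
        // Take the upper middle so the loop always makes progress.
        var mid = low + Math.ceil((high - low) / 2);
        if (compiles(mid)) {
            low = mid;       // mid works, so the limit is at least mid
        } else {
            high = mid - 1;  // mid fails, so the limit is below mid
        }
    }
    return low;
}
```

In a browser console you might call findMaxRegexLength(1, 10000000). If the result equals the upper bound, the limit was not reached inside the range and the bound needs to be raised.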


SIMPLE TEST

var regex = "";
var chars = 3204161; // pattern length to test
for (var i = 0; i < chars; i++) {
    regex += "a";
}
alert(chars + " chars");
var a = RegExp(regex); // don't send to console, to be faster

RESULTS

In Firefox 3.6.3 (Ubuntu, 32-bit) I get an error when I try a regex with 3,204,161 chars. With 3,204,160 it's OK.

In Chrome 5.0.3 the limit is somewhere between 20M and 25M chars.

The error in Firefox is:

script stack space quota is exhausted

Note: If you run some tests of your own, please comment here.

Certain regular expressions require exponential amounts of memory to evaluate. Firefox does this on the stack, which is limited to 10 MB on many Linux distributions and is even smaller on Windows (at least for some versions of Firefox). You could therefore hit the limit fairly quickly if your regular expression needs exponential memory when it is converted to DFA form for evaluation.

If your regular expression is as simple as that, why not just use a loop that does string comparisons:

var input = "woot";

var tests = ["foo", "bar", "baz", "woot"];
for (var i = 0; i < tests.length; i++) {
   if (tests[i] === input) {
      alert("match found: #" + i);
      break;
   }
}

Then you don't have to worry about browser limitations, and it will likely perform much better as well, since the regular expression version has to parse and compile the regex, there would be plenty of backtracking, and so on.
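For a long word list, the same whole-string comparison can be sketched with a hash lookup instead of a linear scan. This is only an illustration: testSet and isKnownWord are hypothetical names, and it assumes an engine that supports Set (ES2015), which the browsers named in the question do not. Like the loop, it matches the entire input, not a substring the way an unanchored "foo|bar" regex would.

```javascript
var tests = ["foo", "bar", "baz", "woot"];

// Build the lookup once; each check is then a hash lookup
// rather than a scan over the whole list.
var testSet = new Set(tests);

// Hypothetical helper: true only if `input` equals one of the words exactly.
function isKnownWord(input) {
    return testSet.has(input);
}
```

On engines without Set, a plain object used as a map (keys are the words) would serve the same purpose.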
