
Why does /^(.+)+Q$/.test("XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX") take so long?

When I run

/^(.+)+Q$/.test("XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX")

in Chrome or IE, it takes ~10 seconds to complete. (Firefox is able to evaluate it almost instantly.)

Why does it take so long? (And why/how is Firefox able to do it so quickly?)

(Of course, I'd never run this particular regex, but I'm hitting a similar issue with the URL regex at http://daringfireball.net/2010/07/improved_regex_for_matching_urls, and it seems to boil down to the same thing: there are certain URLs that will cause the browser to lock up.)

For example:

var re = /\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/i;
re.test("http://google.com/?q=(AAAAAAAAAAAAAAAAAAAAAAAAAAAAA")

As thg435 indicated, this is catastrophic backtracking: (.+)+ can split a string of n characters into runs in 2^(n-1) different ways, and since the all-X input can never end in Q, the engine tries every one of those splits before reporting failure. There's an excellent article on this, Russ Cox's Regular Expression Matching Can Be Simple And Fast.
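You can see the exponential behaviour directly with a minimal sketch like the following (absolute timings will vary by engine, but in a backtracking engine each extra character roughly doubles the time):

var pattern = /^(.+)+Q$/;
for (var n = 20; n <= 28; n++) {
    var input = new Array(n + 1).join("X");  // a string of n "X" characters
    var start = Date.now();
    pattern.test(input);
    console.log(n + " chars: " + (Date.now() - start) + " ms");
}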

It describes an efficient approach known as the Thompson NFA. As noted there, though, this approach does not support all the features of modern regexes; in particular, it cannot handle backreferences (see the example after the quote below). However, as the article suggests:

"Even so, it would be reasonable to use Thompson's NFA simulation for most regular expressions, and only bring out backtracking when it is needed. A particularly clever implementation could combine the two, resorting to backtracking only to accommodate the backreferences."

I suspect Firefox may be using such a hybrid approach.
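As for working around it: the nested quantifier in the original pattern is redundant, so one minimal fix is simply to flatten it. ^(.+)+Q$ and ^.+Q$ accept exactly the same strings, but the flat version fails in linear time:

/^.+Q$/.test("XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX")  // false, and returns instantly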
