End-of-string regex match too slow

Question

Demo here . The regex:

([^>]+)$

I want to match text at the end of a HTML snippet that is not contained in a tag (ie, a trailing text node). The regex above seems like the simplest match, but the execution time seems to scale linearly with the length of the match-text (and has causes hangs in the wild when used in my browser extension). It's also equally slow for matching and non-matching text.

Why is this seemingly simple regex so bad?

(I also tried RegexBuddy but can't seem to get an explanation from it.)

Edit: Here's a snippet for testing the various regexes (click "Run" in the console area).
Edit 2: And a no-match test .

Answer 1

Consider an input like this

abc<def>xyz

With your original expression, ([^>]+)$ , the engine starts from a , fails on > , backtracks, restarts from b , then from c etc. So yes, the time grows with size of the input. If, however, you force the engine to consume everything up to the rightmost > first, as in:

.+>([^>]+)$

the backtracking will be limited by the length of the last segment, no matter how much input is before it.

The second expression is not equivalent to the first one, but since you're using grouping, it doesn't matter much, just pick matches[1] .

Hint: even when you target javascript, switch to the pcre mode, which gives you access to the step info and debugger:

(look at the green bars!)

Answer 2

You could use the actual DOM instead of Regex, which is time consuming:

 var html = "<div><span>blabla</span></div><div>bla</div>Here I am !"; var temp = document.createElement('div'); temp.innerHTML = html; var lastNode = temp.lastChild || false; if(lastNode.nodeType == 3){ alert(lastNode.nodeValue); }

End-of-string regex match too slow

Question

2 answers

solution1
3 ACCPTED 2015-08-11 22:06:13

solution2
1 2015-08-11 21:45:12

End-of-string regex match too slow

Question

2 answers

solution1 3 ACCPTED 2015-08-11 22:06:13

solution2 1 2015-08-11 21:45:12

solution1
3 ACCPTED 2015-08-11 22:06:13

solution2
1 2015-08-11 21:45:12