Why does this regex take so long to execute?

I created a regex that's supposed to move text inside of an adjoining <span> tag.

const fix = (string) => string.replace(/([\S]+)*<span([^<]+)*>(.*?)<\/span>([\S]+)*/g, "<span$2>$1$3$4</span>")

fix('<p>Given <span class="label">Butter</span>&#39;s game, the tree counts as more than one input.</p>')
// Results in:
'<p>Given <span class="label">Butter&#39;s</span> game, the tree counts as more than one input.</p>'

But if I pass it a string where there is no text touching a <span> tag, it takes a few seconds to run.

I'm testing this on Chrome and Electron.

([\S]+)* and ([^<]+)* are the culprits: nesting one quantifier inside another causes catastrophic backtracking whenever the match fails, for example when there is no </span>. You need to modify your regex to

([\S]*)<span([^<]*)>(.*?)<\/span>([\S]*)

It will work, but it's still not efficient.
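To see the difference, here is a rough timing sketch that runs both patterns against inputs containing no <span> at all, so every match attempt fails. The helper name timeMs and the test inputs are made up for illustration; exact timings depend on the engine and machine, but the nested pattern should slow down far faster as the run of non-whitespace characters grows.

```javascript
// Original pattern from the question: a quantified group inside another quantifier.
const nested = /([\S]+)*<span([^<]+)*>(.*?)<\/span>([\S]+)*/;
// Rewritten pattern from the answer: one quantifier per group.
const flat = /([\S]*)<span([^<]*)>(.*?)<\/span>([\S]*)/;

// Hypothetical helper: run one match attempt and report how long it took.
const timeMs = (regex, input) => {
  const start = Date.now();
  const matched = regex.test(input);
  return { matched, ms: Date.now() - start };
};

// Inputs are a run of non-whitespace characters with no <span> anywhere,
// so both patterns must exhaust their options before giving up.
for (const n of [12, 16, 20]) {
  const input = 'x'.repeat(n) + ' end';
  const slow = timeMs(nested, input);
  const fast = timeMs(flat, input);
  console.log(`n=${n} nested: ${slow.ms} ms, flat: ${fast.ms} ms`);
}
```

Neither regex uses the g flag here, so test() carries no lastIndex state between calls.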

Why use a character class for \S? The above reduces to

(\S*)<span([^<]*)>(.*?)<\/span>(\S*)
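As a quick check, the reduced pattern can be dropped into the fix function from the question. This is only a sketch that reuses the question's sample input and replacement string; the second input is an invented example with no <span> in it.

```javascript
// Same replacement as in the question, with the nested quantifiers removed.
const fix = (string) =>
  string.replace(/(\S*)<span([^<]*)>(.*?)<\/span>(\S*)/g, '<span$2>$1$3$4</span>');

const withSpan =
  '<p>Given <span class="label">Butter</span>&#39;s game, the tree counts as more than one input.</p>';
// Invented input: no <span> at all, so nothing should be replaced.
const withoutSpan = '<p>No span tags in this paragraph at all.</p>';

console.log(fix(withSpan));
console.log(fix(withoutSpan) === withoutSpan);
```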

If you are concerned only about the content of the span, use this instead

<span([^<]*)>(.*?)<\/span>
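For instance, the span-only pattern could be used with matchAll to pull out each span's attributes and content. The names spanPattern and spans are illustrative, and the sketch assumes simple, non-nested spans like the one in the question.

```javascript
// Group 1 captures the attribute text, group 2 the content of the span.
const spanPattern = /<span([^<]*)>(.*?)<\/span>/g;

const html =
  '<p>Given <span class="label">Butter</span>&#39;s game, the tree counts as more than one input.</p>';

// matchAll requires the g flag and yields one match array per span.
const spans = [...html.matchAll(spanPattern)].map(([, attributes, content]) => ({
  attributes: attributes.trim(),
  content,
}));

console.log(spans);
```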

Check here (see the reduction in the number of steps).

NOTE: Finally, don't parse HTML with regex if there are tools that can do it much more easily.
