Regex match not working when string is too long?
I have the following string:
var example = '{%start%}$MOXDATA${"name":"group one","sections":[{"name":"section one","fields":[{"name":"plain one","type":"plain","value":"// some \"plain\" \'\"\"\"\'\' \'\' \' \' \" \" \" tesP(&^%I&63657riu43r3+_)(I)p;l>:\"></}}{|\":1~~``"},{"name":"rich one","type":"rich","value":"<ul>\n<li><span style=\"font-size: 11px;\">{ Lo<span style=\"font-family: \'comic sans ms\', sans-serif;\">rem</span> ipsu<span style=\"color: #ffff00; background-color: #339966;\">m dolor si</span>t amet, consec<strong>tetur adi</strong>piscing elit. Vestibulum ac dolor pulvinar ipsum luctus ullamcorper.</span></li>\n<li></li>\n<li><a href=\"http://retrgfd.com/resrgf\">erwfd\"etrgfdd\'\'refre\"\'\"refrds\'\"\"\"sdgfd</a></li>\n</ul>"},{"name":"repeater one","type":"repeater","value":[[{"name":"plain one","type":"plain","value":"some test value"},{"name":"rich one","type":"rich","value":"some test value"},{"name":"link one","type":"link","value":"some test value"},{"name":"media one","type":"media","value":"some test value"},{"name":"link two","type":"link","value":"some test value"}]]}]},{"name":"section two","fields":[{"name":"link one","type":"link","value":"<a href=\"http://www.yyyy.com\">take me to your leader</a>"}]}]}$MOXDATA${%end%}';
And I'm doing:
example.match(/{%start%}\$MOXDATA\$(.+)\$MOXDATA\${%end%}/);
which returns null.
However, if I use a significantly shorter version of the above string, as in:
var shorter = '{%start%}$MOXDATA${"name""}]}]}$MOXDATA${%end%}';
shorter.match(/{%start%}\$MOXDATA\$(.+)\$MOXDATA\${%end%}/);
then {"name""}]}]} is correctly matched.
Why is that? What am I doing wrong?
Anony-Mousse's answer is good, and so is stribizhev's comment.
However, when you have to deal with a long string, you should use something that causes less backtracking. ([^]* or [\s\S]* will match every character, newlines included, up to the end of the string, and the regex engine must then step back character by character until it finds $MOXDATA${%end%}. That's a lot of work.)
To avoid this work, you can replace [^]* or [\s\S]* with:
[^$]*(?:\$+(?!MOXDATA\${%end%})[^$]*)*
or, to be more robust (in case $MOXDATA${%end%} doesn't exist in the string):
(?=([^$]*))\1(?=((?:\$+(?!MOXDATA\${%end%})[^$]*)*))\2
((?=(subpattern))\1 emulates an atomic group.)
In this way the subpattern MOXDATA\${%end%} is only tested at each $.
By default, .* will not match newlines. Try [^]* to match really any character.
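A minimal sketch of the difference, assuming a shortened, hypothetical version of the string from the question (the variable names are illustrative). The \n escape inside the quoted string becomes a real newline, which is what stops the dot:

```javascript
// Hypothetical, shortened stand-in for the payload in the question.
// The "\n" escape inside the quotes becomes a real newline character.
var sample = '{%start%}$MOXDATA${"value":"<ul>\n<li></li>"}$MOXDATA${%end%}';

// "." does not match the newline, so the closing delimiter is never reached.
var withDot = sample.match(/{%start%}\$MOXDATA\$(.+)\$MOXDATA\${%end%}/);

// [^] (JavaScript-specific) and [\s\S] match any character, newlines included.
var withAny = sample.match(/{%start%}\$MOXDATA\$([^]+)\$MOXDATA\${%end%}/);
var withClass = sample.match(/{%start%}\$MOXDATA\$([\s\S]+)\$MOXDATA\${%end%}/);

console.log(withDot);                     // null
console.log(withAny[1] === withClass[1]); // true
```

This also explains why the shorter string in the question matched: it contains no newline, so .+ can run all the way to the closing delimiter.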