How can I make this regular expression not result in “catastrophic backtracking”?
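When I try to run the code below in JavaScript, the browser hangs because of catastrophic backtracking: a poorly designed regular expression can take effectively forever to give up on a match. I need an alternative expression, or some way to prevent this problem: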
var temp = "Testing robustness {parent-area-identifier Some text in between the tokens {parent-area-label}";
// The first "{" is never closed, so this pattern backtracks catastrophically.
var strRegExp = /[{](?:[^{}]+|[{][^{}]*[}])*[}]/g;
var arrMatch = temp.match(strRegExp);
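Your regex looks like it is meant to match balanced braces that may contain further balanced pairs nested inside them, but only one level deep. This regex does not hang on malformed input: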
{[^{}]*(?:{[^{}]*}[^{}]*)*}
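This is an example of Jeffrey Friedl's "unrolled loop" technique. When the first [^{}]* runs out of non-brace characters, the next part tries to match a simple, non-nested brace pair and then goes back to consuming non-braces. That part is wrapped in a loop so that several nested brace pairs are allowed (all at the same level).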
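To make that structure concrete, here is a minimal sketch that assembles the same pattern from its parts. The names `normal`, `special`, and `unrolled` are labels chosen for illustration (the technique is commonly summarized as "opening normal* (special normal*)* closing"); they do not come from the original post.

```javascript
// Building blocks of the unrolled loop (names are illustrative only).
const normal = "[^{}]";               // any single non-brace character
const special = "{" + normal + "*}";  // one simple, non-nested brace pair

// opening  normal*  ( special  normal* )*  closing
const unrolled = new RegExp(
  "{" + normal + "*(?:" + special + normal + "*)*}",
  "g"
);

const sample = "a {b {c} d {e} f} g {h}";
const found = sample.match(unrolled);

console.log(unrolled.source);
console.log(found);
```

Building the pattern from strings is only for readability here; in real code the regex literal is shorter and avoids string-escaping pitfalls.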
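This may look even more vulnerable to catastrophic backtracking (nested quantifiers, everything optional), but it works because at each position there is only one way for the match to continue. The engine has no pile of alternative paths to retry, even when no match is possible.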
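A quick way to check this claim is to run the pattern against unbalanced input. The sketch below uses the string from the question plus a longer made-up input (one opening brace followed by 50,000 letters, which is my own assumption for a stress case). Both calls should return promptly rather than hang.

```javascript
const safe = /{[^{}]*(?:{[^{}]*}[^{}]*)*}/g;

// The input from the question: the first brace is never closed.
const fromQuestion = "Testing robustness {parent-area-identifier Some text in between the tokens {parent-area-label}";
const firstResult = fromQuestion.match(safe);

// Hypothetical stress case: one unclosed brace followed by a long run of text.
const longUnclosed = "{" + "a".repeat(50000);
const secondResult = longUnclosed.match(safe);

console.log(firstResult);   // only the balanced token is matched
console.log(secondResult);  // null, since nothing in the input is balanced
```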
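By the way, you don't need to escape braces as long as they can't be mistaken for part of a quantifier. (In some flavors you do need to escape the opening brace, but not in JavaScript.)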
Also, if you want to match braces nested to an unknown depth, you're out of luck. Some regex flavors can manage it, but JavaScript's is too limited.
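If arbitrary nesting depth is needed in JavaScript, one common workaround is to drop the regex and scan the string with a depth counter. The following is a minimal sketch of that idea and is not part of the original answer. `findBalanced` is a hypothetical helper name, and the code assumes you want only the outermost groups and are happy to ignore stray closing braces.

```javascript
// Hypothetical helper: returns the outermost {...} groups, at any nesting depth.
function findBalanced(text) {
  const groups = [];
  let depth = 0;
  let start = -1;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === "{") {
      if (depth === 0) start = i;  // remember where the outermost group begins
      depth++;
    } else if (ch === "}" && depth > 0) {
      depth--;
      if (depth === 0) groups.push(text.slice(start, i + 1));
    }
    // A "}" seen at depth 0 is a stray closer and is skipped.
  }
  return groups;  // an unclosed "{" simply produces no group
}

console.log(findBalanced("x {a {b {c}} d} y {e}"));
```

Because it makes a single pass, the scan takes time linear in the input length. One difference from the regex is worth knowing: for an input such as "{a {b}", this sketch returns nothing, because the outer brace never closes, whereas the regex would still pick up the inner "{b}".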
If you want to pick out the brace-delimited regions, try this approach:
var temp = "{=rankedArea?metricType=3902&area={parent-area-identifier}:AdministrativeWard} {=rankedArea?metricType=3902&area={parent-area-identifier}:{ward-type-identifier}} {district-short-label} adfasdfasdfasdf asdf asdf asdf asdf {child-area-short-label} asdf asdf asdf {authority-area-short-label} asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=compare?metricType=3343&greater=greater than&equal=equal to&less=less than} asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=countAreas?area={ancestor-2-identifier}:{ancestor-1-type-identifier}} asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=equivalent?metricDimension=[218][218_Number][Specificethnicity][Ethnicity_AsianorAsianBritish]} asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf {=metricTypeMetadata?metricType=3341&returnValue=source} asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=value?metricType=3284} asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=percent?metricType=518} asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=rank?metricType=3287} asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=rankedArea?metricType=3286} asdfasdfasdfasdf asdf";
var strRegExp = new RegExp(/{(?:[^{}]+|{[^{}]*})*}/g);
var arrMatch = temp.match(strRegExp);
console.log(arrMatch.length);
console.log(arrMatch);
Result:
13
["{=rankedArea?metricType=3902&area={parent-area-identifier}:AdministrativeWard}",
"{=rankedArea?metricType=3902&area={parent-area-identifier}:{ward-type-identifier}}",
"{district-short-label}",
"{child-area-short-label}",
"{authority-area-short-label}",
"{=compare?metricType=3343&greater=greater than&equal=equal to&less=less than}",
"{=countAreas?area={ancestor-2-identifier}:{ancestor-1-type-identifier}}",
"{=equivalent?metricDimension=[218][218_Number][Specificethnicity][Ethnicity_AsianorAsianBritish]}",
"{=metricTypeMetadata?metricType=3341&returnValue=source}", "{=value?metricType=3284}",
"{=percent?metricType=518}",
"{=rank?metricType=3287}",
"{=rankedArea?metricType=3286}"]
It runs quickly. If this approach gives incorrect results, please provide more test cases.
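In the spirit of adding test cases, here is a small sketch that compares the alternation-based pattern used in this answer with the unrolled-loop pattern `{[^{}]*(?:{[^{}]*}[^{}]*)*}` on a few balanced inputs. The sample strings are shortened excerpts of the input above plus one plain string of my own, and the two patterns are expected to agree on all of them. Unbalanced inputs are deliberately left out, since the alternation pattern has the same structure as the one in the question and may backtrack heavily on them.

```javascript
const alternation = /{(?:[^{}]+|{[^{}]*})*}/g;
const unrolledLoop = /{[^{}]*(?:{[^{}]*}[^{}]*)*}/g;

// Balanced inputs only; unbalanced ones may make the alternation pattern hang.
const cases = [
  "{district-short-label} asdf {child-area-short-label}",
  "{=rankedArea?metricType=3902&area={parent-area-identifier}:{ward-type-identifier}} asdf",
  "{=countAreas?area={ancestor-2-identifier}:{ancestor-1-type-identifier}}",
  "no tokens here"
];

const report = cases.map(function (text) {
  const a = text.match(alternation);
  const b = text.match(unrolledLoop);
  return { text: text, matches: b, agree: JSON.stringify(a) === JSON.stringify(b) };
});

console.log(report);
```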