
How can I make the regular expression not result in “catastrophic backtracking”?

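When I try to run the code below in JavaScript, the browser hangs because of catastrophic backtracking: a poorly designed regular expression can take so long to fail that it looks like an infinite loop. I need an alternative expression, or a way to prevent this problem: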

// Malformed input: the first "{" is never closed.
var temp = "Testing robustness {parent-area-identifier Some text in between the tokens {parent-area-label}";
var strRegExp = /[{](?:[^{}]+|[{][^{}]*[}])*[}]/g;
var arrMatch = temp.match(strRegExp);
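To see why the pattern above hangs, here is a minimal sketch that times it on short inputs where the opening brace is never closed. The helper `timeFailure` and the chosen lengths are assumptions made for illustration. The inputs are kept small on purpose, because the work roughly doubles with every extra character in a backtracking engine.

```javascript
// Same pattern as in the question, without the global flag.
const risky = /[{](?:[^{}]+|[{][^{}]*[}])*[}]/;

// Hypothetical helper: times a failing match against "{" followed by n letters.
function timeFailure(n) {
  const input = "{" + "a".repeat(n); // opening brace is never closed
  const start = Date.now();
  const matched = risky.test(input); // must try every way to split the letters
  return { n: n, matched: matched, ms: Date.now() - start };
}

// Keep n small: each extra character roughly doubles the number of attempts.
for (const n of [14, 16, 18, 20]) {
  console.log(timeFailure(n));
}
```

The culprit is `[^{}]+` inside a group that is itself repeated with `*`: when the closing brace is missing, the engine retries every way of dividing the same run of characters between iterations.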

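Your regex looks like it is meant to match balanced braces that may have more balanced pairs nested inside them, but only one level deep. This regex does not hang on malformed input: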

{[^{}]*(?:{[^{}]*}[^{}]*)*}

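This is an example of Jeffrey Friedl's "unrolled loop" technique. Once the first [^{}]* runs out of non-brace characters, the next part tries to match a simple, non-nested brace pair and then goes back to consuming non-braces. That part is looped to allow multiple nested brace pairs (all at the same level).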

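This may look even more susceptible to catastrophic backtracking (nested quantifiers, everything optional), but it is safe: the pieces cannot overlap, so there is only one way to divide up any given stretch of text, and the engine gives up quickly even when no match is possible.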
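As a quick check, here is a small sketch that runs the unrolled pattern on the malformed string from the question. Only the variable names `unrolled` and `matches` are new; the `g` flag is added so that `match` returns every hit.

```javascript
// Input from the question: the first "{" is never closed.
const temp = "Testing robustness {parent-area-identifier Some text in between the tokens {parent-area-label}";

// Unrolled-loop pattern from the answer, with the global flag to collect all matches.
const unrolled = /{[^{}]*(?:{[^{}]*}[^{}]*)*}/g;

// The attempt at the unclosed brace fails quickly, then the scan moves on.
const matches = temp.match(unrolled);
console.log(matches); // [ '{parent-area-label}' ]

// Well-formed input with two pairs nested one level deep.
console.log("a {b {c} d {e} f} g".match(unrolled)); // [ '{b {c} d {e} f}' ]
```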

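By the way, you don't need to escape the braces as long as they don't look like part of a quantifier. (In some flavors you need to escape the left brace, but not in JavaScript.)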
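One caveat to the point about escaping, sketched below: the relaxed rule applies to JavaScript patterns without the `u` flag. With `u`, a bare `{` or `}` outside a character class is rejected as a syntax error, so the braces have to be escaped there. The variable names are assumptions for illustration.

```javascript
// Without flags, literal braces are accepted as ordinary characters.
const plain = /{[^{}]*}/;
console.log(plain.test("a {b} c")); // true

// With the "u" flag the same source is rejected when the RegExp is built.
let unicodeError = null;
try {
  new RegExp("{[^{}]*}", "u");
} catch (e) {
  unicodeError = e;
}
console.log(unicodeError instanceof SyntaxError); // true

// Escaping the braces outside the character class works in both modes.
const escaped = new RegExp("\\{[^{}]*\\}", "u");
console.log(escaped.test("a {b} c")); // true
```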

Also, if you want to match braces nested to an unknown depth, you're out of luck. Some flavors can manage it, but JavaScript's is too limited.
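If arbitrary depth really is needed, one way around the limitation is to skip the regex and count braces by hand. The function below is a minimal sketch under that assumption; `matchBalancedBraces` is a hypothetical name, not something from the original answer. It returns only top-level groups, and an opening brace that is never closed swallows everything after it rather than being skipped.

```javascript
// Hypothetical helper: collects top-level {...} groups of any nesting depth.
function matchBalancedBraces(text) {
  const groups = [];
  let depth = 0;  // number of currently open braces
  let start = -1; // index of the "{" that opened the current top-level group
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === "{") {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === "}" && depth > 0) {
      depth--;
      if (depth === 0) groups.push(text.slice(start, i + 1));
    }
  }
  return groups;
}

console.log(matchBalancedBraces("x {a{b{c}}} y {d}")); // [ '{a{b{c}}}', '{d}' ]
```

This is a single pass over the string, so there is nothing to backtrack.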

If you want to select the regions enclosed in braces, try this:

var temp = "{=rankedArea?metricType=3902&area={parent-area-identifier}:AdministrativeWard} {=rankedArea?metricType=3902&area={parent-area-identifier}:{ward-type-identifier}} {district-short-label}  adfasdfasdfasdf asdf asdf asdf asdf {child-area-short-label}  asdf asdf asdf  {authority-area-short-label} asdfasdfasdfasdf asdf  asdfasdfasdfasdf asdf{=compare?metricType=3343&greater=greater than&equal=equal to&less=less than}  asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=countAreas?area={ancestor-2-identifier}:{ancestor-1-type-identifier}}  asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=equivalent?metricDimension=[218][218_Number][Specificethnicity][Ethnicity_AsianorAsianBritish]}  asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf {=metricTypeMetadata?metricType=3341&returnValue=source}  asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=value?metricType=3284}  asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=percent?metricType=518}  asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=rank?metricType=3287}  asdfasdfasdfasdf asdf asdfasdfasdfasdf asdf{=rankedArea?metricType=3286}  asdfasdfasdfasdf asdf";
var strRegExp = /{(?:[^{}]+|{[^{}]*})*}/g;
var arrMatch = temp.match(strRegExp);
console.log(arrMatch.length);
console.log(arrMatch);

Result:

13
["{=rankedArea?metricType=3902&area={parent-area-identifier}:AdministrativeWard}",
 "{=rankedArea?metricType=3902&area={parent-area-identifier}:{ward-type-identifier}}",
 "{district-short-label}",
 "{child-area-short-label}",
 "{authority-area-short-label}",
 "{=compare?metricType=3343&greater=greater than&equal=equal to&less=less than}",
 "{=countAreas?area={ancestor-2-identifier}:{ancestor-1-type-identifier}}",
 "{=equivalent?metricDimension=[218][218_Number][Specificethnicity][Ethnicity_AsianorAsianBritish]}",
 "{=metricTypeMetadata?metricType=3341&returnValue=source}",
 "{=value?metricType=3284}",
 "{=percent?metricType=518}",
 "{=rank?metricType=3287}",
 "{=rankedArea?metricType=3286}"]

It runs quickly. If this approach is incorrect, please provide more test cases.
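A note of caution on the pattern in this answer: as far as I can tell it is the pattern from the question with the square brackets around the braces dropped, so it still has `[^{}]+` inside a repeated group. It is fast here because every brace in the test string is closed. The sketch below, using a shortened sample string as an assumption, compares it with the unrolled pattern on well-formed input only.

```javascript
// Shortened, well-formed sample in the style of the test string above.
const sample = "{=rankedArea?metricType=3902&area={parent-area-identifier}:AdministrativeWard} text {district-short-label}";

// Pattern from this answer: alternation inside a repeated group.
const alternation = /{(?:[^{}]+|{[^{}]*})*}/g;
// Unrolled-loop pattern from the earlier answer.
const unrolled = /{[^{}]*(?:{[^{}]*}[^{}]*)*}/g;

const a = sample.match(alternation);
const b = sample.match(unrolled);

console.log(a.length, b.length); // 2 2
console.log(JSON.stringify(a) === JSON.stringify(b)); // true
```

On well-formed input the two agree. The difference only shows up when a brace is left unclosed, which is the case that hung the browser in the question.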


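Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please credit this site's URL or the original source. For any questions, contact yoyou2525@163.com.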

 