
Why does this code hang Node.js? Is it a JavaScript bug?

I'm trying to run this regex, but it hangs my console. Why?

var str = "Шедевры православной музыки - 20 золотых православных песен";
str.match(/^(([\u00C0-\u1FFF\u2C00-\uD7FF]+[^a-z\u00C0-\u1FFF\u2C00-\uD7FF]*)+) [a-z]+[^\u00C0-\u1FFF\u2C00-\uD7FF]*$/i);

Your regex causes catastrophic backtracking because of the (([\u00C0-\u1FFF\u2C00-\uD7FF]+[^a-z\u00C0-\u1FFF\u2C00-\uD7FF]*)+) part. Since [^a-z\u00C0-\u1FFF\u2C00-\uD7FF]* can match zero characters, you basically have a classic (a+)+ -like pattern (compare ([\u00C0-\u1FFF\u2C00-\uD7FF]+)+ ): on a failed match, the engine retries every way of splitting each run of letters between the inner + and the outer +.
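A minimal sketch of the effect, using the textbook (a+)+ pattern instead of the regex from the question so that it finishes quickly. The helper name timeTest and the input lengths are assumptions chosen for illustration; process.hrtime.bigint() assumes a reasonably recent Node.js.

```javascript
// Hypothetical helper: run one regex test and report how long it took.
function timeTest(regex, input) {
  const start = process.hrtime.bigint();
  const matched = regex.test(input);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched, elapsedMs };
}

// Nested quantifiers: the inner a+ and the outer + compete for the same characters.
const nested = /^(a+)+$/;
// Accepts the same strings with a single quantifier, so a failure is detected quickly.
const flat = /^a+$/;

for (const n of [10, 15, 20]) {
  // The trailing "b" forces the overall match to fail, which triggers the backtracking.
  const input = "a".repeat(n) + "b";
  const slow = timeTest(nested, input);
  const fast = timeTest(flat, input);
  console.log(n, slow.matched, slow.elapsedMs.toFixed(3), fast.matched, fast.elapsedMs.toFixed(3));
}
```

Both patterns report no match, but the time for the nested one should grow much faster with the input length. Exact timings depend on the machine and the Node.js version.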

To get rid of it, make sure every subpattern inside the group is mandatory (so that no part of it can match an empty string), and apply a * quantifier to the whole group:

^([\u00C0-\u1FFF\u2C00-\uD7FF]+(?:[^a-z\u00C0-\u1FFF\u2C00-\uD7FF]+[\u00C0-\u1FFF\u2C00-\uD7FF]+)*) [a-z]+[^\u00C0-\u1FFF\u2C00-\uD7FF]*$


Here, [\u00C0-\u1FFF\u2C00-\uD7FF]+(?:[^a-z\u00C0-\u1FFF\u2C00-\uD7FF]+[\u00C0-\u1FFF\u2C00-\uD7FF]+)* matches:

  • [\u00C0-\u1FFF\u2C00-\uD7FF]+ - one or more characters from the \u00C0-\u1FFF and \u2C00-\uD7FF ranges
  • (?:[^a-z\u00C0-\u1FFF\u2C00-\uD7FF]+[\u00C0-\u1FFF\u2C00-\uD7FF]+)* - zero or more sequences of:
    • [^a-z\u00C0-\u1FFF\u2C00-\uD7FF]+ - one or more characters other than those in the a-z, \u00C0-\u1FFF and \u2C00-\uD7FF ranges
    • [\u00C0-\u1FFF\u2C00-\uD7FF]+ - one or more characters from the \u00C0-\u1FFF and \u2C00-\uD7FF ranges.
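As a quick check, the rewritten pattern can be run against the string from the question. This is a sketch that assumes the same i flag as the original call; the second input, sample, is a made-up string added only to show a successful match.

```javascript
// The rewritten pattern, with the i flag used in the question.
const fixed = /^([\u00C0-\u1FFF\u2C00-\uD7FF]+(?:[^a-z\u00C0-\u1FFF\u2C00-\uD7FF]+[\u00C0-\u1FFF\u2C00-\uD7FF]+)*) [a-z]+[^\u00C0-\u1FFF\u2C00-\uD7FF]*$/i;

// String from the question: it contains no Latin word, so there is no match,
// but the engine now gives up quickly instead of hanging.
const str = "Шедевры православной музыки - 20 золотых православных песен";
console.log(str.match(fixed)); // null

// Hypothetical input with a Latin word after the Cyrillic part.
const sample = "Шедевры музыки abc 2015";
const m = sample.match(fixed);
console.log(m && m[1]); // "Шедевры музыки"
```

Note that the rewrite only removes the backtracking problem. It does not make the original string match, because the pattern still requires a space followed by at least one Latin letter.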
