
Regex takes a long time to complete

I want to match a line that is structured like this:

  • Start of line
  • Multiple '-'
  • Maybe a white space (maybe more)
  • At least one character
  • Maybe more characters and white spaces
  • Maybe a white space (maybe more)
  • Multiple '-'
  • End of line

So I wrote the regex like this:

new Regex(@"^\-{2,}\s*(\w+(\w+|\s)*)\s*\-{2,}$");

When I try to match the following line, it takes ages to complete (I didn't wait for it to finish):

-------- Variable used for recipe visualization only - Not loaded into PLC --------

I think there is a very large number of possible matches in it and the regex has a hard time enumerating all of them, but I'm not sure.
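That suspicion can be checked with a little arithmetic. The sample line has a lone '-' in the middle, which `\w+(\w+|\s)*` cannot consume, so the overall match has to fail; before giving up, a backtracking engine may try every way of cutting each word into `\w+` chunks. The sketch below counts those ways for the text in front of the inner '-'. It is written in Python rather than .NET purely for illustration, the helper name `compositions` is made up, and the figure is a count of ways to split the text, not a measured step count for any particular engine.

```python
def compositions(n):
    # Number of ways to cut a run of n word characters into consecutive
    # non-empty chunks, i.e. the ways repeated \w+ can consume it.
    ways = [1] + [0] * n  # ways[0] = 1: nothing left to cut
    for length in range(1, n + 1):
        # The last chunk may have any size from 1 up to `length`.
        ways[length] = sum(ways[length - last] for last in range(1, length + 1))
    return ways[n]

# Text between the leading dashes and the inner " - " of the sample line.
prefix = "Variable used for recipe visualization only"

total_ways = 1
for word in prefix.split():
    total_ways *= compositions(len(word))

print(total_ways)  # 4294967296, i.e. 2**32
```

A word of length k can be cut in 2^(k-1) ways, and the counts multiply across words, which is why a line of ordinary length already yields billions of dead ends.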

Environment information: Windows 7, .NET Framework 3.5

Thank you

Edit: Thanks to your help I came up with a regex that works:

^-{2,}\s*(?!\-)(\w(?:\w|\s|\-)+)(?<!\-)\s*-{2,}$

So the interpretation:

  • Start of line
  • At least two '-'
  • Maybe a white space (maybe more)
  • No more '-'
  • At least one character
  • Maybe more characters, white spaces or '-'
  • No more '-'
  • Maybe a white space (maybe more)
  • At least two '-'
  • End of line
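A quick way to check that interpretation is to run the pattern against the sample line and a couple of edge cases. This is a minimal sketch in Python's `re` module rather than .NET (an assumption made for illustration; the constructs used here are written the same way in both engines), and the short test strings are invented.

```python
import re

# The pattern from the edit, copied unchanged.
pattern = re.compile(r"^-{2,}\s*(?!\-)(\w(?:\w|\s|\-)+)(?<!\-)\s*-{2,}$")

line = "-------- Variable used for recipe visualization only - Not loaded into PLC --------"
match = pattern.match(line)

# The greedy group also swallows the space in front of the closing dashes,
# so the capture needs trimming before use.
title = match.group(1)
print(repr(title))
print(repr(title.strip()))

# \w followed by (...)+ needs at least two characters, so a one-character
# title with no surrounding spaces is rejected.
single = pattern.match("--a--")
print(single)

# With spaces around it the same title is accepted, again with a trailing space.
spaced = pattern.match("-- a --")
print(repr(spaced.group(1)))
```

Two things stand out: the captured text keeps its trailing white space, and "at least one character" is effectively "at least two" because of the `+` after the inner group. Changing that `+` to `*` would accept a single character, if that is what is wanted.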

If you see something wrong with it, please tell me.

Unroll the nested grouping as

^-{2,}\s*(\w+(?:\s+\w+)*)\s*-{2,}$
             ^^^^^^^^^^^ 

Otherwise, your pattern will be prone to catastrophic backtracking.
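As a rough illustration of the unrolled pattern, here is a small sketch using Python's `re` module instead of .NET (an assumption for the sake of a runnable example; the header text "Recipe header" is made up). Each word can now be consumed in only one way, so a failing match is abandoned quickly instead of retrying every split.

```python
import re

# Unrolled form: a word, then zero or more (white space + word) pairs.
unrolled = re.compile(r"^-{2,}\s*(\w+(?:\s+\w+)*)\s*-{2,}$")

ok = unrolled.match("---- Recipe header ----")
print(ok.group(1))  # Recipe header

# The line from the question has a lone '-' between words, which this
# pattern does not allow, so it is rejected, but without the long wait.
line = "-------- Variable used for recipe visualization only - Not loaded into PLC --------"
rejected = unrolled.match(line)
print(rejected)  # None
```

Note that the unrolled pattern keeps the original meaning (words and white space only), so a title containing '-' still needs the pattern extended to allow it.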

See the regex demo.

Alternatively, use an atomic group to disable any backtracking into the alternation group:

^-{2,}\s*((?>\w+(?:\w+|\s)*))\s*-{2,}$
          ^^^              ^ 

See this regex demo.

Generally, avoid alternations with nested quantifiers (like (\w+|\s)* ) inside longer patterns: when the alternatives can match the same text in several ways, a failed match forces the engine to try every combination.
