Precedence of Ruby regular expressions?

I am reviewing regular expressions and cannot understand why a regular expression won't match a given string, specifically:

regex = /(ab*)+(bc)?/
mystring = "abbc"

The regex matches "abb" but leaves the c off. I tested this using Rubular and in IRB and don't understand why the regex doesn't match the entire string. I thought that (ab*)+ would match "ab" and then (bc)? would match "bc".

Am I missing something in terms of precedence for regular expression operations?

By default, regular expressions are greedy: each part of the pattern matches as much as it can, from left to right, and the engine backtracks only when a later part of the pattern fails. Since you make (bc) optional, it can succeed by matching nothing, so (ab*) is free to consume both b's (the + after it has nothing further to repeat here) and the engine never needs to backtrack to try other alternatives.
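
The greedy behavior described above can be checked in IRB. This is a minimal sketch using the regex and string from the question; the variable name m is only for illustration.

```ruby
regex = /(ab*)+(bc)?/
m = regex.match("abbc")

# (ab*) greedily consumes "abb"; (bc)? then succeeds by matching nothing,
# so the overall match succeeds and the engine has no reason to backtrack.
puts m[0].inspect  # whole match: "abb"
puts m[1].inspect  # last iteration of (ab*): "abb"
puts m[2].inspect  # (bc)? did not participate: nil
```

The nil in the second group shows that (bc)? was satisfied by matching zero times, not that the match failed.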

If you want the whole string to be matched (which will force some backtracking in this case), make sure you anchor both ends of the pattern:

regex = /^(ab*)+(bc)?$/ 
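
A short sketch of what the anchors change, again using the string from the question. The names anchored, strict, and m2 are illustrative. The \A and \z variant is an addition that is not part of the original answer: in Ruby, ^ and $ match at line boundaries, so \A and \z are the anchors that pin the pattern to the whole string. match? assumes Ruby 2.4 or later.

```ruby
anchored = /^(ab*)+(bc)?$/
m2 = anchored.match("abbc")

# With "abb" consumed, $ fails at "c", so b* gives back one "b"
# and (bc)? can then match the remaining "bc".
puts m2[0].inspect  # "abbc"
puts m2[1].inspect  # "ab"
puts m2[2].inspect  # "bc"

# Not in the original answer: ^ and $ anchor lines, \A and \z anchor the string.
strict = /\A(ab*)+(bc)?\z/
puts anchored.match?("abbc\nxyz")  # true: $ matches before the newline
puts strict.match?("abbc\nxyz")    # false
puts strict.match?("abbc")         # true
```

For single-line input the two forms behave the same; the difference only shows up when the string may contain newlines.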

The regex contains two parenthesized groups, which are matched one after the other against your string.

The first one matches abb, because (ab*) means an a followed by zero or more b's. You have two b's, so the match is abb. After that, only c is left in your string, so it doesn't match the second group, which is bc.
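
The two steps in this walk-through can be looked at separately. The sketch below matches only the first group and then inspects what is left over; the names first and rest are illustrative.

```ruby
first = /(ab*)+/.match("abbc")
rest = first.post_match  # the part of the string after the match

puts first[0].inspect         # "abb"
puts rest.inspect             # "c"
puts rest.start_with?("bc")   # false: only "c" is left, so (bc) cannot match here
```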
