
Regex.Replace has strange behavior at reluctant match

While answering this question I got stuck on the following situation. Using a reluctant match in my regex produces this result:

string s = Regex.Replace(".A.", "\\w*?", "B");

B.BAB.B

Why doesn't it match and replace A?

Because \w*? matches as few \w as it possibly can, including 0 of them.

Since you have \w* instead of \w+, the regex matches 0 or more \w.

Since you have an additional ? on the \w*, the smallest possible match for this regex is the 0-length string, ''.

Since the ? forces the regex to take the smallest match possible, and nothing after \w*? in the pattern requires the match to grow, it only ever matches 0-length strings. It can't match the single character A because that would be a longer match than the shortest possible one.

Hence all 0-length strings in .A. (being: ''.''A''.'', where each possible 0-length string is marked as '') are replaced with a 'B', giving you 'B.BAB.B'.
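To see the 0-length matches directly, one option is to list the position and length of every match. The sketch below is only an illustration: the class name ReluctantMatchDemo and the helper DescribeMatches are made up for this example and are not part of the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ReluctantMatchDemo
{
    // Hypothetical helper: describes every match of `pattern` in `input`
    // as "index:length".
    public static string[] DescribeMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Index + ":" + m.Length)
                    .ToArray();
    }

    public static void Main()
    {
        // One empty match at each of the four positions around the characters.
        Console.WriteLine(string.Join(" ", DescribeMatches(".A.", @"\w*?")));
        // prints: 0:0 1:0 2:0 3:0

        // Each empty match is replaced by "B"; the original characters stay.
        Console.WriteLine(Regex.Replace(".A.", @"\w*?", "B"));
        // prints: B.BAB.B
    }
}
```

Every match has length 0, so the A itself is never part of a match and is copied to the output unchanged.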

If you want to disable this behaviour and replace at least one \w, you can use the regex \w+?. However, by the same reasoning as before, the ? forces this to only ever replace a \w of length one, so you may as well use the regex \w.
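A small comparison may make the difference clearer. This is a sketch under assumptions: the input ".AA." and the names QuantifierComparison and ReplaceWithB are chosen for illustration only, with two word characters in a row so that the lazy and greedy forms give different results.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierComparison
{
    // Hypothetical helper: replaces every match of `pattern` with "B".
    public static string ReplaceWithB(string input, string pattern)
    {
        return Regex.Replace(input, pattern, "B");
    }

    public static void Main()
    {
        const string input = ".AA."; // illustrative input, not from the question

        // Lazy \w+? stops after one word character, so each A is replaced separately.
        Console.WriteLine(ReplaceWithB(input, @"\w+?")); // prints: .BB.

        // Plain \w behaves the same way here.
        Console.WriteLine(ReplaceWithB(input, @"\w"));   // prints: .BB.

        // Greedy \w+ takes the whole run of word characters as one match.
        Console.WriteLine(ReplaceWithB(input, @"\w+"));  // prints: .B.
    }
}
```

If the intent is to replace a whole word with a single B, the greedy \w+ is probably the form to reach for.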
