
Getting overlapping regex matches in C#

I have the regex 1(0*)1 and the test string 1000010001.

I want to get 2 matches, but only 1 is found:

var regex = new Regex("1(0*)1");
var values = regex.Matches(intBinaryString);
// values only has 1 match

regex101 seems to agree: https://regex101.com/r/3J9Qxj/1

What am I doing wrong?

The first match already consumes the 1 that precedes the second run of zeros.

100001 0001
^^^^^^

This is the first match. The rest is just 0001, which does not match your regex because it no longer has a leading 1.
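To see where the single match sits, here is a minimal sketch that prints the index, length, and value of every match. The class and method names (ScanDemo, Describe) are made up for illustration; only Regex.Matches, Match.Index, Match.Length, and Match.Value come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ScanDemo
{
    // Hypothetical helper: returns "index:length:value" for each match, in order.
    public static string[] Describe(string pattern, string input)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        var result = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            Match m = matches[i];
            result[i] = $"{m.Index}:{m.Length}:{m.Value}";
        }
        return result;
    }

    public static void Main()
    {
        // The one match covers indices 0..5, so the scan resumes at index 6,
        // where only "0001" is left.
        foreach (string line in Describe("1(0*)1", "1000010001"))
            Console.WriteLine(line); // prints 0:6:100001
    }
}
```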


You can circumvent this behavior if you are using lookaheads/lookbehinds:

(?<=1)(0*)(?=1)
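A small C# sketch of this lookaround pattern, assuming the goal is to collect the runs of zeros between consecutive 1s. LookaroundDemo and ZeroRuns are illustrative names, not part of the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper: returns each run of zeros that has a 1 on both sides.
    // Neither 1 is consumed, so the same 1 can close one run and open the next.
    public static string[] ZeroRuns(string input)
    {
        return Regex.Matches(input, "(?<=1)(0*)(?=1)")
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ZeroRuns("1000010001"))); // prints 0000,000
    }
}
```

Note that two adjacent 1s produce an empty run, just as the original 1(0*)1 matches "11" with an empty group.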



JavaScript engines that predate ES2018 do not support lookbehinds. There, a single lookahead is enough, because the trailing 1 is no longer consumed and stays available as the start of the next match:

1(0*)(?=1)
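The same pattern also works in .NET. The sketch below (SingleLookaheadDemo and Describe are illustrative names) shows that each match now includes the leading 1 but not the trailing one, while Group 1 still holds only the zeros.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SingleLookaheadDemo
{
    // Hypothetical helper: returns "matchValue|group1" for each match.
    public static string[] Describe(string input)
    {
        return Regex.Matches(input, "1(0*)(?=1)")
            .Cast<Match>()
            .Select(m => m.Value + "|" + m.Groups[1].Value)
            .ToArray();
    }

    public static void Main()
    {
        foreach (string line in Describe("1000010001"))
            Console.WriteLine(line); // prints 10000|0000 then 1000|000
    }
}
```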



And a hint for your regex101 example: you did not add the global flag, so regex101 stops after the first match.

You need to match overlapping strings.

That is, wrap your pattern in a capturing group and put that group inside a positive lookahead. The lookahead itself consumes no characters, so the regex engine tries the pattern at every position; match all occurrences and read the Group 1 value:

(?=(YOUR_REGEX_HERE))

Use

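// Needs: using System.Linq; using System.Text.RegularExpressions;
var regex = new Regex("(?=(10*1))");
var values = regex.Matches(intBinaryString)
    .Cast<Match>()
    .Select(m => m.Groups[1].Value) // the match itself is empty; Group 1 holds the text
    .ToList();
// For "1000010001", values contains "100001" and "10001"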
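The wrapping step can be generalized. The following is a sketch only: OverlapDemo and OverlappingMatches are hypothetical names, and it assumes the inner pattern is a valid regex. Wrapping adds an outer group, so any groups inside the original pattern shift up by one (the zeros from 1(0*)1 end up in Group 2).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class OverlapDemo
{
    // Hypothetical helper: wraps the pattern as (?=(pattern)) and returns the
    // Match objects, whose Groups[1] holds the text the inner pattern matched.
    public static List<Match> OverlappingMatches(string pattern, string input)
    {
        var regex = new Regex("(?=(" + pattern + "))");
        return regex.Matches(input).Cast<Match>().ToList();
    }

    public static void Main()
    {
        foreach (Match m in OverlappingMatches("1(0*)1", "1000010001"))
        {
            // Groups[1] is the whole inner match, Groups[2] the original (0*).
            Console.WriteLine(m.Groups[1].Value + " zeros=" + m.Groups[2].Value);
        }
        // prints:
        // 100001 zeros=0000
        // 10001 zeros=000
    }
}
```

One limitation of this approach: the lookahead yields at most one result per starting position, so it will not list several different matches that begin at the same index.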
