How to match an even number of any character in a string?
I have a string:
aaabbashasccddee
I want to match runs of consecutive identical characters whose length is even. For example, from the above string, I want these matches:
[bb],[cc],[dd],[ee]
I have tried this pattern, but it's not even close:
^(..)*$
Any help, please.
Fortunately, .NET regular expressions can handle unbounded (variable-length) lookbehinds. What you need can be achieved with the following regex:
((?>(?(2)(?=\2))(.)\2)+)(?<!\2\1)(?!\2)
See live demo here.
Regex breakdown:
(                 # Start of capturing group #1
  (?>             # Start of non-capturing group (atomic)
    (?(2)         # If capturing group #2 is set
      (?=\2)      # The next character must be the one stored in group #2
    )             # End of conditional
    (.)\2         # Match and capture a character, then match it again (even number)
  )+              # Repeat as many times as possible, at least once
)                 # End of capturing group #1
(?<!\2\1)         # Here is the trick: the lookbehind tells the engine that the character immediately before what we have matched so far must not be the character stored in capturing group #2
(?!\2)            # The next character must not be the character stored in capturing group #2
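To see why the lookbehind matters, the following minimal sketch compares the pattern with and without `(?<!\2\1)` on a short input. `MatchValues` and the `noLookbehind` variant are illustrative names introduced here, not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// MatchValues is a hypothetical helper name used only for this illustration.
string[] MatchValues(string input, string pattern) =>
    Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

string full = @"((?>(?(2)(?=\2))(.)\2)+)(?<!\2\1)(?!\2)";   // pattern from the answer
string noLookbehind = @"((?>(?(2)(?=\2))(.)\2)+)(?!\2)";    // same pattern with (?<!\2\1) removed

// In "aaabb" the run of three a's has odd length and should not contribute a match.
Console.WriteLine(string.Join(",", MatchValues("aaabb", noLookbehind))); // aa,bb
Console.WriteLine(string.Join(",", MatchValues("aaabb", full)));         // bb
```

Without the lookbehind, the engine can start one character into the odd run and match its last two characters; the lookbehind rejects that start position because the same character precedes it.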
UPDATE:
So you can use the following C# code to get all even-length runs of characters in a string with Regex alone, without any other operators (pure Regex).
var allEvenSequences = Regex.Matches("aaabbashasccddee", @"((?>(?(2)(?=\2))(.)\2)+)(?<!\2\1)(?!\2)").Cast<Match>().ToList();
Also, if you want to produce [bb],[cc],[dd],[ee], you can join the list of matches:
string strEvenSequences = string.Join(",", allEvenSequences.Select(x => $"[{x}]").ToArray());
//strEvenSequences will be [bb],[cc],[dd],[ee]
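For convenience, the two snippets above can be combined into one self-contained program. This is only a sketch: `FormatEvenRuns` is a hypothetical helper name, and the `using` directives are the ones the snippets appear to need.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// FormatEvenRuns is a hypothetical helper, not part of the original answer.
// It finds every match of the pattern and joins them as [x],[y],...
string FormatEvenRuns(string input, string pattern) =>
    string.Join(",", Regex.Matches(input, pattern)
                          .Cast<Match>()
                          .Select(m => $"[{m.Value}]"));

// Pattern from the answer; it relies on .NET's variable-length lookbehind.
string evenRuns = @"((?>(?(2)(?=\2))(.)\2)+)(?<!\2\1)(?!\2)";

Console.WriteLine(FormatEvenRuns("aaabbashasccddee", evenRuns)); // [bb],[cc],[dd],[ee]
```

Using `m.Value` explicitly gives the same text as interpolating the `Match` object itself, but makes the intent easier to read.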
Another possible regex-only solution that doesn't involve conditionals:
(.)(?<!\1\1)\1(?:\1\1)*(?!\1)
Breakdown:
(.) # First capturing group - matches any character.
(?<!\1\1) # Negative lookbehind - ensures the matched char isn't preceded by the same char.
\1 # Matches the captured character once more (at least two in total).
(?:\1\1) # A non-capturing group that matches two more occurrences of the same char.
* # Repeats the previous group zero or more times.
(?!\1) # Negative lookahead to make sure no extra occurrence of the char follows.
Demo:
string input = "aaabbashasccddee";
string pattern = @"(.)(?<!\1\1)\1(?:\1\1)*(?!\1)";
var matches = Regex.Matches(input, pattern);
foreach (Match m in matches)
Console.WriteLine(m.Value);
Output:
bb
cc
dd
ee
Try it online.
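As a sanity check, both patterns can be compared against a plain loop that collects maximal runs of even length. This is a sketch of a different, non-regex technique used only for cross-checking; `EvenRunsByScan` and `EvenRunsByRegex` are hypothetical helper names, and every test input other than the question's string is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: walks the string once and keeps each maximal run
// of identical characters whose length is even.
List<string> EvenRunsByScan(string s)
{
    var runs = new List<string>();
    int start = 0;
    while (start < s.Length)
    {
        int end = start;
        while (end < s.Length && s[end] == s[start]) end++;   // end is one past the run
        if ((end - start) % 2 == 0) runs.Add(s.Substring(start, end - start));
        start = end;
    }
    return runs;
}

// Hypothetical helper: returns the matched text for every match of the pattern.
List<string> EvenRunsByRegex(string s, string pattern) =>
    Regex.Matches(s, pattern).Cast<Match>().Select(m => m.Value).ToList();

string conditional = @"((?>(?(2)(?=\2))(.)\2)+)(?<!\2\1)(?!\2)";   // first answer
string lookaround  = @"(.)(?<!\1\1)\1(?:\1\1)*(?!\1)";             // second answer

foreach (var input in new[] { "aaabbashasccddee", "aaa", "aaaa", "aaaaa", "xxyyyzzzz" })
{
    var expected = EvenRunsByScan(input);
    bool agree = expected.SequenceEqual(EvenRunsByRegex(input, conditional))
              && expected.SequenceEqual(EvenRunsByRegex(input, lookaround));
    Console.WriteLine($"{input}: [{string.Join(",", expected)}] agree={agree}");
}
```

Note that `.` does not match a newline unless `RegexOptions.Singleline` is set, so the loop and the patterns could disagree on inputs that contain consecutive newlines.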