
C# Regex multi match

I want (in C#) to check the syntax of a string and extract some data from it. The string must have the form "someWord IS someWord( OR someWord){1-infinite}". I want to extract every word and capture the first word in a group named "switch".

This is my string:

string text = "[bird] IS blue OR yellow OR green";

So I use this regex:

string switchPattern = @"\s*(?<switch>.+?)\s+IS\s+(.+?)(?:\s+OR\s+(.+?))+$";

And extract with:

Match switchCaseMatch = Regex.Match(text, switchPattern);

This gives me a group collection with 4 elements:

[0]: [bird] IS blue OR yellow OR green
[1]: green
[2]: blue
[3]: [bird]  named switch

but I want:

[0]: [bird] IS blue OR yellow OR green
[1]: green
[2]: yellow
[3]: blue
[4]: [bird]  named switch

I hoped that the last "(.+?)" would create a group for every matching case, but it creates only one, for the last occurrence. I tried Regex.Matches and got the same result.

I know that I could do it with two regexes (a Regex.Match, then a Regex.Matches for the "someWord( OR someWord){1-infinite}" part), but I want to know whether it is possible with only one regex.

Thanks

Actually, you can do it with Regex.Match by using Captures, as I said in my comment. Here is a code sample:

        string text = "[bird] IS blue OR yellow OR green";
        string switchPattern = @"\s*(?<switch>.+?)\s+IS\s+(.+?)(?:\s+OR\s+(.+?))+$";

        Match switchCaseMatch = Regex.Match(text, switchPattern);
        foreach (Group group in switchCaseMatch.Groups)
        {
            if (group.Captures.Count == 1)
                Console.WriteLine(group.Value);
            else foreach (Capture cap in group.Captures)
                    Console.WriteLine(cap.Value);
        }

This results in:

[bird] IS blue OR yellow OR green
blue
yellow
green
[bird]

See the Microsoft MSDN page for Captures for more information.
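As a variation on the Captures approach, the sketch below reuses one group name, value, for the word after IS and for each word after OR. .NET accumulates the captures of same-named groups in a single Group, so every value ends up in one CaptureCollection and there is no need to check Captures.Count per group. The ^ anchor, the trailing \s*, the value group name, and the SwitchExpression and TryParse names are assumptions added for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SwitchExpression
{
    // Same shape as the pattern above, but anchored at both ends and with
    // both value positions sharing the (hypothetical) group name "value".
    private static readonly Regex Pattern = new Regex(
        @"^\s*(?<switch>.+?)\s+IS\s+(?<value>.+?)(?:\s+OR\s+(?<value>.+?))+\s*$");

    // Returns true when the text has the form "X IS A OR B [OR C ...]".
    // On success, switchName holds X and values holds A, B, C... in order.
    public static bool TryParse(string text, out string switchName, out List<string> values)
    {
        Match match = Pattern.Match(text);
        switchName = match.Success ? match.Groups["switch"].Value : string.Empty;

        // On a failed match the capture collection is empty, so values is too.
        values = match.Groups["value"].Captures
            .Cast<Capture>()
            .Select(capture => capture.Value)
            .ToList();

        return match.Success;
    }
}
```

Calling SwitchExpression.TryParse on "[bird] IS blue OR yellow OR green" should return true with "[bird]" as the switch name and blue, yellow, green as the values. A string without any OR part, such as "[bird] IS blue", does not match, because the pattern requires at least one OR.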

I think using groups will be difficult because you need to anticipate how many groups you will have. I suggest using the Matches method and MatchCollection instead. You'll have access to the named group in each match, as well as every occurrence of the target strings you are after.

e.g.

string text = "[bird] IS blue OR yellow OR green";
string switchPattern = @"(?<=(?<switch>\S+)\s+IS.*?)(\w+(?=\s+OR)|(?<=OR\s+)\w+)";
MatchCollection switchCaseMatch = Regex.Matches(text, switchPattern);

foreach (Match m in switchCaseMatch)
{
    Console.WriteLine(m.Groups["switch"].Value);
    Console.WriteLine(m.Value);
}

The pattern uses an unbounded lookbehind to find the switch (group) text, which forces every matched color to follow that text. The dot-star in that lookbehind consumes any color text that was matched in previous iterations. The pattern then uses either a lookahead to find a color that is followed by "OR" (which covers the first color) or a lookbehind to find a color that is preceded by "OR" (which covers the last one). After that, it's just a matter of evaluating the Value property of each Match object in the MatchCollection. The named group is captured in each Match, so you'll have access to that as well.
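For illustration, here is a minimal sketch that wraps the Matches approach in a helper returning each (switch, color) pair, which makes the per-match result easier to inspect. The SwitchMatches and Extract names are hypothetical. One caveat worth hedging: this pattern extracts values but does not validate the whole string, so trailing text after the last color is ignored rather than rejected.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SwitchMatches
{
    // Pattern taken unchanged from the answer above.
    private const string Pattern =
        @"(?<=(?<switch>\S+)\s+IS.*?)(\w+(?=\s+OR)|(?<=OR\s+)\w+)";

    // Returns one pair per matched color: Key is the "switch" group,
    // Value is the color text matched by that Match.
    public static List<KeyValuePair<string, string>> Extract(string text)
    {
        var results = new List<KeyValuePair<string, string>>();
        foreach (Match m in Regex.Matches(text, Pattern))
        {
            results.Add(new KeyValuePair<string, string>(
                m.Groups["switch"].Value, m.Value));
        }
        return results;
    }
}
```

For "[bird] IS blue OR yellow OR green" this should yield three pairs, each with "[bird]" as the key and blue, yellow, green as the values. An input with no OR, such as "[bird] IS blue", yields no matches at all, since each alternative requires an adjacent "OR".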
