How to have two named groups with same name in .net regex?

I have a regex that identifies some named groups. In a few cases there are multiple groups with the same name but different patterns. I need to collect everything matched by each named group into a corresponding list. The constraints are that I cannot use more than one regex and I cannot execute the regex more than once. I have tried the following code, but it always returns only the second pattern:

        Regex reg = new Regex(@"(?<n1>pattern_n1_1) (?<n2>pattern_n2_1) (?<n1>pattern_n1_2) (?<n2>pattern_n1_2)", RegexOptions.IgnoreCase);

        String str = "pattern_n1_1 pattern_n2_1 pattern_n1_2 pattern_n1_2";

        List<String> matchedText = new List<string>();
        List<String> string_n1 = new List<string>();
        List<String> string_n2 = new List<string>();

        MatchCollection mc = reg.Matches(str);
        if (mc != null)
        {
            foreach (Match m in mc)
            {
                matchedText.Add(m.Value.Trim());
                string_n1.Add(m.Groups["n1"].Value);
                string_n2.Add(m.Groups["n2"].Value);
            }
        }

Here string_n1 and string_n2 each end up with a single element: the text matched by the second group of that name. However, I need the text matched by both n1 groups (for example "pattern_n1_1" and "pattern_n1_2") in string_n1, and the text matched by both n2 groups in string_n2.

There is no need to change your regex. You only need to change the way you retrieve the result from the capturing groups.

Since you have multiple capturing groups with the same name, .NET records every capture made under that name. Groups["n1"].Value returns only the last of them, so to retrieve all of them you need to loop through each Capture in Groups["n1"].Captures instead.

MatchCollection mc = reg.Matches(str);
if (mc != null)
{
    foreach (Match m in mc)
    {
        matchedText.Add(m.Value.Trim());

        foreach (Capture c in m.Groups["n1"].Captures) {
            string_n1.Add(c.Value);
        }

        foreach (Capture c in m.Groups["n2"].Captures) {
            string_n2.Add(c.Value);
        }
    }
}
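
For reference, here is a self-contained sketch that combines the question's pattern and input with the loop above. The class name CaptureDemo and the helper CollectCaptures are made up for this illustration; the pattern and input string are copied from the question as posted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: gathers every capture recorded under groupName,
    // across all matches of regex in input.
    public static List<string> CollectCaptures(Regex regex, string input, string groupName)
    {
        var result = new List<string>();
        foreach (Match m in regex.Matches(input))
        {
            foreach (Capture c in m.Groups[groupName].Captures)
            {
                result.Add(c.Value);
            }
        }
        return result;
    }

    public static void Main()
    {
        // Pattern and input exactly as posted in the question.
        var reg = new Regex(
            @"(?<n1>pattern_n1_1) (?<n2>pattern_n2_1) (?<n1>pattern_n1_2) (?<n2>pattern_n1_2)",
            RegexOptions.IgnoreCase);
        string str = "pattern_n1_1 pattern_n2_1 pattern_n1_2 pattern_n1_2";

        // Each list now holds two entries, one per group of that name.
        Console.WriteLine("n1: " + string.Join(", ", CollectCaptures(reg, str, "n1")));
        Console.WriteLine("n2: " + string.Join(", ", CollectCaptures(reg, str, "n2")));
    }
}
```

The regex is still executed only once (a single call to Matches), which respects the constraint in the question.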

Demo on ideone

As far as I know, this feature is unique to the .NET Regex API: other flavors do not offer an API to go through all the captures of a repeated capturing group, such as group 1 in:

^\w+(?: (\w+))+$

Other flavors return only the last capture of capturing group 1 in the above example. .NET lets you extract every capture made by a capturing group.
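
A small sketch of that difference in .NET, using the pattern above. The input string "one two three four", the class name RepeatedGroupDemo, and the helper Group1Captures are assumptions chosen for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RepeatedGroupDemo
{
    private const string Pattern = @"^\w+(?: (\w+))+$";

    // Hypothetical helper: returns every capture of group 1 for the first match,
    // or an empty list when the input does not match.
    public static List<string> Group1Captures(string input)
    {
        var captures = new List<string>();
        Match m = Regex.Match(input, Pattern);
        if (m.Success)
        {
            foreach (Capture c in m.Groups[1].Captures)
            {
                captures.Add(c.Value);
            }
        }
        return captures;
    }

    public static void Main()
    {
        string input = "one two three four"; // sample input, assumed for this sketch

        Match m = Regex.Match(input, Pattern);
        // Value exposes only the last capture of the repeated group.
        Console.WriteLine(m.Groups[1].Value);
        // Captures exposes every repetition, in match order.
        Console.WriteLine(string.Join(", ", Group1Captures(input)));
    }
}
```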

And although some flavors let you define the same name for different capturing groups, they only give you access to one of the captures when you query by group name.

Reference

Depending on the specifics, the following may work for your needs, but it is not a general solution:

// The space after each pair is optional so that input without a trailing space still matches.
Regex reg = new Regex(@"(?:(?<n1>pattern_n1_1|pattern_n1_2) (?<n2>pattern_n2_1|pattern_n1_2) ?){2}", RegexOptions.IgnoreCase);

This will match a bit more than the original: because each group is an alternation, either alternative is accepted in either repetition. For example, pattern_n1_2 can match in the first position in this version, but not in the original.
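
A rough sketch of how this looser matching shows up in practice. It uses the alternation pattern with the space after each pair made optional, so that the question's input (which has no trailing space) can match; that detail, the reordered input, and the names AlternationDemo and Captured are assumptions made for this example. The captures are still read through Captures, since Value would again return only the last one.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Alternation variant; the optional space after each pair is an assumption of this sketch.
    public static readonly Regex Loose = new Regex(
        @"(?:(?<n1>pattern_n1_1|pattern_n1_2) (?<n2>pattern_n2_1|pattern_n1_2) ?){2}",
        RegexOptions.IgnoreCase);

    // Pattern exactly as posted in the question.
    public static readonly Regex Original = new Regex(
        @"(?<n1>pattern_n1_1) (?<n2>pattern_n2_1) (?<n1>pattern_n1_2) (?<n2>pattern_n1_2)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: all captures of groupName in the first match (empty if no match).
    public static List<string> Captured(Regex regex, string input, string groupName)
    {
        var result = new List<string>();
        Match m = regex.Match(input);
        if (m.Success)
        {
            foreach (Capture c in m.Groups[groupName].Captures)
            {
                result.Add(c.Value);
            }
        }
        return result;
    }

    public static void Main()
    {
        string posted = "pattern_n1_1 pattern_n2_1 pattern_n1_2 pattern_n1_2";
        string reordered = "pattern_n1_2 pattern_n2_1 pattern_n1_1 pattern_n1_2";

        // Both n1 captures are available through Captures.
        Console.WriteLine(string.Join(", ", Captured(Loose, posted, "n1")));

        // The reordered input is accepted by the loose pattern but not by the original.
        Console.WriteLine(Loose.IsMatch(reordered));
        Console.WriteLine(Original.IsMatch(reordered));
    }
}
```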
