
Regex with optional matching groups

I'm trying to parse a given string, which is a kind of path separated with / . I need to write a regex that matches each segment of the path to a corresponding named group.

Example 1:

input:

/EAN/SomeBrand/appliances/refrigerators/RF444

output:

Group: producer, Value: SomeBrand
Group: category, Value: appliances
Group: subcategory, Value: refrigerators
Group: product, Value: RF444

Example 2:

input:

/EAN/SomeBrand/appliances

output:

Group: producer, Value: SomeBrand
Group: category, Value: appliances
Group: subcategory, Value:
Group: product, Value:

I tried the following code. It works fine when the path is full (as in the first example) but fails to find the groups when the input string is partial (as in example 2).

using System;
using System.Text.RegularExpressions;

class Program
{
  static void Main()
  {
    // Every "/" separator is mandatory and each segment uses a greedy .+
    var pattern = @"^" + @"/EAN"
                  + @"/" + @"(?<producer>.+)"
                  + @"/" + @"(?<category>.+)"
                  + @"/" + @"(?<subcategory>.+)"
                  + @"/" + @"(?<product>.+)?"
                  + @"$";

    var rgx = new Regex(pattern, RegexOptions.Compiled | RegexOptions.IgnoreCase);
    var result = rgx.Match(@"/EAN/SomeBrand/appliances/refrigerators/RF444");

    // Print every group (including group 0, the whole match)
    foreach (string groupName in rgx.GetGroupNames())
    {
      Console.WriteLine(
         "Group: {0}, Value: {1}",
         groupName,
         result.Groups[groupName].Value);
    }

    Console.ReadLine();
  }
}

Any suggestions are welcome. Unfortunately, I cannot simply split the string, since the framework I'm using expects a Regex object.

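Your pattern requires every / separator, so a path with fewer segments cannot match, and the greedy .+ can also run across / characters. You can wrap each optional segment together with its leading slash in an optional group (...)? and replace the .+ patterns with negated character classes [^/]+ :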

^/EAN/(?<producer>[^/]+)/(?<category>[^/]+)(/(?<subcategory>[^/]+))?(/(?<product>[^/]+))?$
                                           ^                      ^^^                  ^^

See the regex demo

This is how you need to declare your regex in the C# code:

var pattern = @"^" + @"/EAN"
            + @"/(?<producer>[^/]+)"
            + @"/(?<category>[^/]+)"
            + @"(/(?<subcategory>[^/]+))?"
            + @"(/(?<product>[^/]+))?"
            + @"$";

var rgx = new Regex(pattern, RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);

Note that I am using regular capturing groups as the optional wrappers, but the RegexOptions.ExplicitCapture flag turns all unnamed capturing groups into non-capturing ones, so they do not appear in Match.Groups . As a result, there are always exactly 5 groups (the whole match plus the four named groups), even without writing the optional wrappers as non-capturing groups (?:...)? .
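To check the behaviour described above, here is a minimal sketch that runs the pattern against both inputs from the question. The class name PathDemo and the helper Describe are hypothetical names introduced only for this illustration; the pattern and the options are the ones shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PathDemo
{
    // Pattern from the answer above: producer and category are required,
    // subcategory and product are optional (each with its leading slash).
    private static readonly Regex Rgx = new Regex(
        @"^/EAN"
        + @"/(?<producer>[^/]+)"
        + @"/(?<category>[^/]+)"
        + @"(/(?<subcategory>[^/]+))?"
        + @"(/(?<product>[^/]+))?"
        + @"$",
        RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);

    // Hypothetical helper: lists the named groups of a match as "name=value" pairs.
    public static string Describe(string input)
    {
        Match m = Rgx.Match(input);
        if (!m.Success) return "no match";

        var parts = new List<string>();
        foreach (string name in Rgx.GetGroupNames())
        {
            if (name == "0") continue; // group 0 is the whole match
            parts.Add(name + "=" + m.Groups[name].Value);
        }
        return string.Join(", ", parts);
    }

    public static void Main()
    {
        // Full path: all four named groups are filled.
        Console.WriteLine(Describe("/EAN/SomeBrand/appliances/refrigerators/RF444"));
        // Partial path: subcategory and product come back as empty strings.
        Console.WriteLine(Describe("/EAN/SomeBrand/appliances"));
        // Fewer than two segments after /EAN does not match this pattern.
        Console.WriteLine(Describe("/EAN/SomeBrand"));
    }
}
```

Because producer and category are not wrapped in optional groups, a path such as /EAN/SomeBrand is rejected here; whether that is desirable depends on what the framework should accept.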

Try

var pattern = @"^" + @"/EAN"
    + @"(?:/" + @"(?<producer>[^/]+))?"
    + @"(?:/" + @"(?<category>[^/]+))?"
    + @"(?:/" + @"(?<subcategory>[^/]+))?"
    + @"(?:/" + @"(?<product>[^/]+))?";

Note how I replaced the . with [^/] , because you want to use the / to separate the segments. Note also the optional quantifier ( ? ) applied to each sub-part.
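Two consequences of this variant are worth checking before relying on it: all four segments are optional, and there is no $ anchor, so trailing text is ignored. The sketch below is only an illustration under those observations; the class name OptionalSegmentsDemo and the helper HasSegment are hypothetical. It uses Group.Success to tell whether an optional group actually took part in the match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionalSegmentsDemo
{
    // Pattern from the answer above: every segment is optional, no end anchor.
    private static readonly Regex Rgx = new Regex(
        @"^/EAN"
        + @"(?:/(?<producer>[^/]+))?"
        + @"(?:/(?<category>[^/]+))?"
        + @"(?:/(?<subcategory>[^/]+))?"
        + @"(?:/(?<product>[^/]+))?");

    // Hypothetical helper: true when the named group participated in the match.
    public static bool HasSegment(string input, string groupName)
    {
        return Rgx.Match(input).Groups[groupName].Success;
    }

    public static void Main()
    {
        // category is present, product is not.
        Console.WriteLine(HasSegment("/EAN/SomeBrand/appliances", "category"));
        Console.WriteLine(HasSegment("/EAN/SomeBrand/appliances", "product"));

        // Even the bare prefix matches, with no segments captured.
        Console.WriteLine(Rgx.IsMatch("/EAN"));

        // Without a $ anchor, a fifth segment is silently ignored.
        Match m = Rgx.Match("/EAN/a/b/c/d/extra");
        Console.WriteLine(m.Groups["product"].Value);
    }
}
```

If extra segments should be rejected, appending @"$" to the pattern restores the strict behaviour of the original regex.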
