Regex match multiple groups

I have the following example of a regex and a string that I am trying to match with it:

Regex: ^\d{3}( [0-9a-fA-F]{2}){3}

String to match: 010 00 00 00

My question is this: the regex matches but captures only one group, the final 00 at the end of the string. However, I want it to capture all three of the 00 groups at the end. Why doesn't this work? Surely the brackets should mean that they are all captured equally?

I know that I could just type out the three groups separately, but this is just a short extract of a much longer string, so that would be a pain. I was hoping that the quantifier would provide a more elegant solution, but it seems my understanding is lacking somewhat!

Thanks!

Because you have a quantifier on a capture group, you're only seeing the capture from the last iteration. Luckily for you though, .NET (unlike most other regex implementations) provides a mechanism for retrieving the captures from all iterations, via the CaptureCollection class. From the documentation:

If a quantifier is applied to a capturing group, the CaptureCollection includes one Capture object for each captured substring, and the Group object provides information only about the last captured substring.

And the example provided in the documentation:

  using System;
  using System.Text.RegularExpressions;

  public class Example
  {
     public static void Main()
     {
        // Sample sentence (an assumption here; the excerpt did not include the input).
        string input = "The young, hairy, and tall dog slowly walked across the yard.";

        // Match a sentence with a pattern that has a quantifier that
        // applies to the entire group.
        string pattern = @"(\b\w+\W{1,2})+";
        Match match = Regex.Match(input, pattern);
        Console.WriteLine("Pattern: " + pattern);
        Console.WriteLine("Match: " + match.Value);

        // Match.Captures belongs to group 0, i.e. the whole match.
        Console.WriteLine("  Match.Captures: {0}", match.Captures.Count);
        for (int ctr = 0; ctr < match.Captures.Count; ctr++)
           Console.WriteLine("    {0}: '{1}'", ctr, match.Captures[ctr].Value);

        // Group.Value is the last capture; Group.Captures lists every iteration.
        Console.WriteLine("  Match.Groups: {0}", match.Groups.Count);
        for (int groupCtr = 0; groupCtr < match.Groups.Count; groupCtr++)
        {
           Group group = match.Groups[groupCtr];
           Console.WriteLine("    Group {0}: '{1}'", groupCtr, group.Value);
           Console.WriteLine("    Group({0}).Captures: {1}", groupCtr, group.Captures.Count);
           for (int captureCtr = 0; captureCtr < group.Captures.Count; captureCtr++)
              Console.WriteLine("      Capture {0}: '{1}'", captureCtr, group.Captures[captureCtr].Value);
        }
     }
  }
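
Applied to the pattern from the question, the same idea might look like the sketch below. The class name HexGroupDemo, the method name CapturedPairs, and the sample input are made up for illustration; only Regex.Match, Match.Groups, and Group.Captures are part of .NET.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HexGroupDemo
{
    // Hypothetical helper: returns every substring captured by group 1,
    // in the order the iterations matched.
    public static List<string> CapturedPairs(string input)
    {
        // The quantified group from the question, written as a verbatim string.
        Match match = Regex.Match(input, @"^\d{3}( [0-9a-fA-F]{2}){3}");
        var pairs = new List<string>();
        if (!match.Success)
            return pairs;

        // match.Groups[1].Value would hold only the last iteration;
        // match.Groups[1].Captures keeps one Capture per iteration.
        foreach (Capture capture in match.Groups[1].Captures)
            pairs.Add(capture.Value.Trim()); // drop the leading space inside the group

        return pairs;
    }

    public static void Main()
    {
        foreach (string pair in CapturedPairs("010 0a 1B ff"))
            Console.WriteLine(pair);
    }
}
```

The pattern itself is unchanged, so the anchoring on the three-digit prefix is preserved; only the way the result is read differs.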

The pattern below should work for your current string. I'd need a better example (more strings, etc.) to see whether it would break on those. A word boundary (\b) matches the position between a word character and a non-word character, or the start or end of the string, so each standalone two-character hex group comes back as its own match:

\b[0-9a-fA-F]{2}\b
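
A minimal sketch of how this pattern could be used with Regex.Matches, which returns one Match per hex pair. HexPairScan and FindPairs are hypothetical names. Note the assumption that the string contains no other two-character words made of hex digits (such as "be" or "ad"), since the pattern is not tied to the three-digit prefix.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HexPairScan
{
    // Hypothetical helper: collects every standalone two-character hex group.
    public static List<string> FindPairs(string input)
    {
        var pairs = new List<string>();
        // "010" is skipped: there is no word boundary after its first two characters.
        foreach (Match m in Regex.Matches(input, @"\b[0-9a-fA-F]{2}\b"))
            pairs.Add(m.Value);
        return pairs;
    }

    public static void Main()
    {
        foreach (string pair in FindPairs("010 00 00 00"))
            Console.WriteLine(pair);
    }
}
```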
