Splitting a pattern based on a regular expression

Question

I am trying to write a program to parse Java garbage collection logs. I just created a grammar that matches a minor collection. Once I have identified a pattern, I would like to parse it into individual tokens. My question is: is there an elegant way to do this with my previously defined grammar?

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTestHarness {
  // Matches one minor (ParNew) collection line; every literal dot is escaped.
  private final static String REGEX_SMALL_COLLECTION = "\\d+\\.\\d+: \\[GC \\d+\\.\\d+: \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";

  public static void main(String[] args){
    Pattern pattern = Pattern.compile(REGEX_SMALL_COLLECTION);
    Matcher matcher = pattern.matcher("54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] 5238622K->380448K(10480704K), 1.1306410 secs]");
    while (matcher.find()) {
      // Print the whole match and its start/end offsets in the input.
      System.out.println(matcher.group(0));
      System.out.println(matcher.start());
      System.out.println(matcher.end());
    }
  }
}

Answer 1

You need to add capturing groups to your regex:

private final static String REGEX_SMALL_COLLECTION = "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";

Then access the groups to get the values. In the example above, I added parentheses around the first two items you want; this tells the regex engine to capture the matching substrings. You will need to add more. As you are already doing, use Matcher.group() to get each group. Note that group 0 is always the entire match. The rest are numbered from 1 up, in the order of their opening parentheses.
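As a rough sketch of where "add more" could lead, the version below wraps every numeric field in its own group and collects groups 1 through groupCount() into an array. The class name GcLineGroups and the helper tokens() are made up for illustration. Unlike the pattern in the question, it also assumes the ParNew "after" size may be any number rather than the literal 0K.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
public class GcLineGroups {
    // One capturing group per value of interest. Group numbers follow the
    // order of the opening parentheses, starting at 1.
    // Assumption: the ParNew "after" size is captured as (\d+) instead of a literal 0.
    private static final Pattern MINOR_GC = Pattern.compile(
        "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): "
        + "\\[ParNew: (\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\] "
        + "(\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    // Returns the captured tokens (groups 1..n), or an empty array if the
    // line does not contain a minor collection entry.
    static String[] tokens(String line) {
        Matcher m = MINOR_GC.matcher(line);
        if (!m.find()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            out[i - 1] = m.group(i); // group 0 (the whole match) is skipped
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] "
            + "5238622K->380448K(10480704K), 1.1306410 secs]";
        String[] t = tokens(line);
        for (int i = 0; i < t.length; i++) {
            System.out.println("group " + (i + 1) + " = " + t[i]);
        }
    }
}
```

The groups capture only the digits, so the K suffix and the brackets stay outside the tokens and the values can be passed straight to Long.parseLong or Double.parseDouble if numbers are needed.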

Answer 1: accepted, 2012-07-26 13:33:12
