Splitting pattern based on Regular Expressions

Question

I am trying to write a program to parse Java Garbage Collection logs. I just created a grammar that matches a minor collection. Once I have identified a pattern I would like to parse it into individual tokens. My question is, is there any elegant way to do this with my previously defined grammar?

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTestHarness {
  // Matches one ParNew minor-collection line; literal dots are escaped as \\.
  private final static String REGEX_SMALL_COLLECTION = "\\d+\\.\\d+: \\[GC \\d+\\.\\d+: \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";

  public static void main(String[] args){
    Pattern pattern = Pattern.compile(REGEX_SMALL_COLLECTION);
    Matcher matcher = pattern.matcher("54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] 5238622K->380448K(10480704K), 1.1306410 secs]");
    while (matcher.find()) {
      System.out.println(matcher.group(0)); // the whole matched line
      System.out.println(matcher.start());
      System.out.println(matcher.end());
    }
  }
}

Answer 1

You need to add groups to your regex.

private final static String REGEX_SMALL_COLLECTION = "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";

and then read the values from the groups. In the example above, I added parentheses around the first two items you want; this tells the regex engine to capture the matching substrings. You will need to add more for the remaining fields. As you are already doing, use Matcher.group() to get each group, passing the group number. Note that group 0 is always the entire match. The rest are numbered from 1 up, in order of their opening parentheses.
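As a rough sketch of where this ends up, here is the same pattern with a group around every numeric field. The class name GcLineGroups and the helper tokens() are made up for illustration. It also assumes you want the ParNew "after" size captured, so the literal 0K from the question is loosened to (\\d+)K.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcLineGroups {
    // Question's pattern with one capturing group per numeric field (10 groups).
    // Assumption: the ParNew "after" size is captured instead of hard-coding 0K.
    static final Pattern MINOR_GC = Pattern.compile(
        "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): "
        + "\\[ParNew: (\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\] "
        + "(\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    // Hypothetical helper: returns groups 1..groupCount() of the first match,
    // or an empty array when the line is not a minor collection.
    static String[] tokens(String line) {
        Matcher m = MINOR_GC.matcher(line);
        if (!m.find()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            out[i - 1] = m.group(i); // group 0 (the whole match) is skipped
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] "
            + "5238622K->380448K(10480704K), 1.1306410 secs]";
        for (String token : tokens(line)) {
            System.out.println(token);
        }
    }
}
```

The groups come back as strings, so you would still convert them with Long.parseLong or Double.parseDouble as needed.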
