
Splitting pattern based on Regular Expressions

I am trying to write a program to parse Java Garbage Collection logs. I just created a grammar that matches a minor collection. Once I have identified a pattern I would like to parse it into individual tokens. My question is, is there any elegant way to do this with my previously defined grammar?

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTestHarness {
  // Matches one ParNew minor-collection line. Every decimal point is escaped so it matches a literal '.'.
  private static final String REGEX_SMALL_COLLECTION = "\\d+\\.\\d+: \\[GC \\d+\\.\\d+: \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";

  public static void main(String[] args) {
    Pattern pattern = Pattern.compile(REGEX_SMALL_COLLECTION);
    Matcher matcher = pattern.matcher("54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] 5238622K->380448K(10480704K), 1.1306410 secs]");
    while (matcher.find()) {
      System.out.println(matcher.group(0)); // the whole matched text
      System.out.println(matcher.start());  // start offset of the match
      System.out.println(matcher.end());    // end offset (exclusive)
    }
  }
}

You need to add capturing groups to your regex:

private static final String REGEX_SMALL_COLLECTION = "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";

Then read the captured values from the groups. In the example above, I added parentheses around the first two items you want; this tells the regex engine to capture the matching substrings. You will need to add more for the remaining fields. As you are already doing, use Matcher.group() to get each group. Note that group 0 is always the entire match; the rest are numbered from 1 up, in order of their opening parentheses.
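As a rough sketch of where this ends up, here is one way the pattern might look with a group around every numeric field. The class name GcLogGroups and the parse helper are made up for illustration, and the choice of which fields to capture is an assumption about what you want out of the line. The young generation's "after" size stays hard-coded as 0K, as in your original pattern, so it is not captured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
public class GcLogGroups {
    // Same grammar as REGEX_SMALL_COLLECTION, with nine capturing groups:
    // 1 timestamp, 2 GC timestamp, 3 young before (K), 4 young capacity (K),
    // 5 ParNew pause (secs), 6 heap before (K), 7 heap after (K),
    // 8 heap capacity (K), 9 total pause (secs).
    private static final Pattern MINOR_GC = Pattern.compile(
        "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): \\[ParNew: (\\d+)K->0K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\] "
        + "(\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    // Returns the captured tokens in group order, or an empty list if the line does not match.
    public static List<String> parse(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = MINOR_GC.matcher(line);
        if (m.find()) {
            // Group 0 is the whole match, so the tokens start at group 1.
            for (int i = 1; i <= m.groupCount(); i++) {
                tokens.add(m.group(i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] "
            + "5238622K->380448K(10480704K), 1.1306410 secs]";
        for (String token : parse(line)) {
            System.out.println(token);
        }
    }
}
```

If positional group numbers become hard to track, Java 7 and later also accept named groups, written (?<name>...) in the pattern and read back with Matcher.group("name").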

