
Splitting pattern based on Regular Expressions

I am trying to write a program to parse Java garbage collection logs. I have just created a grammar (a regular expression) that matches a minor collection. Once a line has matched the pattern, I would like to split it into individual tokens. Is there an elegant way to do this with the grammar I have already defined?

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTestHarness {
  // Matches one ParNew minor-collection line; the decimal points are escaped so they match only a literal '.'.
  private static final String REGEX_SMALL_COLLECTION = "\\d+\\.\\d+: \\[GC \\d+\\.\\d+: \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";

  public static void main(String[] args) {
    Pattern pattern = Pattern.compile(REGEX_SMALL_COLLECTION);
    Matcher matcher = pattern.matcher("54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] 5238622K->380448K(10480704K), 1.1306410 secs]");
    while (matcher.find()) {
      System.out.println(matcher.group(0)); // the whole matched text
      System.out.println(matcher.start());  // index where the match starts
      System.out.println(matcher.end());    // index just past the end of the match
    }
  }
}

You need to add capturing groups to your regex.

private static final String REGEX_SMALL_COLLECTION = "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";

Then read the values from the groups. In the example above, I added parentheses around the first two items you want; this tells the regex engine to capture the matching substrings. You will need to add more. As you are already doing, use Matcher.group() to get each group. Note that group 0 is always the entire match. The rest are numbered from 1 upward, in the order of their opening parentheses.
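
To illustrate the group numbering described above, here is a minimal sketch with every numeric field of the sample line wrapped in its own group. The class name GcLogParser and the helper parse() are made up for this example. The pattern also assumes that the young-generation "after" size can be any number rather than the literal 0K in the question, which is a small change to the original grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
class GcLogParser {
    // One capturing group per field; groups are numbered 1..10 in the
    // order of their opening parentheses. Assumption: the post-collection
    // young-generation size is captured as (\d+) instead of the fixed 0.
    private static final Pattern MINOR_GC = Pattern.compile(
        "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): \\[ParNew: (\\d+)K->(\\d+)K\\((\\d+)K\\), "
        + "(\\d+\\.\\d+) secs\\] (\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    // Returns the captured fields in group order, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = MINOR_GC.matcher(line);
        if (!m.find()) {
            return null;
        }
        String[] tokens = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            tokens[i - 1] = m.group(i); // group 0 (the whole match) is skipped
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] "
            + "5238622K->380448K(10480704K), 1.1306410 secs]";
        String[] tokens = parse(line);
        if (tokens == null) {
            System.out.println("no match");
            return;
        }
        for (int i = 0; i < tokens.length; i++) {
            System.out.println("group " + (i + 1) + " = " + tokens[i]);
        }
    }
}
```

Because the unit suffix K and the punctuation sit outside the parentheses, each token comes back as a bare number string that can be passed to Long.parseLong or Double.parseDouble as needed.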
