Splitting pattern based on Regular Expressions
I am trying to write a program to parse Java garbage collection logs. I have just written a regular expression (my "grammar") that matches a minor collection entry. Once I have identified a matching line, I would like to split it into individual tokens. My question is: is there an elegant way to do this with the pattern I have already defined?
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTestHarness {
    // Matches one ParNew minor-collection line; every literal dot is escaped as \\.
    private final static String REGEX_SMALL_COLLECTION = "\\d+\\.\\d+: \\[GC \\d+\\.\\d+: \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(REGEX_SMALL_COLLECTION);
        Matcher matcher = pattern.matcher("54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] 5238622K->380448K(10480704K), 1.1306410 secs]");
        while (matcher.find()) {
            System.out.println(matcher.group(0)); // the whole matched text
            System.out.println(matcher.start());  // index where the match begins
            System.out.println(matcher.end());    // index just past the match
        }
    }
}
You need to add capturing groups to your regex:
private final static String REGEX_SMALL_COLLECTION = "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";
and then read the values from those groups. In the above example, I added parentheses around the first two items you want; this tells the regex engine to capture the matching substrings. You will need to add more, one pair for each token you want to extract.
As you are currently doing, you use Matcher.group() with the group's index to get each group. Note that group 0 is always the entire match. The rest are numbered from 1 up, in order of their opening parentheses.
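To illustrate where "adding more" leads, here is a minimal sketch that wraps every numeric field of the sample line in its own group and reads the groups back by index. The class name GcLineParser and the helper tokens are hypothetical names chosen for this example. Replacing the hard-coded 0K with (\\d+)K is an assumption about what other log lines might contain, not something the original pattern does.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for this illustration.
class GcLineParser {
    // One capturing group per numeric field. Groups are numbered 1..10
    // in the order of their opening parentheses.
    private static final Pattern MINOR_GC = Pattern.compile(
        "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): \\[ParNew: "
        + "(\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\] "
        + "(\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    // Returns the captured tokens (groups 1..n), or an empty array
    // when the line does not match the pattern as a whole.
    static String[] tokens(String line) {
        Matcher m = MINOR_GC.matcher(line);
        if (!m.matches()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            out[i - 1] = m.group(i); // group 0 (the whole match) is skipped
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] "
            + "5238622K->380448K(10480704K), 1.1306410 secs]";
        for (String token : tokens(line)) {
            System.out.println(token);
        }
    }
}
```

For the sample line this yields ten tokens: the two timestamps, the three ParNew sizes, the ParNew pause, the three heap sizes, and the total pause. The sketch uses matches() rather than find() because each log entry is assumed to occupy a whole line. If entries can appear inside longer text, find() as in the question would be the better fit.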