繁体   English   中英

Java正则表达式以匹配多行

[英]Java Regex to match multiple lines

以下是应在其上应用正则表达式的数据的示例:

2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Filter -> Map (1/1) (824780055001546646d35df7a64cfe3c) switched from CANCELING to CANCELED.
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Try to restart or fail the job  (3064130e1dccead0b037f193d3699c3b) if no longer possible.
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Could not restart the job  (3064130e1dccead0b037f193d3699c3b) because the restart strategy prevented it.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 3064130e1dccead0b037f193d3699c3b.
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore  - Shutting down
2019-05-27 10:49:18,419 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher      - Job 3064130e1dccead0b037f193d3699c3b reached globally terminal state FAILED.

基本上我要提取的是时间戳和错误消息:

对于一个实例:

TimeStamp               Error
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)

在这里,错误消息被分为多行,为此,我编写了如下的Java模式:

((?m)\\d{4}-[01]\\d-[0-3]\\d\\s[0-2]\\d((:[0-5]\\d)?){2}[\\s\\S]*ERROR[\\s\\S]*[ ]*at [\\s\\S]*)

但这会返回我文件的所有内容。

我应该怎么做才能使其工作,以便它也能给我多行错误消息。

尝试这个

((\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5})\sERROR.+?(?=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5}))

Explantion:

  • (\\d{4}-\\d{2}-\\d{2}\\s\\d{2}:\\d{2}:\\d{2},\\d{3,5}) -匹配时间戳
  • \\sERROR.+?(?=\\d{4}-\\d{2}-\\d{2}\\s\\d{2}:\\d{2}:\\d{2},\\d{3,5}) -进行非贪婪匹配,直到找到下一个时间戳(正向超前)
  • 我也想强调一点,在使用此正则表达式时,您将必须使用m选项进行多行匹配
  • 此匹配项将为您提供每个匹配项的嵌套组,例如[[log, timestamp],[log, timestamp]]

您的模式看起来不对,并且您还应该在点全部模式下使用模式,因为要捕获的堆栈跟踪部分可能跨越一行而不是一行。 我建议使用以下正则表达式模式:

\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ERROR.*?(?=\bat\b)

这符合一个时间戳,其次是ERROR的内容,然后直到到达第一at

这是一个有效的测试脚本:

String input = "2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.\njava.lang.IllegalArgumentException: json can not be null or empty\n    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)";
String pattern = "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3} ERROR.*?(?=\\bat\\b)";
Pattern r = Pattern.compile(pattern, Pattern.DOTALL);
Matcher m = r.matcher(input);
if (m.find()) {
    System.out.println(m.group(0));
}

输出:

2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.
java.lang.IllegalArgumentException: json can not be null or empty

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM