Regex to extract capture groups from two different log entries

I have two types of log entries (in different formats) in one log file:

  1. First Log Format:
2019-09-01 18:58:05,898 INFO  Thread: qtp1497973286-16 - com.xyz.soap
 <with additional stuff>
  <more stuff>
 <even morestuff>

timestamp:2019-09-01 18:58:05,898, level:INFO, thread:qtp1497973286-16, message:com.xyz.soap... <to the end of last line>

  2. Second Log Format:
2021-03-23 23:47:38.111:ERROR::main: Logging initialized @5687ms to org.eclipse.jetty.util.log.StdErrLog
WARNING: An illegal reflective access operation has occurred
More lines here

timestamp:2021-03-23 23:47:38.111, level:ERROR, thread:main, message:Logging... <to the end of last line>

I'm trying to find a regex pattern that produces the same set of capture groups for both formats: timestamp, thread, level, message.

For example, this pattern "almost" works for the first format:

(?<timestamp>[^ ]* [^ ]*) (?<level>[^\s][A-Z]+)[\s]+(?<thread>\s.*) (?<message>[\s\S]*)$

And I'm using the amazing regex101 tool: https://regex101.com/r/AW9VKp/1

I need a single pattern that produces the same groups for both log formats.

OK, found it:

(?<timestamp>^\d{4}-.*\d{3})(?: |:)(?<level>[^\s][A-Z]+)(?:\s{2}Thread: |:{2})(?<thread>[^\s]+)(?: - | )(?<message>[\s\S]*)$
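For anyone who wants to try this outside regex101, here is a minimal Python sketch of the pattern above. It rests on a few assumptions: Python's `re` module needs the `(?P<name>...)` spelling for named groups, so only that syntax is changed; each log entry is assumed to be already split out into its own string; and `parse_entry`, `FIRST_ENTRY` and `SECOND_ENTRY` are illustrative names, not part of the original post.

```python
import re

# Same pattern as above; only (?<name>...) is rewritten to Python's (?P<name>...).
LOG_PATTERN = re.compile(
    r"(?P<timestamp>^\d{4}-.*\d{3})(?: |:)"
    r"(?P<level>[^\s][A-Z]+)(?:\s{2}Thread: |:{2})"
    r"(?P<thread>[^\s]+)(?: - | )"
    r"(?P<message>[\s\S]*)$"
)

# Sample entries copied from the question (one string per entry).
FIRST_ENTRY = (
    "2019-09-01 18:58:05,898 INFO  Thread: qtp1497973286-16 - com.xyz.soap\n"
    " <with additional stuff>\n"
    "  <more stuff>\n"
    " <even morestuff>"
)
SECOND_ENTRY = (
    "2021-03-23 23:47:38.111:ERROR::main: Logging initialized @5687ms "
    "to org.eclipse.jetty.util.log.StdErrLog\n"
    "WARNING: An illegal reflective access operation has occurred\n"
    "More lines here"
)


def parse_entry(entry):
    """Return the four named groups as a dict, or None if nothing matches."""
    match = LOG_PATTERN.match(entry)
    if match is None:
        return None
    groups = match.groupdict()
    # Assumption: in the second format [^\s]+ also swallows the colon after
    # the thread name ("main:"), so it is stripped here as post-processing.
    groups["thread"] = groups["thread"].rstrip(":")
    return groups


if __name__ == "__main__":
    for entry in (FIRST_ENTRY, SECOND_ENTRY):
        print(parse_entry(entry))
```

One thing to watch: on the second sample the raw `thread` group comes back as `main:` rather than `main`, because `[^\s]+` matches the colon too. The sketch strips it afterwards; excluding `:` from the character class would be another option, but that is a change to the pattern and not something the original answer does.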
