How do I split a filename using Logstash Grok?
One of these days I'll learn regex.
I have the following filename:
PE-run1000hbgmm3f1-job1000hbgmm3dt-Output-Workflow-1000hbgmm3fb-22.07.17.log
I'm able to get this to work so far:
(?<logtype>[^-]+)-(?<run_id>[^-]+)-(?<job_id>[^-]+)-(?<capability>[^(0-9\.0-9\.0-9)]+)
logtype: PE
run_id: run1000hbgmm3f1
job_id: job1000hbgmm3dt
But I'm getting
capability: Output-Workflow-
...though I want it to be
capability: Output-Workflow-1000hbgmm3fb
...that is, all the text after the job_id up to the timestamp HH.mm.ss. Any help please? Thanks!
This is because you cannot negate a sequence of symbols with a negated character class.
[^(0-9\.0-9\.0-9)] matches any single char other than (, a digit, . and ).
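To see why the capture stops early, here is a minimal sketch that replays the original pattern in Python. Python is only a stand-in here: Grok runs on an Oniguruma-style engine, and Python spells named groups (?P<name>...) rather than (?<name>...), so the group syntax below is adapted for that reason. The variable names are mine, not part of any Logstash config.

```python
import re

filename = "PE-run1000hbgmm3f1-job1000hbgmm3dt-Output-Workflow-1000hbgmm3fb-22.07.17.log"

# Same pattern as in the question, with Python's (?P<name>...) group syntax.
original = re.compile(
    r"(?P<logtype>[^-]+)-(?P<run_id>[^-]+)-(?P<job_id>[^-]+)-"
    r"(?P<capability>[^(0-9\.0-9\.0-9)]+)"
)

match = original.match(filename)

# The negated class rejects every digit, so the capture ends at the
# first digit after "Output-Workflow-".
capability = match.group("capability")
print(capability)  # Output-Workflow-
```

The class is just a set of forbidden characters, so the first digit of 1000hbgmm3fb ends the match, long before the timestamp is reached.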
You may replace your (?<capability>[^(0-9\.0-9\.0-9)]+) with (?<capability>.*?)-\d{2}\.\d{2}\.\d{2} to get the right value.
Now, the (?<capability>.*?)-\d{2}\.\d{2}\.\d{2} pattern will match any 0+ chars (capturing them into the "capability" group), as few as possible (since *? is a lazy quantifier), up to the first occurrence of - followed by 2 digits and then 2 sequences of a dot (\.) followed by 2 digits.
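As a quick sanity check, the corrected pattern can be tried against the sample filename. This is a sketch in Python, not a Logstash config: the named groups use Python's (?P<name>...) spelling, and the variable names are illustrative only.

```python
import re

filename = "PE-run1000hbgmm3f1-job1000hbgmm3dt-Output-Workflow-1000hbgmm3fb-22.07.17.log"

# Lazy .*? grows one char at a time until "-dd.dd.dd" can match right after it.
fixed = re.compile(
    r"(?P<logtype>[^-]+)-(?P<run_id>[^-]+)-(?P<job_id>[^-]+)-"
    r"(?P<capability>.*?)-\d{2}\.\d{2}\.\d{2}"
)

fields = fixed.match(filename).groupdict()
for name, value in fields.items():
    print(f"{name}: {value}")
# logtype: PE
# run_id: run1000hbgmm3f1
# job_id: job1000hbgmm3dt
# capability: Output-Workflow-1000hbgmm3fb
```

Note that -1000hbgmm3fb does not end the capture early: after -10 the pattern needs a dot, but the next char is 0, so the lazy group keeps expanding until it reaches -22.07.17.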
See the regex demo at regex101.com.