Fluentd: problem with regex while parsing log
I have this fluentd configuration:
<source>
@type tail
<parse>
@type regexp
expression /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \"(?<method>\w+) (?<path>[^ ]*) (?<http>[^ ]*)" (?<status_code>[^ ]*) (?<size>[^ ]*)(?:\s"(?<referer>[^\"]*)") "(?<agent>[^\"]*)" (?<urt>[^\"]*).*/
time_format %d/%b/%Y:%H:%M:%S %z
keep_time_key true
types size:integer,reqtime:float,uct:float,uht:float,urt:float
</parse>
path /var/log/nginx/access.log
pos_file /tmp/fluent_nginx.pos
tag nginx
</source>
My log format:
193.137.78.17 - - [07/Jan/2023:09:21:59 +0000] "GET /net/api/employee HTTP/1.1" 200 2323 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" 0.014
193.137.78.17 - - [07/Jan/2023:09:22:00 +0000] "GET /net/api/employee HTTP/1.1" 200 2323 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" 0.005
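I have tested my regular expression on regex101 and it works there. However, fluentd gives me a "no patterns matched" warning. I don't understand why the logs are not being parsed correctly.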
Jan 07 09:26:26 srv-api fluentd[14878]: 2023-01-07 09:26:26 +0000 [warn]: #0 no patterns matched tag="nginx"
Can anyone help me? Thanks!
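Your pattern insists that there is no space before <remote>, but your logs do have 4 spaces before the remote IP.
In my opinion, the simplest fix is to allow an optional, variable number of spaces at the start of the pattern.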
^( )*(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \"(?<method>\w+) (?<path>[^ ]*) (?<http>[^ ]*)" (?<status_code>[^ ]*) (?<size>[^ ]*)(?:\s"(?<referer>[^\"]*)") "(?<agent>[^\"]*)" (?<urt>[^\"]*).*
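The parentheses ( and ) are only there to make life easier for whoever reads the code: they will see that there is a space character between them, which they might otherwise not notice.
The * means 0 or more.
Together, this matches and discards 0 or more spaces at the start of the line.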
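To illustrate the effect of the optional leading spaces, here is a minimal Python sketch. It is only an approximation of what fluentd does: the patterns are shortened to the first four fields, the names STRICT and LENIENT are made up for this example, and Python writes named groups as (?P<name>...) where the Ruby regex used by fluentd writes (?<name>...).

```python
import re

# Shortened stand-ins for the two patterns (assumption: only remote, host,
# user and time are modelled; the rest of the line is ignored).
STRICT = re.compile(
    r'^(?P<remote>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) \[(?P<time>[^\]]*)\]'
)
LENIENT = re.compile(
    r'^( )*(?P<remote>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) \[(?P<time>[^\]]*)\]'
)

line = '193.137.78.17 - - [07/Jan/2023:09:21:59 +0000] "GET /net/api/employee HTTP/1.1" 200 2323'
indented = '    ' + line  # four leading spaces, as described in the answer

strict_plain = STRICT.match(line)          # matches
strict_indented = STRICT.match(indented)   # None: the leading spaces break it
lenient_plain = LENIENT.match(line)        # still matches with zero spaces
lenient_indented = LENIENT.match(indented) # matches, spaces are skipped

print(strict_indented is None)
print(lenient_indented.group('remote'))
```

The strict pattern fails on the indented line because each [^ ]* can only match an empty string until the fourth space is reached, where the pattern expects a literal [ instead.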
I noticed that you sometimes escape " with \ and sometimes don't. Is there a reason for that?
You should use the nginx parser plugin directly.
Here is a complete working example using the sample input plugin and the nginx parser plugin:
fluent-nginx-test.conf
<source>
@type sample
sample [
{ "message": "193.137.78.17 - - [07/Jan/2023:09:22:00 +0000] \"GET /net/api/employee HTTP/1.1\" 200 2323 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36\" 0.005" },
{ "message": "193.137.78.18 - - [07/Jan/2023:09:22:00 +0000] \"GET /net/api/employee HTTP/1.1\" 200 2323 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36\" 0.005" }
]
rate 1
size 2
tag nginx
</source>
<filter nginx>
@type parser
key_name message
<parse>
@type nginx
</parse>
</filter>
<match nginx>
@type stdout
</match>
Run
$ fluentd -c ./fluent-nginx-test.conf
Output
2023-01-07 14:22:00.000000000 +0500 nginx: {"remote":"193.137.78.17","host":"-","user":"-","method":"GET","path":"/net/api/employee","code":"200","size":"2323","referer":"-","agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36","http_x_forwarded_for":"0.005"}
2023-01-07 14:22:00.000000000 +0500 nginx: {"remote":"193.137.78.18","host":"-","user":"-","method":"GET","path":"/net/api/employee","code":"200","size":"2323","referer":"-","agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36","http_x_forwarded_for":"0.005"}
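Apart from that, I used your regular expression with the regexp parser plugin and it also works fine (although the types field contains redundant values):
fluent-nginx-test-with-regexp.conf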
<source>
@type sample
sample [
{ "message": "193.137.78.17 - - [07/Jan/2023:09:22:00 +0000] \"GET /net/api/employee HTTP/1.1\" 200 2323 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36\" 0.005" },
{ "message": "193.137.78.18 - - [07/Jan/2023:09:22:00 +0000] \"GET /net/api/employee HTTP/1.1\" 200 2323 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36\" 0.005" }
]
rate 1
size 2
tag nginx
</source>
<filter nginx>
@type parser
key_name message
<parse>
@type regexp
expression /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \"(?<method>\w+) (?<path>[^ ]*) (?<http>[^ ]*)" (?<status_code>[^ ]*) (?<size>[^ ]*)(?:\s"(?<referer>[^\"]*)") "(?<agent>[^\"]*)" (?<urt>[^\"]*).*/
time_format %d/%b/%Y:%H:%M:%S %z
keep_time_key true
types size:integer,reqtime:float,uct:float,uht:float,urt:float
</parse>
</filter>
<match nginx>
@type stdout
</match>
Run
$ fluentd -c ./fluent-nginx-test-with-regexp.conf
Output
2023-01-07 14:22:00.000000000 +0500 nginx: {"remote":"193.137.78.17","host":"-","user":"-","time":"07/Jan/2023:09:22:00 +0000","method":"GET","path":"/net/api/employee","http":"HTTP/1.1","status_code":"200","size":2323,"referer":"-","agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36","urt":0.005}
2023-01-07 14:22:00.000000000 +0500 nginx: {"remote":"193.137.78.18","host":"-","user":"-","time":"07/Jan/2023:09:22:00 +0000","method":"GET","path":"/net/api/employee","http":"HTTP/1.1","status_code":"200","size":2323,"referer":"-","agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36","urt":0.005}
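The parsing and type conversion seen in that output can be approximated outside fluentd. The following Python sketch is not fluentd's implementation: parse, PATTERN and TYPES are names invented for the example, the named groups use Python's (?P<name>...) spelling, and it assumes that types entries naming fields the pattern never captures (reqtime, uct, uht) are simply skipped, which is consistent with the output above.

```python
import re

# The regexp from the question, rewritten with Python's named-group syntax.
PATTERN = re.compile(
    r'^(?P<remote>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\w+) (?P<path>[^ ]*) (?P<http>[^ ]*)" '
    r'(?P<status_code>[^ ]*) (?P<size>[^ ]*)(?:\s"(?P<referer>[^"]*)") '
    r'"(?P<agent>[^"]*)" (?P<urt>[^"]*).*'
)

# Mirrors: types size:integer,reqtime:float,uct:float,uht:float,urt:float
TYPES = {"size": int, "reqtime": float, "uct": float, "uht": float, "urt": float}


def parse(line):
    """Return a dict of captured fields with types applied, or None if no match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    record = m.groupdict()
    for key, cast in TYPES.items():
        if key in record:  # entries for fields that were not captured are ignored
            record[key] = cast(record[key])
    return record


LINE = (
    '193.137.78.17 - - [07/Jan/2023:09:22:00 +0000] '
    '"GET /net/api/employee HTTP/1.1" 200 2323 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" 0.005'
)

record = parse(LINE)
print(record["size"], record["urt"], record["status_code"])
```

Here size and urt come back as numbers while status_code stays a string, just as in the fluentd output, because status_code is not listed in types.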
However, the message contains the error no patterns matched tag="nginx":
Jan 07 09:26:26 srv-api fluentd[14878]: 2023-01-07 09:26:26 +0000 [warn]: #0 no patterns matched tag="nginx"
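This means that your configuration file has no corresponding match section. The warning is about routing the event by its tag, not about the regular expression. You need a match section for the tag or output you want to process.
Example: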
<source>
@type tail
# ...
tag nginx
</source>
# ...
<match nginx>
@type stdout
</match>
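As a rough mental model of why the warning appears, the routing step can be sketched in Python. This is a deliberately simplified illustration, not fluentd's router: route and describe are hypothetical helpers, and only exact tag matches and the catch-all ** pattern are modelled.

```python
def route(tag, match_patterns):
    """Return the first <match> pattern that accepts the tag, or None.

    Simplification: only exact matches and the catch-all "**" are handled.
    """
    for pattern in match_patterns:
        if pattern == tag or pattern == "**":
            return pattern
    return None


def describe(tag, match_patterns):
    """Produce a message resembling what happens to an event with this tag."""
    hit = route(tag, match_patterns)
    if hit is None:
        return f'no patterns matched tag="{tag}"'
    return f"routed to <match {hit}>"


# A config with no <match> section at all, as in the question.
print(describe("nginx", []))
# A config with <match nginx>, as in the example above.
print(describe("nginx", ["nginx"]))
```

With an empty list of match patterns, the nginx tag has nowhere to go, which corresponds to the warning in the log. Adding a match section for nginx gives the event a destination.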
In addition, you may want to use the vscode-fluentd extension for syntax highlighting in VS Code.
Environment
fluentd
$ fluentd --version
fluentd 1.12.3
$ lsb_release -d
Description: Ubuntu 18.04.6 LTS