
Error parsing an Apache log with a regex?

I am parsing the following Apache log entry:

59.167.203.103 - - [28/May/2013:03:12:47 +0000] "POST /some/some.htm HTTP/1.1" 200 1187 "-" "xyzf/2.00.16 xyzNetwork/609.1.4 xyzwin/13.0.0"

with the regex given below, and it works fine:

String logentrypattern = "^([\\d.]+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\d+) \"([^\"]+)\" \"([^\"]+)\"";

But in a few entries the response bytes field is "-" instead of a number, and those entries fail to parse with the following error. Please help.

Bad log entry (or problem with RE?):
89.178.46.54 - - [24/May/2013:17:04:59 +0000] "PUT /xyz-pmp/xyz-pmp.htm HTTP/1.1" 200 - "-" "kdm/1.0"

You could try this:

^([\\d.]+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\d+|-) \"([^\"]+)\" \"([^\"]+)\"
                                                                                 ^^

I added the alternation so that the field can also be a dash. Maybe it would be better for you to use a \\S+ block there instead? Well, it all depends on what exactly you're doing. If the intent is to accept only the entries with digits, then your regex is working as intended. If it's just to capture the different parts of the entries, make sure you know the structure of the data and the different forms it can arrive in.
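
As a rough sketch of how the adjusted pattern could be used from Java, the class below compiles it once and reads the bytes field from both sample lines. The class name LogEntryParser and the method responseBytes are made up for illustration, and mapping "-" to 0 is an assumption based on Apache's common log format writing "-" when no bytes were sent; adapt it if you need to tell the two cases apart.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
class LogEntryParser {
    // Pattern from the answer; group 7 accepts either digits or a lone "-".
    private static final Pattern LOG_ENTRY = Pattern.compile(
        "^([\\d.]+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" "
        + "(\\d{3}) (\\d+|-) \"([^\"]+)\" \"([^\"]+)\"");

    // Returns the response size in bytes, 0 when the field is "-"
    // (assumption: "-" means nothing was sent), or -1 if the line does not match.
    static long responseBytes(String line) {
        Matcher m = LOG_ENTRY.matcher(line);
        if (!m.find()) {
            return -1L;
        }
        String bytes = m.group(7);
        return "-".equals(bytes) ? 0L : Long.parseLong(bytes);
    }

    public static void main(String[] args) {
        String withBytes = "59.167.203.103 - - [28/May/2013:03:12:47 +0000] "
            + "\"POST /some/some.htm HTTP/1.1\" 200 1187 \"-\" "
            + "\"xyzf/2.00.16 xyzNetwork/609.1.4 xyzwin/13.0.0\"";
        String withDash = "89.178.46.54 - - [24/May/2013:17:04:59 +0000] "
            + "\"PUT /xyz-pmp/xyz-pmp.htm HTTP/1.1\" 200 - \"-\" \"kdm/1.0\"";
        System.out.println(responseBytes(withBytes));
        System.out.println(responseBytes(withDash));
    }
}
```

Checking for "-" before calling Long.parseLong matters here: once the regex accepts the dash, passing group 7 straight to a number parser would throw a NumberFormatException instead of the original "bad log entry" message.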
