
Perl Parsing Apache Log

I am trying to parse an Apache log, but I can't figure out the exact regex for doing it.

use strict;
use warnings;

my $log_line =
'178.255.215.79 - - [14/Jul/2013:03:27:51 -0400] 
"GET /~hines/ringworld_config/lilo.conf HTTP/1.1" 304 - "-" 
"Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)';
#to find out IP address
print( $log_line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/ );
#to find out Timestamp
print( $log_line =~ /\[[\d]{2}\/.*\/[\d]{4}\:[\d]{2}\:[\d]{2}\]*/ );

#Third regex for getting the complete link here :/~hines/ringworld_config/lilo.conf

What am I doing wrong in the second regex? It only ever prints 1. And how do I write a regex for the third requirement?

Finally, I want to convert the timestamp, once retrieved, into a value that I can compare and subtract, for example seconds since the epoch.

The second regex (timestamp) should look something like this:

m~\[\d{2}/[^/]*/\d{4}:\d{2}:\d{2}:\d{2}\s*-\d+\]~

Expanded:

m~\[ \d{2} / [^/]* / \d{4} : \d{2} : \d{2} : \d{2} \s* - \d+ \]~x

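With capture groups. In list context, a successful match that has no capture groups returns just 1, which is why the original print shows 1; with groups it returns the captured values: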

m~\[ (\d{2}) / ([^/]*) / (\d{4}) : (\d{2}) : (\d{2}) : (\d{2}) \s* - (\d+) \]~x
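
For the epoch conversion asked about in the question, one possible approach is to feed the captured fields to timegm from the core Time::Local module and then remove the zone offset. This is only a sketch under a few assumptions: the helper name timestamp_to_epoch and the %month_index table are made up for illustration, the log line is trimmed, and the pattern is a variation of the one above that also accepts a leading + on the offset and splits it into hours and minutes.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Trimmed copy of the sample log line from the question.
my $log_line = '178.255.215.79 - - [14/Jul/2013:03:27:51 -0400] '
             . '"GET /~hines/ringworld_config/lilo.conf HTTP/1.1" 304 -';

# Month abbreviation => zero-based month index, as timegm expects.
my %month_index;
@month_index{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 0 .. 11;

# Hypothetical helper: returns seconds since the epoch (UTC),
# or undef when the line holds no recognizable timestamp.
sub timestamp_to_epoch {
    my ($line) = @_;

    # List-context match: each capture group lands in one variable.
    my ($day, $mon, $year, $hour, $min, $sec, $sign, $tz_h, $tz_m) = $line =~ m~
        \[ (\d{2}) / ([A-Za-z]{3}) / (\d{4})
        : (\d{2}) : (\d{2}) : (\d{2})
        \s+ ([+-]) (\d{2}) (\d{2}) \]
    ~x or return;
    return unless exists $month_index{$mon};

    # Read the wall-clock fields as if they were UTC ...
    my $as_utc = timegm($sec, $min, $hour, $day, $month_index{$mon}, $year);

    # ... then subtract the zone offset (local time = UTC + offset).
    my $offset = ($tz_h * 3600 + $tz_m * 60) * ($sign eq '-' ? -1 : 1);
    return $as_utc - $offset;
}

my $epoch = timestamp_to_epoch($log_line);
print "$epoch\n" if defined $epoch;
```

Two values obtained this way are plain integers, so they can be compared with < and > or subtracted to get a difference in seconds.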


The third regex (link) may be something like this:

m/"GET\s+([^"\s]+)/ where capture group 1 contains the link.
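
As a minimal sketch of how the link could be pulled out, the snippet below matches the sample line against a pattern that starts at "GET and captures everything up to the next whitespace or double quote. The helper name request_path is hypothetical, and the sketch assumes the request method is always GET, as in the question.

```perl
use strict;
use warnings;

# Sample log line from the question, joined into a single string.
my $log_line = '178.255.215.79 - - [14/Jul/2013:03:27:51 -0400] '
             . '"GET /~hines/ringworld_config/lilo.conf HTTP/1.1" 304 - "-" '
             . '"Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)"';

# Hypothetical helper: returns the requested path,
# or undef when the line contains no GET request.
sub request_path {
    my ($line) = @_;

    # The parentheses on the left force list context, so the match
    # returns capture group 1 rather than a true/false value.
    my ($path) = $line =~ m/"GET\s+([^"\s]+)/;
    return $path;
}

my $path = request_path($log_line);
print "$path\n" if defined $path;
```

The capture stops before " HTTP/1.1", so the protocol version is not part of the result.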
