
Parse Apache Common Log Format log files

I'm trying to extract three pieces of information from a log file in Common Log Format. An entry in the log file looks like this:

65.54.188.137 - - [03/Oct/2007:02:20:22 -0400] "GET /~longa/statistics/code/xlispstat/smoothers/spline/ HTTP/2.0" 301 2633

From that, I want to store the number of occurrences of each IP, URL, and status code in a hash. I figure each of them needs its own hash. Any help would be appreciated, even if you can just point me in the right direction.

You can read the information from the log entries with a regular expression. Something like this:

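lines.each do |line|
  matches = /^(\S+).*GET\s(.*)\sHTTP\S*\s(\d+)/.match(line)
  next unless matches # skip lines that are not GET requests in this format
  ip = matches[1]     # client address: the first whitespace-delimited field
  url = matches[2]    # request path between "GET" and the HTTP version
  status = matches[3] # numeric status code following the quoted request
end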

Then you can put this information into a hash and process it however you like.
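
As a rough sketch of that last step, the counting could look something like the following. It reuses the regular expression from above and keeps one hash per field, each created with Hash.new(0) so that missing keys start at zero. The method name count_log_fields, the symbol keys :ip, :url and :status, and the sample lines (apart from the entry quoted in the question) are made up for illustration. Note that this pattern only matches GET requests and that the status codes are stored as strings.

```ruby
# Count how often each IP, URL and status code appears in the given lines.
# Returns a hash holding three counting hashes (hypothetical layout).
def count_log_fields(lines)
  pattern = /^(\S+).*GET\s(.*)\sHTTP\S*\s(\d+)/
  counts = {
    ip: Hash.new(0),     # Hash.new(0) makes unseen keys default to 0
    url: Hash.new(0),
    status: Hash.new(0)
  }

  lines.each do |line|
    matches = pattern.match(line)
    next unless matches # ignore lines the pattern does not recognise

    counts[:ip][matches[1]] += 1
    counts[:url][matches[2]] += 1
    counts[:status][matches[3]] += 1
  end

  counts
end

# Sample input: the entry from the question plus two invented lines.
sample = [
  '65.54.188.137 - - [03/Oct/2007:02:20:22 -0400] "GET /~longa/statistics/code/xlispstat/smoothers/spline/ HTTP/2.0" 301 2633',
  '65.54.188.137 - - [03/Oct/2007:02:20:25 -0400] "GET /index.html HTTP/1.1" 200 512',
  'not a log line'
]

counts = count_log_fields(sample)
counts[:ip].each { |ip, n| puts "#{ip}: #{n}" }
```

For a real file, the lines could be streamed with File.foreach(path) instead of an array, which avoids loading the whole log into memory.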
