Parsing an Apache Common Log Format log file

Question

I'm trying to extract three pieces of information from a log file in Common Log Format. A typical entry looks like this:

65.54.188.137 - - [03/Oct/2007:02:20:22 -0400] "GET /~longa/statistics/code/xlispstat/smoothers/spline/ HTTP/2.0" 301 2633

From that, I want to count the occurrences of each IP address, URL, and status code and store the counts in a hash. I figure each of the three needs its own hash. Any help would be appreciated, even if you can just point me in the right direction.

Answer 1

You can extract the information from each log entry with a regular expression. Something like this:

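# lines: any enumerable of log entries (for example, File.foreach on the log file)
lines.each do |line|
  matches = /^(\S+).*GET\s(.*)\sHTTP\S*\s(\d+)/.match(line)
  next unless matches  # skip entries the pattern does not match (e.g. non-GET requests)
  ip = matches[1]      # client IP address
  url = matches[2]     # requested URL path
  status = matches[3]  # HTTP status code, as a string
end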

Then you can put this information into a hash and process it however you like.
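One way the counting step might look is sketched below. It reuses the regular expression from the answer, so it only recognizes GET requests. The names `LOG_PATTERN` and `count_log_fields` and the `:ips`, `:urls`, and `:statuses` keys are made up for illustration, and the second sample line is invented test data. Each field gets its own hash, created with `Hash.new(0)` so that unseen keys start at zero.

```ruby
# Illustrative sketch: LOG_PATTERN and count_log_fields are hypothetical names.
# The pattern is the one from the answer above, so only GET requests match.
LOG_PATTERN = /^(\S+).*GET\s(.*)\sHTTP\S*\s(\d+)/

# Returns three separate count hashes: one for IPs, one for URLs, one for status codes.
def count_log_fields(lines)
  # Hash.new(0) gives missing keys a default of 0, so += 1 works on first sight.
  counts = { ips: Hash.new(0), urls: Hash.new(0), statuses: Hash.new(0) }
  lines.each do |line|
    matches = LOG_PATTERN.match(line)
    next unless matches # skip entries the pattern does not match
    counts[:ips][matches[1]] += 1
    counts[:urls][matches[2]] += 1
    counts[:statuses][matches[3]] += 1
  end
  counts
end

# Sample input: the entry from the question plus one invented line for the same IP.
sample_lines = [
  '65.54.188.137 - - [03/Oct/2007:02:20:22 -0400] "GET /~longa/statistics/code/xlispstat/smoothers/spline/ HTTP/2.0" 301 2633',
  '65.54.188.137 - - [03/Oct/2007:02:20:25 -0400] "GET /index.html HTTP/1.1" 200 512'
]

counts = count_log_fields(sample_lines)
counts.each { |field, tally| puts "#{field}: #{tally.inspect}" }
```

To run this against a real file, passing `File.foreach(path)` in place of `sample_lines` should work, since it yields one line at a time without loading the whole log into memory.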

Solution 1 (2013-10-22 10:18:55)
