Find patterns across multiple log files?

I have eight log files in the following format:

log01:

[Tue Feb 24 07:39:37 2015] *** MARK ***
[Tue Feb 24 07:40:38 2015] *** MARK ***
[Wed Feb 25 17:13:33 2015] *** MARK ***
[Wed Feb 25 17:14:09 2015] *** MARK ***
[Wed Feb 25 17:16:46 2015] *** MARK ***
[Wed Feb 25 17:17:48 2015] *** MARK ***
[Wed Feb 25 17:22:31 2015] *** MARK ***
[Wed Feb 25 19:10:36 2015] *** MARK ***
[Wed Feb 25 19:10:52 2015] *** MARK ***
[Wed Feb 25 19:11:08 2015] *** MARK ***
[Wed Feb 25 19:11:34 2015] *** MARK ***
[Wed Feb 25 19:12:00 2015] *** MARK ***
[Wed Feb 25 19:12:26 2015] *** MARK ***
[Wed Feb 25 19:13:17 2015] *** MARK ***
[Wed Feb 25 19:13:33 2015] *** MARK ***
[Wed Feb 25 19:15:05 2015] *** MARK ***
[Wed Feb 25 19:37:53 2015] *** MARK ***
[Wed Feb 25 19:38:19 2015] *** MARK ***
[Wed Feb 25 19:38:35 2015] *** MARK ***
[Wed Feb 25 23:08:47 2015] *** MARK ***
[Wed Feb 25 23:09:28 2015] *** MARK ***
[Wed Feb 25 23:11:55 2015] *** MARK ***
[Wed Feb 25 23:12:21 2015] *** MARK ***
[Wed Feb 25 23:12:52 2015] *** MARK ***
[Wed Feb 25 23:13:08 2015] *** MARK ***
...

log02:

[Wed Feb 25 07:01:39 2015] *** MARK ***
[Wed Feb 25 17:13:49 2015] *** MARK ***
[Wed Feb 25 17:15:20 2015] *** MARK ***
[Wed Feb 25 17:16:47 2015] *** MARK ***
[Wed Feb 25 17:17:38 2015] *** MARK ***
[Wed Feb 25 17:19:56 2015] *** MARK ***
[Wed Feb 25 17:22:53 2015] *** MARK ***
[Wed Feb 25 19:10:47 2015] *** MARK ***
[Wed Feb 25 19:11:13 2015] *** MARK ***
[Wed Feb 25 19:11:34 2015] *** MARK ***
[Wed Feb 25 19:11:50 2015] *** MARK ***
[Wed Feb 25 19:12:11 2015] *** MARK ***
[Wed Feb 25 19:12:37 2015] *** MARK ***
[Wed Feb 25 19:12:53 2015] *** MARK ***
[Wed Feb 25 19:13:14 2015] *** MARK ***
[Wed Feb 25 19:13:40 2015] *** MARK ***
[Wed Feb 25 19:14:06 2015] *** MARK ***
[Wed Feb 25 19:14:22 2015] *** MARK ***
[Wed Feb 25 19:14:38 2015] *** MARK ***
[Wed Feb 25 19:38:30 2015] *** MARK ***
[Wed Feb 25 21:17:08 2015] *** MARK ***
[Wed Feb 25 23:08:56 2015] *** MARK ***
[Wed Feb 25 23:10:37 2015] *** MARK ***
[Wed Feb 25 23:11:08 2015] *** MARK ***
[Wed Feb 25 23:11:24 2015] *** MARK ***
[Wed Feb 25 23:12:20 2015] *** MARK ***
[Wed Feb 25 23:12:46 2015] *** MARK ***
...

Every log file is generated by an instance of the same program reading different sensors. A log entry is created if a sensor detects an issue. If every sensor detects an issue within about a minute, it indicates that a global problem has occurred. For example:

The log entries [Tue Feb 24 07:39:37 2015] *** MARK *** and [Tue Feb 24 07:40:38 2015] *** MARK *** from log01 do not correspond to anything in log02, so this is not a global problem and can be ignored. The log entries [Wed Feb 25 07:01:39 2015] *** MARK *** and [Wed Feb 25 21:17:08 2015] *** MARK *** in log02 can also be ignored.

However, entry [Wed Feb 25 19:10:36 2015] *** MARK *** in log01 and entry [Wed Feb 25 19:10:47 2015] *** MARK *** in log02 are within a minute of each other, so this indicates a global problem that lasts until entry [Wed Feb 25 19:15:05 2015] *** MARK *** in log01 and entry [Wed Feb 25 19:14:38 2015] *** MARK *** in log02. So I can conclude that from around 19:10 to 19:15 on Feb 25 something was wrong.

I'm looking for suggestions and tips on how to approach this problem, preferably by using UNIX utilities.

You can try something like this:

#!/bin/bash

# Extract "Mon DD hh:mm" from each log01 entry; sort -u lists each minute once.
awk '{print $2, $3, substr($4, 1, 5)}' log01 | sort -u |
while read -r n
do
    # Fixed-string match; the trailing ":" stops hh:mm from matching mm:ss.
    if grep -qF -- "$n:" log02
    then
        echo "Error on $n"
    fi
done
  • The command awk '{print $2, $3, substr($4, 1, 5)}' log01 | sort -u extracts the month, day, and time of day (hh:mm) of each entry in your log01 file, listing each minute once.
  • grep -qF -- "$n:" log02 searches log02 for an entry in that same minute and reports an error if one is found. Entries that fall on either side of a minute boundary (for example 19:10:59 and 19:11:01) are not matched.

I think the elegant way would be to use Perl to read all the files in and create an array of lists. The outer array would be indexed by the time rounded to the nearest minute, and you would push a 1 onto the list at a given time if you were reading from log01 and a 2 if you were reading from log02. Then at the end, you would iterate through the outer array looking for lists with length greater than one. A Perl tag might help there.
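The same bookkeeping can be sketched with awk's associative arrays instead of Perl. This is only a minimal sketch under stated assumptions, not the answerer's code: the function name shared_minutes is hypothetical, timestamps are truncated to the minute rather than rounded, and a minute is reported only when every file named on the command line has an entry in it.

```shell
#!/bin/bash

# shared_minutes FILE...   (hypothetical helper name)
# Prints every "Mon DD hh:mm" bucket in which all given files have an entry.
# Assumption: each line looks like "[Wed Feb 25 19:10:36 2015] *** MARK ***".
shared_minutes() {
    awk -v nfiles="$#" '
        {
            key = $2 " " $3 " " substr($4, 1, 5)   # truncate to the minute
            if (!((key, FILENAME) in seen)) {      # count each file once per minute
                seen[key, FILENAME] = 1
                count[key]++
            }
        }
        END {
            for (key in count)
                if (count[key] == nfiles)
                    print key
        }
    ' "$@" | sort    # lexical sort; chronological only within one month
}
```

It would be called as shared_minutes log01 log02 (or with all eight files). Because the buckets are fixed calendar minutes, two entries a few seconds apart but on opposite sides of a minute boundary land in different buckets, so this is a coarser test than "within about a minute".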

If you don't like Perl, you could put something clumsier together, like this.

Step 1: Choose a start time earlier than the earliest time you want.

Step 2: Parse each file, outputting one line per minute of input data. That line is either 0 or 1, depending on whether there is a problem in that minute or not. One line per minute ensures that the outputs line up, so a given row refers to the same minute in all 8 files.
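One possible way to produce such a per-minute column with awk is sketched below. It is an illustration under assumptions, not part of the original answer: the helper name minute_flags is hypothetical, and instead of an arbitrary start time it covers exactly one calendar day (00:00 to 23:59), given as the month and day strings that appear in the log.

```shell
#!/bin/bash

# minute_flags "Mon DD" FILE   (hypothetical helper name)
# Prints 1440 lines, one per minute of the given day (00:00 .. 23:59):
# 1 if FILE has an entry in that minute, otherwise 0.
# Assumption: each line looks like "[Wed Feb 25 19:10:36 2015] *** MARK ***".
minute_flags() {
    awk -v day="$1" '
        ($2 " " $3) == day {
            split($4, t, ":")
            hit[t[1] * 60 + t[2]] = 1      # minute of the day, 0..1439
        }
        END {
            for (m = 0; m < 1440; m++)
                print ((m in hit) ? 1 : 0)
        }
    ' "$2"
}
```

For example, minute_flags "Feb 25" log01 > file1 would write the column for the first sensor; row N+1 of every output then corresponds to minute N of that day.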

Step 3: Use paste to put all 8 output files together like this:

paste -d, file{1..8}

1,0,1,1,1,0,1,1
1,1,1,1,1,1,1,1
0,0,1,1,0,0,0,1
0,1,1,1,0,1,0,1
0,0,1,1,0,0,0,1
0,1,1,1,0,1,0,1
1,0,1,1,1,0,1,1
1,1,1,1,1,1,1,1

Step 4: Use awk to look for lines that add up to more than 1. (For a problem seen by every sensor, as the question describes, the sum has to equal the number of files, 8 here.)
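A minimal sketch of that last step follows. The helper name busy_rows and the threshold argument are assumptions made for illustration; the threshold lets you choose between "more than one sensor" and "every sensor".

```shell
#!/bin/bash

# busy_rows THRESHOLD [FILE...]   (hypothetical helper name)
# Reads comma-separated 0/1 rows (from FILE or stdin) and prints
# "row-number: row" for every row whose fields sum to at least THRESHOLD.
busy_rows() {
    local threshold=$1
    shift
    awk -F, -v min="$threshold" '
        {
            sum = 0
            for (i = 1; i <= NF; i++)
                sum += $i
            if (sum >= min)
                print NR ": " $0
        }
    ' "$@"
}
```

With the eight sample rows shown above, paste -d, file{1..8} | busy_rows 8 would print rows 2 and 8, the only ones where all eight columns are 1. The row number maps back to a minute once you know which minute the first row represents.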
