Extracting differences from logs

Question

Apologies if this is a simple shell programming question, but I could not find a way to do exactly what I want.

I have 2 logs.

log1

Cust   Subsys    StartDate   EndDate        col1      col2      col3
1001   10000      20150501    20150731      6.1700    0.0000   0.0000   -- this line is identical in both logs 
1001   12000      20150401    20150630      0.0000    0.0000   0.0000   -- this line is missing from log2
1003   13000      20150310    20150630      2.4800    0.0000   0.0000   -- the value in log1.col1 is different from the one in log2.col1

and log2

Cust   Subsys    StartDate    EndDate        col1      col2      col3
1001   10000      20150501    20150731      6.1700    0.0000   0.0000   -- this line is identical in both logs 
1003   13000      20150310    20150630      9.1800    0.0000   0.0000   -- the value in log1.col1 is different from the one in log2.col1
7000   7777       20150406    20150413      4.3300    0.0000   0.0000   -- this line is missing from log1

I want to generate 3 reports from these logs:

lines found in log1 but not in log2
lines found in log2 but not in log1
lines that match on the first 4 columns in log1 and log2 but have different values in col1, col2 or col3

I sorted both logs on all columns:

cat log1 |sort -n -k1,1 -k2,2r -k3,3 -k4,4 -k5,5 -k6,6 -k7,7 > log1.sorted
cat log2 |sort -n -k1,1 -k2,2r -k3,3 -k4,4 -k5,5 -k6,6 -k7,7 > log2.sorted

Then I tried to use comm to generate the first 2 reports:

comm -13  log1.sorted  log2.sorted  > unique2.log
comm -23  log1.sorted  log2.sorted  > unique1.log

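I then noticed that unique1.log contains lines that can be found in both log1 and log2 (each of my logs has more than 20,000 lines). Isn't comm meant to extract the lines that are missing from one of the logs? Does it only work when the line numbers are the same? (A line that ended up in unique1.log is at line 188 in log1 and at line 207 in log2.)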

How can I extract the data for the third report, i.e. only the lines that have different values in col1, col2 or col3?

Thanks

Answer 1

Try using it this way:

comm -23 sort_file1 sort_file2 > unique_in_file1
comm -13 sort_file1 sort_file2 > unique_in_file2
comm -12 sort_file1 sort_file2 > common_entries
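
A likely reason comm reports lines that exist in both logs: comm expects its inputs to be sorted as plain text in the collating order of the current locale, while the question sorts numerically with a reversed key (sort -n ... -k2,2r), which produces a different order. The following is only a minimal sketch under assumptions: the file names demo_log1 and demo_log2 and the output names are made up for illustration, the sample rows are a subset of the question's data without the trailing "--" annotations, and whole-line comparison is assumed to be acceptable.

```shell
# Two small sample logs (hypothetical names, rows taken from the question).
cat > demo_log1 <<'EOF'
1001   10000      20150501    20150731      6.1700    0.0000   0.0000
1001   12000      20150401    20150630      0.0000    0.0000   0.0000
1003   13000      20150310    20150630      2.4800    0.0000   0.0000
EOF

cat > demo_log2 <<'EOF'
1001   10000      20150501    20150731      6.1700    0.0000   0.0000
1003   13000      20150310    20150630      9.1800    0.0000   0.0000
7000   7777       20150406    20150413      4.3300    0.0000   0.0000
EOF

# Use one fixed collation for both sort and comm, and sort whole lines as text.
export LC_ALL=C
sort demo_log1 > demo_log1.sorted
sort demo_log2 > demo_log2.sorted

# Report 1: only in the first log. Report 2: only in the second log.
comm -23 demo_log1.sorted demo_log2.sorted > unique1.log
comm -13 demo_log1.sorted demo_log2.sorted > unique2.log
# Lines that are byte-for-byte identical in both logs.
comm -12 demo_log1.sorted demo_log2.sorted > common.log
```

Note that comm compares whole lines, so a row whose col1 value changed shows up in both unique1.log and unique2.log, and any difference in spacing (the two header lines in the question are spaced differently) also counts as a difference.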

If this does not work for you, try looking at the diff command.
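
For the third report (same first 4 columns, different col1, col2 or col3), neither comm nor diff compares by key. One possible approach, not something the answer above proposes, is an awk pass that remembers the value columns of the first log under a four-column key and compares them while reading the second log. This is a sketch under assumptions: the file names demo_log1, demo_log2 and changed.log are hypothetical, columns are whitespace-separated, each key occurs at most once per log, and the first log is not empty.

```shell
# Two small sample logs (hypothetical names, rows taken from the question).
cat > demo_log1 <<'EOF'
1001   10000      20150501    20150731      6.1700    0.0000   0.0000
1001   12000      20150401    20150630      0.0000    0.0000   0.0000
1003   13000      20150310    20150630      2.4800    0.0000   0.0000
EOF

cat > demo_log2 <<'EOF'
1001   10000      20150501    20150731      6.1700    0.0000   0.0000
1003   13000      20150310    20150630      9.1800    0.0000   0.0000
7000   7777       20150406    20150413      4.3300    0.0000   0.0000
EOF

awk '
    # For every line: build the key (columns 1-4) and the values (columns 5-7).
    {
        key = $1 SUBSEP $2 SUBSEP $3 SUBSEP $4
        val = $5 " " $6 " " $7
    }
    # First file: remember the values for this key.
    NR == FNR { seen[key] = val; next }
    # Second file: key exists in the first file but the values differ.
    (key in seen) && seen[key] != val {
        print $1, $2, $3, $4, "log1:", seen[key], "log2:", val
    }
' demo_log1 demo_log2 > changed.log
```

Because only the first 7 fields are read, trailing text after col3 is ignored, and the values are compared as strings, so 0.0 and 0.0000 would count as different.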
