Unique count of a value in a zipped file based on other constraints on surrounding lines
I have a log file.
It has data like this:
Operation=ABC,
CustomerId=12,
..
..
..
Counters=qwe=1,wer=2,mbn=4,Hello=0,
----
Operation=CQW,
CustomerId=10,
Time=blah,
..
..
Counters=qwe=1,wer=2,mbn=4,Hello=0,jvnf=2,njfs=4
----
Operation=ABC,
CustomerId=12,
Metric=blah
..
..
Counters=qwe=1,wer=2,mbn=4,Hello=1, uisg=2,vieus=3
----
Operation=ABC,
CustomerId=12,
Metric=blah
..
..
Counters=qwe=1,wer=2,mbn=4,Hello:0, uisg=2,vieus=3
----
Now, I want to find all the unique CustomerIds where Operation=ABC and Hello=0 (in Counters).
All of this info is contained in .gz files in a directory.
So here is what I tried, just to count how many times "Hello=0" appears in the lines near Operation=ABC:
zgrep -A 20 "Operation=ABC" * | grep "Hello=0" | wc -l
This gave me the number of times that "Hello=0" was found for Operation=ABC (about 250).
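One caveat with this kind of count: `-A 20` takes the 20 lines after each match regardless of the `----` separators, so a "Hello=0" from the following record can be counted too. A minimal sketch of that effect, using plain `grep` on a made-up two-record sample (an assumption for illustration, not the real .gz data):

```shell
# Hypothetical sample: the ABC record has Hello=1, the CQW record has Hello=0.
sample='Operation=ABC,
CustomerId=12,
Counters=qwe=1,Hello=1,
----
Operation=CQW,
CustomerId=10,
Counters=qwe=1,Hello=0,
----'

# The -A window ignores the ---- separators, so it also covers the CQW record
# and its Hello=0 line is counted even though no ABC record has Hello=0.
window_hits=$(printf '%s\n' "$sample" | grep -A 20 "Operation=ABC" | grep -c "Hello=0")

echo "matches counted: $window_hits"
```

If records in the real logs are much shorter than 20 lines, the count of about 250 may therefore include some matches that belong to other operations.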
In order to get the unique customer IDs, I tried this:
zgrep -A 20 "Operation=ABC" * | grep "Hello=0" -B 10 | grep "CustomerId" | uniq -c
This gave me no results. What am I getting wrong here?
Actually, this works. I was just being impatient.
zgrep -A 20 "Operation=ABC" * | grep "Hello=0" -B 10 | grep "CustomerId" | uniq -c
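A note on the last stage of that pipeline: `uniq` only merges adjacent duplicate lines, so without a `sort` in front of it the same CustomerId can be listed more than once. A small sketch of the difference, using made-up CustomerId lines in place of the real grep output:

```shell
# Hypothetical input: CustomerId lines as they might come out of the grep chain.
ids=$(printf 'CustomerId=12,\nCustomerId=10,\nCustomerId=12,\n')

# uniq alone only merges adjacent duplicates, so 12 is kept twice.
unsorted_count=$(printf '%s\n' "$ids" | uniq | wc -l | tr -d ' ')

# Sorting first groups identical lines, so each id is kept once.
sorted_count=$(printf '%s\n' "$ids" | sort | uniq | wc -l | tr -d ' ')

echo "without sort: $unsorted_count, with sort: $sorted_count"
```

Applied to the command above, that would mean ending the pipeline with `sort | uniq -c` instead of `uniq -c`.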
You don't need that many grep and zgrep calls; this can be done in a single awk.
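awk -F'=' '
# A separator line ends the current record.
/^--/{
  # Print the record only if all three conditions matched.
  if(val==3){
    print value
  }
  val=value=""
  next   # do not store the separator as part of the next record
}
/Operation=ABC/{
  val++
}
# Count a CustomerId only the first time it is seen.
/CustomerId/{
  if(!a[$NF]++){
    val++
  }
}
/Hello=0/{
  val++
}
# Collect every line of the current record.
{
  value=(value?value ORS:"")$0
}
END{
  # Handle a final record that is not followed by a separator.
  if(val==3){
    print value
  }
}' <(gzip -dc input_file.gz)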
Output will be as follows (tested on your sample only):
Operation=ABC,
CustomerId=12,
..
..
..
Counters=qwe=1,wer=2,mbn=4,Hello=0,
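Since the question asks only for the unique CustomerIds, here is a variant sketch that prints just the IDs. It differs from the awk above in one respect: that script marks a CustomerId as seen before the other conditions are checked, so an ID that first appears in a non-matching record is skipped later, whereas this version decides at the end of each record. The function name `unique_customers`, the stricter `Hello=0` pattern, and the sample data (including CustomerId 7) are assumptions made for illustration.

```shell
# Hypothetical helper: reads decompressed log text on stdin and prints each
# CustomerId once if some record has both Operation=ABC and Hello=0.
unique_customers() {
  awk '
    # A separator line closes the current record.
    /^----/ { end_record(); next }
    /^Operation=ABC,/ { op = 1 }
    # Keep only the id itself, without the key and the trailing comma.
    /^CustomerId=/ { id = $0; sub(/^CustomerId=/, "", id); sub(/,.*/, "", id) }
    # Match Hello=0 only as a whole counter, not Hello=01 or Hello:0.
    /^Counters=/ && /(^Counters=|, *)Hello=0(,|$)/ { hello = 1 }
    # Handle a final record that has no trailing separator.
    END { end_record() }
    function end_record() {
      if (op && hello && id != "" && !seen[id]++) print id
      op = hello = 0; id = ""
    }
  '
}

# Made-up sample in the same shape as the log in the question.
sample='Operation=ABC,
CustomerId=12,
Counters=qwe=1,wer=2,mbn=4,Hello=0,
----
Operation=CQW,
CustomerId=10,
Counters=qwe=1,Hello=0,jvnf=2
----
Operation=ABC,
CustomerId=12,
Counters=qwe=1,Hello=1, uisg=2
----
Operation=ABC,
CustomerId=7,
Counters=Hello=0, uisg=2
----'

result=$(printf '%s\n' "$sample" | unique_customers)
printf '%s\n' "$result"
```

On the real files this could be fed with something like `gzip -dc *.gz | unique_customers`, and `| wc -l` appended to get the unique count.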