
Grep total amount of specific elements based on date

Is there a way in Linux to filter multiple files containing a bunch of data in one command, without writing a script?

For this example, I want to know how many males appear on each date. A further complication is that one date (January 3rd) appears in 2 separate files:

file1

Jan  1 john male=yes
Jan  1 james male=yes
Jan  2 kate male=no 
Jan  3 jonathan male=yes

file2

Jan  3 alice male=no
Jan  4 john male=yes 
Jan  4 jonathan male=yes
Jan  4 alice male=no

I want the total number of males for each date across all files. If there are no males for a specific date, that date should produce no output.

Jan  1 2 
Jan  3 1
Jan  4 2

The only way I can think of is to count the males for one specific date at a time, but this does not scale: in real-world cases there could be many more files, and manually entering every date would be a waste of time. Any help would be appreciated, thank you!

localhost:~# cat file1 file2 | grep "male=yes" | grep "Jan  1" | wc -l
2

grep -h 'male=yes' file? | \
    cut -c-6 | \
    awk '{c[$0] += 1} END {for(i in c){printf "%6s %4d\n", i, c[i]}}'

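grep prints the lines containing male=yes (-h suppresses the file name prefix that grep would otherwise add when given several files), cut removes everything but the first 6 characters (the date), and awk counts the occurrences of each date and, at the end, prints every date with its counter. Note that the order in which awk's for (i in c) loop visits the dates is unspecified, so the lines may not come out in date order.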

Given your files, the output will be:

Jan  1    2
Jan  3    1
Jan  4    2
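
As a variation on the pipeline above, the filtering and counting can probably be folded into a single awk call, with sort putting the dates in order afterwards. This is only a sketch under a few assumptions: the date always occupies the first 6 characters of a line (the same assumption cut -c-6 makes), the scratch directory and the result variable are names made up for this illustration, and your sort supports the -M (month name) option, which GNU and BSD sort do but POSIX does not require.

```shell
# Recreate the sample files from the question in a scratch directory
# (workdir is a hypothetical name used only for this illustration).
workdir=$(mktemp -d)
printf '%s\n' 'Jan  1 john male=yes' 'Jan  1 james male=yes' \
              'Jan  2 kate male=no' 'Jan  3 jonathan male=yes' > "$workdir/file1"
printf '%s\n' 'Jan  3 alice male=no' 'Jan  4 john male=yes' \
              'Jan  4 jonathan male=yes' 'Jan  4 alice male=no' > "$workdir/file2"

# awk keeps one counter per date (first 6 characters) for lines with male=yes.
# sort then orders by month name (-M, a GNU/BSD extension) and by day number,
# because awk's for-in loop returns the dates in no particular order.
result=$(awk '/male=yes/ { c[substr($0, 1, 6)]++ }
              END { for (d in c) printf "%s %d\n", d, c[d] }' \
             "$workdir/file1" "$workdir/file2" |
         LC_ALL=C sort -k1,1M -k2,2n)

printf '%s\n' "$result"
rm -rf "$workdir"
```

With the sample files this prints one line per date that has at least one male, in the format the question asked for. Replace the two file arguments with a glob such as file? to cover more files.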
