
AIX grep for an awk result

I have a text file named "hosts.tbl":

BILL RED
VAL YELLOW
STEVE YELLOW
TOM ORANGE
BILLY RED
VALERIE BLUE

I have a second file, named "details.tbl", which contains each of the names above multiple times (among various other details on each line). I need to count how many times each name appears in "details.tbl" and end up with something like this:

BILL RED 8
VAL YELLOW 16
STEVE YELLOW 9
TOM ORANGE 1
BILLY RED 2
VALERIE BLUE 30

As you can see, a normal "grep" for 'BILL' will match both "BILL" and "BILLY". The same goes for "VAL" and "VALERIE". However, within the "details.tbl" file, each occurrence of each name is followed by "-C". For example:

STEVE-C
STEVE-C
BILL-C
BILLY-C

I have tried:

awk {'print $1 " " $2 " "'} hosts.tbl|grep -c $1"-C" details.tbl
awk {'print $1 " " $2 " "'grep -c $1"-C" details.tbl} hosts.tbl

...and various other permutations of similar syntax, all dismal failures. Clearly, I am a neophyte when it comes to shell commands in particular, and UNIX in general. What am I missing here? I cannot find anything in the man pages about how to concatenate search criteria within grep, or how to pass only specific fields from awk along to grep.

Assuming the applicable portion of the details.tbl file looks like this:

BILL-C
VAL-C
STEVE-C
TOM-C
BILLY-C
VALERIE-C
BILL-C
VAL-C
STEVE-C
TOM-C
BILLY-C
VALERIE-C

The output should look like this:

BILL RED 2
VAL YELLOW 2
STEVE YELLOW 2
TOM ORANGE 2
BILLY RED 2
VALERIE BLUE 2

cat hosts.tbl

BILL RED
VAL YELLOW
STEVE YELLOW
TOM ORANGE
BILLY RED
VALERIE BLUE

cat details.tbl

BILL RED
VAL YELLOW
STEVE YELLOW
TOM ORANGE
BILLY RED
VALERIE BLUE
BILL RED
VAL YELLOW
STEVE YELLOW
TOM ORANGE
BILLY RED
VALERIE BLUE
BILL RED
VAL YELLOW
STEVE YELLOW
TOM ORANGE

With awk, we store each line of the first file as a key in the array a. Then, for each line of the second file that is present in a, we increment its count:

awk 'FILENAME == ARGV[1] {a[$0]=0; next}
     FILENAME == ARGV[2] && $0 in a {a[$0]+=1}
     END {for (i in a) {print i, a[i]}}' hosts.tbl details.tbl

Output

VALERIE BLUE 2
BILLY RED 2
BILL RED 3
VAL YELLOW 3
TOM ORANGE 3
STEVE YELLOW 3
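
The output above is not in hosts.tbl order because awk's `for (i in a)` visits keys in an unspecified order. A minimal sketch that keeps the original order, assuming the same whole-line matching as the command above; the array names `order` and `count`, the variable `ordered`, and the shortened sample data are illustrative only:

```shell
# Hypothetical sample data in a scratch directory
tmp=${TMPDIR:-/tmp}/ordered_demo.$$
mkdir -p "$tmp"
printf '%s\n' 'BILL RED' 'VAL YELLOW' 'BILLY RED' > "$tmp/hosts.tbl"
printf '%s\n' 'BILL RED' 'BILLY RED' 'BILL RED' 'VAL YELLOW' > "$tmp/details.tbl"

# First file: remember each line and the position it was read in.
# Second file: count lines that exactly match a remembered line.
# END: walk the positions 1..n so the output follows hosts.tbl order.
ordered=$(awk '
    NR == FNR   { order[++n] = $0; count[$0] = 0; next }
    $0 in count { count[$0]++ }
    END         { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
' "$tmp/hosts.tbl" "$tmp/details.tbl")

printf '%s\n' "$ordered"
rm -rf "$tmp"
```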

If you ignore the advice at https://unix.stackexchange.com/a/169765/57293, you can write a solution like this:

while read -r name lastname ; do
   # Quote the expansions so each value stays a single printf argument
   printf "%s %s %s\n" "${name}" "${lastname}" "$(grep -c "${name}-C" details.tbl)"
done < hosts.tbl
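
To see why the loop appends -C to the pattern, the following sketch compares the two counts on the four example lines from the question. The scratch directory and the variable names `plain` and `suffixed` are assumptions made for the demonstration:

```shell
# Hypothetical scratch copy of the example lines from the question
tmp=${TMPDIR:-/tmp}/grep_demo.$$
mkdir -p "$tmp"
printf '%s\n' 'STEVE-C' 'STEVE-C' 'BILL-C' 'BILLY-C' > "$tmp/details.tbl"

plain=$(grep -c 'BILL' "$tmp/details.tbl")       # matches BILL-C and BILLY-C
suffixed=$(grep -c 'BILL-C' "$tmp/details.tbl")  # matches BILL-C only

echo "BILL: $plain, BILL-C: $suffixed"
rm -rf "$tmp"
```

Note that `grep -c` counts matching lines, so a line that contains the same name twice is still counted once.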

When you use awk, you should process details.tbl first and count the lines. Processing two files differently in one awk script is explained at What is "NR==FNR" in awk?.
Since you want to ignore the -C, you can preprocess the input file with cut like this:

awk 'NR==FNR {a[$0]++;next} {
       for(i in a) {
         if ($1==i) {
           print $0, a[i]
         }
       }
     }' <(cut -d"-" -f1<details.tbl) hosts.tbl
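
The `<(...)` process substitution is not part of POSIX sh, so it may be unavailable depending on the shell in use. A hedged alternative is to pipe the cut output into awk and name standard input with `-` as the first file operand. The sample data and the variable `piped` below are illustrative, and the per-name loop is replaced by a direct `$1 in a` lookup:

```shell
# Hypothetical sample data in a scratch directory
tmp=${TMPDIR:-/tmp}/pipe_demo.$$
mkdir -p "$tmp"
printf '%s\n' 'BILL RED' 'VAL YELLOW' 'BILLY RED' > "$tmp/hosts.tbl"
printf '%s\n' 'BILL-C' 'VAL-C' 'BILLY-C' 'BILL-C' > "$tmp/details.tbl"

# "-" makes awk read the piped names as its first input file
piped=$(cut -d- -f1 "$tmp/details.tbl" | awk '
    NR == FNR { a[$0]++; next }     # stdin: count each stripped name
    $1 in a   { print $0, a[$1] }   # hosts.tbl: append the count
' - "$tmp/hosts.tbl")

printf '%s\n' "$piped"
rm -rf "$tmp"
```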

awk is smart enough that the preprocessing with cut is not needed; using both space and hyphen as field separators does the same job:

awk -F '[ -]' 'NR==FNR {a[$1]++; next} {
       for(i in a) {
         if ($1==i) {
           print $0, a[i]
         }
       }
     }' details.tbl hosts.tbl
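
The command above prints nothing for a name that never occurs in details.tbl. If a 0 is wanted instead, one possible variation (an assumption about the desired output, not part of the original answer) looks the name up directly and adds 0 to force a numeric result. The sample data and the variable `counts` are illustrative:

```shell
# Hypothetical sample data; TOM does not appear in details.tbl
tmp=${TMPDIR:-/tmp}/zero_demo.$$
mkdir -p "$tmp"
printf '%s\n' 'BILL RED' 'TOM ORANGE' 'BILLY RED' > "$tmp/hosts.tbl"
printf '%s\n' 'BILL-C' 'BILLY-C' 'BILL-C' > "$tmp/details.tbl"

counts=$(awk -F '[ -]' '
    NR == FNR { a[$1]++; next }   # details.tbl: count the name before "-C"
    { print $0, a[$1] + 0 }       # hosts.tbl: missing names yield 0
' "$tmp/details.tbl" "$tmp/hosts.tbl")

printf '%s\n' "$counts"
rm -rf "$tmp"
```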
