awk - if values match then print file1 and file2

I googled my problem extensively and tested different solutions, but none seems to work. I have even used the same command successfully before, but now I cannot get my desired output.

I have file1:

AAA;123456789A
BBB;123456789B
CCC;123456789C

And file2:

1;2;3;CCC;pippo
1;2;3;AAA;pippo
1;2;3;BBB;pippo
1;2;3;*;pippo

My desired output is this:

1;2;3;CCC;pippo;CCC;123456789C
1;2;3;AAA;pippo;AAA;123456789A
1;2;3;BBB;pippo;BBB;123456789B

I tried this command:

awk -F";" -v OFS=";" 'FNR == NR {a[$10]=$1; b[$20]=$2; next}($10 in a){ if(match(a[$10],$4)) print $0,a[$10],b[$20]}' file1 file2

But I get this output (only one entry, even with bigger files):

1;2;3;CCC;pippo;CCC;123456789C

What am I doing wrong? If it works for one entry, it should work for all the others. Why is this not happening? Also, why doesn't it work if I set a[$1]=$1?
Thank you for helping! If possible, could you explain the answer, so that next time I won't have to ask for help?

EDIT: Sorry, I did not mention (since I wanted to keep the example minimal) that in file2 some fields are just "*". I'd also like to add an "else" branch: if a line doesn't match, do something else.

awk to the rescue!

$ awk 'BEGIN{FS=OFS=";"} 
     NR==FNR{a[$1]=$0;next} 
            {print $0,a[$4]}' file1 file2

1;2;3;CCC;pippo;CCC;123456789C
1;2;3;AAA;pippo;AAA;123456789A
1;2;3;BBB;pippo;BBB;123456789B
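
As for why the command in the question prints only one line, a likely explanation is that file1 has just two fields, so $10 and $20 are empty on every line. Each assignment then overwrites the same entry a[""], and only the last line of file1 (CCC) survives. In file2, `$10 in a` is true for every line (the empty key exists), but match(a[$10], $4) only succeeds where $4 is CCC. A small sketch to check this, recreating the sample file1 in a temporary directory (the variable names `tmp` and `diag` are only for illustration):

```shell
#!/bin/sh
# Recreate the sample file1 from the question in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'AAA;123456789A' 'BBB;123456789B' 'CCC;123456789C' > "$tmp/file1"

# file1 has only 2 fields, so $10 is the empty string on every line
# and every assignment overwrites the single entry a[""].
diag=$(awk -F';' '
    { a[$10] = $1 }
    END {
        n = 0
        for (k in a) n++          # count the entries; k keeps the last key seen
        printf "entries=%d key=[%s] value=%s\n", n, k, a[k]
    }' "$tmp/file1")

echo "$diag"    # entries=1 key=[] value=CCC
rm -rf "$tmp"
```

Keying the array on $1 while reading file1 and looking it up with $4 while reading file2, as in the answer above, avoids the problem. Changing only the assignment to a[$1]=$1 would presumably not help as long as the lookup still uses $10.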

UPDATE: Based on the original input file, this only looks for an exact match. With the "*" lines from the edit, the command above would still print the unmatched lines, just with an empty trailing field. If you want to skip the entries where there is no match, you need to qualify the print block with $4 in a:

$ awk 'BEGIN{FS=OFS=";"} 
     NR==FNR{a[$1]=$0;next} 
     $4 in a{print $0,a[$4]}' file1 file2
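
For the "else" case mentioned in the edit, one possible sketch is to end the matching block with `next` and add a second block that handles every remaining line. The NO_MATCH marker is a made-up placeholder for whatever the fallback action should be, and the temporary files only recreate the sample data:

```shell
#!/bin/sh
# Recreate the sample inputs from the question in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'AAA;123456789A' 'BBB;123456789B' 'CCC;123456789C' > "$tmp/file1"
printf '%s\n' '1;2;3;CCC;pippo' '1;2;3;AAA;pippo' '1;2;3;BBB;pippo' '1;2;3;*;pippo' > "$tmp/file2"

out=$(awk 'BEGIN { FS = OFS = ";" }
     NR == FNR { a[$1] = $0; next }         # first file: index each line by its key
     $4 in a   { print $0, a[$4]; next }    # key found: append the file1 line
               { print $0, "NO_MATCH" }     # hypothetical fallback for everything else
' "$tmp/file1" "$tmp/file2")

echo "$out"
rm -rf "$tmp"
```

With the sample data this prints the three joined lines followed by `1;2;3;*;pippo;NO_MATCH`.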

join is made for this sort of thing:

$ join -t';' -1 4 -o1.{1..5} -o2.{1..2} <(sort -t';' -k4 file2) <(sort -t';' file1)

1;2;3;AAA;pippo;AAA;123456789A
1;2;3;BBB;pippo;BBB;123456789B
1;2;3;CCC;pippo;CCC;123456789C

The output is what you asked for except for the ordering of lines, which I assume isn't important. The -o options to join are needed because you want the full set of fields; you can try omitting them and you'll get the join field on the left a single time instead, which might also be fine.
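
To illustrate the variant without -o, here is a small sketch on the sample data. It uses temporary files instead of process substitution so it runs in plain sh, restricts each sort key to the join field, and sets LC_ALL=C so that sort and join agree on collation; those choices are assumptions of this sketch, not part of the command above.

```shell
#!/bin/sh
# Recreate the sample inputs from the question in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'AAA;123456789A' 'BBB;123456789B' 'CCC;123456789C' > "$tmp/file1"
printf '%s\n' '1;2;3;CCC;pippo' '1;2;3;AAA;pippo' '1;2;3;BBB;pippo' '1;2;3;*;pippo' > "$tmp/file2"

# join needs both inputs sorted on their join field.
LC_ALL=C sort -t';' -k4,4 "$tmp/file2" > "$tmp/file2.sorted"
LC_ALL=C sort -t';' -k1,1 "$tmp/file1" > "$tmp/file1.sorted"

# No -o: the join field is printed once, followed by the remaining
# fields of the first input and then those of the second input.
joined=$(LC_ALL=C join -t';' -1 4 "$tmp/file2.sorted" "$tmp/file1.sorted")

echo "$joined"
rm -rf "$tmp"
```

With the sample data this prints `AAA;1;2;3;pippo;123456789A` and the corresponding lines for BBB and CCC; the "*" line has no partner in file1 and is dropped.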
