Compare two files on different columns and print different columns

Question

I want to compare the 2nd column of file2 with the 1st column of file1. If they are equal, I want to add the 2nd column of file1 to file2, as shown in output.txt.

file2

chr5    ENST00000514151    utr5    0    +
chr5    ENST00000512281    utr5    0    +
chr5    ENST00000512281    utr5    0    +
chr5    ENST00000512281    utr5    0    +

file1

ENST00000512281    a
ENST00000504031    b
ENST00000776348    c

output.txt

chr5    a    ENST00000512281    utr5    0    +
chr5    a    ENST00000512281    utr5    0    +
chr5    a    ENST00000512281    utr5    0    +

I was able to compare the files with:

awk 'NR==FNR{a[$1];next}$2 in a{print}' file1 file2

This gives the following output:

chr5    ENST00000512281    utr5    0    +
chr5    ENST00000512281    utr5    0    +
chr5    ENST00000512281    utr5    0    +

But I do not know how to add the 2nd column of file1 to the output.

Answer 1

You can store the value of $2 from file1 in the array using a[$1]=$2. So you could try:

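awk '
   NR==FNR { a[$1]=$2; next }   # file1: map ID (column 1) to its label (column 2)
   $2 in a {                    # file2: keep only rows whose column 2 is a known ID
     $1=$1 FS a[$2]             # append the label after column 1 (rebuilds the record)
     print
   }' file1 file2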

Output:

chr5 a ENST00000512281 utr5 0 +
chr5 a ENST00000512281 utr5 0 +
chr5 a ENST00000512281 utr5 0 +

Explanation:

This modifies $1 in file2 using $1=$1 FS a[$2], where FS is the default field separator (a single space). Assigning to a field makes awk rebuild the record, so the print that follows emits the modified line.
The print can be shortened to a 1 if desired, as in $2 in a { $1=$1 FS a[$2] }1. Note that the trailing 1 is a separate, always-true pattern, so this variant also prints the file2 lines that have no match in file1, unchanged.
Note that this rebuilds the record in file2, so any sequence of spaces or tabs is collapsed to a single space in the output. To keep the original formatting of file2, one could use the split() function in GNU Awk version 4.
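As a portable alternative to the gawk split() approach, here is a minimal sketch that preserves the original separators using only match() and substr(), which are available in POSIX awk. It is not the method from the answer above. It assumes file2 lines have no leading whitespace, and the temporary directory and sample data are hypothetical stand-ins for the question's files (tab-separated here).

```shell
# Hypothetical sample data mirroring the question; file2 is tab-separated.
tmp=$(mktemp -d)
printf 'ENST00000512281\ta\nENST00000504031\tb\n' > "$tmp/file1"
printf 'chr5\tENST00000514151\tutr5\t0\t+\nchr5\tENST00000512281\tutr5\t0\t+\n' > "$tmp/file2"

result=$(awk '
  NR==FNR { a[$1]=$2; next }     # file1: map ID to label
  $2 in a {
    # Match the first field plus the whitespace run that follows it.
    match($0, /^[^ \t]+[ \t]+/)
    head = substr($0, 1, RLENGTH)
    sep = head
    sub(/^[^ \t]+/, "", sep)     # keep only the original separator run
    # Splice the label in without assigning to a field,
    # so awk never rebuilds the record.
    print head a[$2] sep substr($0, RLENGTH + 1)
  }' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Because no field is assigned, the tabs in the matching file2 line come through untouched, and the label reuses the same separator that followed the first column.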
