Compare two files of different columns and print different columns
I want to compare the 2nd column of file2 with the 1st column of file1. If they are equal, I want to add the 2nd column of file1 to file2, as shown in output.txt.
file2
chr5 ENST00000514151 utr5 0 +
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
file1
ENST00000512281 a
ENST00000504031 b
ENST00000776348 c
output.txt
chr5 a ENST00000512281 utr5 0 +
chr5 a ENST00000512281 utr5 0 +
chr5 a ENST00000512281 utr5 0 +
I was able to compare the files with:
awk 'NR==FNR{a[$1];next}$2 in a{print}' file1 file2
This gives the output below:
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
But I do not know how to add the 2nd column of file1 to the output.
You can store the value of $2 in file1 into the array using a[$1]=$2. So you could try:
awk '
NR==FNR{
a[$1]=$2 ; next }
$2 in a {
$1=$1 FS a[$2]
print
}' file1 file2
Output:
chr5 a ENST00000512281 utr5 0 +
chr5 a ENST00000512281 utr5 0 +
chr5 a ENST00000512281 utr5 0 +
Explanation:
The value from file1 is inserted after $1 in file2 using $1=$1 FS a[$2], where FS is the default field separator, which is a space. The assignment also rebuilds the record, so that it can be printed by print later.
print can be simplified to a 1 if desired, like $2 in a { $1=$1 FS a[$2] }1. Note that the trailing 1 prints every line of file2, including lines whose second column is not found in file1, so the two forms give the same output only when every line of file2 has a match.
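The difference between print inside the action and a trailing 1 is easy to check on the sample data from the question. The following is a minimal sketch, not part of the original answer; the scratch directory and the variable names with_print and with_one are only there for illustration.

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

cat > file1 <<'EOF'
ENST00000512281 a
ENST00000504031 b
ENST00000776348 c
EOF

cat > file2 <<'EOF'
chr5 ENST00000514151 utr5 0 +
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
EOF

# print inside the action: only lines of file2 whose ID is in file1 are printed.
with_print=$(awk 'NR==FNR { a[$1]=$2; next }
                  $2 in a { $1=$1 FS a[$2]; print }' file1 file2)

# Trailing 1: every line of file2 is printed, matched or not.
with_one=$(awk 'NR==FNR { a[$1]=$2; next }
                $2 in a { $1=$1 FS a[$2] } 1' file1 file2)

printf '%s\n' "$with_print" | wc -l   # 3 lines
printf '%s\n' "$with_one"   | wc -l   # 4 lines, including the ENST00000514151 line
```

If only matching lines are wanted, as in output.txt, keeping print inside the action is the safer choice.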
Note that assigning to $1 rebuilds each record of file2, and thus any sequences of spaces or tabs will be truncated to a single space in the output. To keep the original formatting in file2, one could use the split() function in GNU Awk version 4, whose optional fourth argument stores the separators found between the fields.
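If GNU Awk is not available, the original spacing can also be kept without split(). The sketch below is a different technique from the one named above: it uses the POSIX match() and substr() functions to insert the value right after the first field and never assigns to a field, so the record is not rebuilt. The tab-separated file2 and the variable name preserved are assumptions made for the demonstration.

```shell
# Work in a scratch directory so no existing files are overwritten.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

printf 'ENST00000512281 a\n' > file1
# Hypothetical tab-separated variant of file2.
printf 'chr5\tENST00000512281\tutr5\t0\t+\n' > file2

preserved=$(awk '
NR==FNR { a[$1]=$2; next }            # first file: map ID -> value
$2 in a {
    # Match any leading blanks plus the first field; RLENGTH is its length.
    match($0, /^[ \t]*[^ \t]+/)
    # Print the first field, a space, the value, then the untouched remainder.
    print substr($0, 1, RLENGTH) " " a[$2] substr($0, RLENGTH + 1)
}' file1 file2)

printf '%s\n' "$preserved"
```

The inserted value is separated from the first field by a single space; every separator that was already in the line, here the tabs, is left as it was.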