I have two files and I would like to match columns 2 and 3 from file1 with columns 2 and 3 from file2. If the pattern is found, I would like to output the whole line from file2 with column 1 from file1 appended at the end.
I have the following two file types (file2 has many tab-separated columns, but its columns 2 and 3 can match columns 2 and 3 from file1):
file1
name1 1 12343442
name2 2 32434242
name3 3 982793749
file2
a 1 12343442 text1 text2 text3 value0 value2
a 1 12343442 text1 text2 text3 value2 value3
a 1 12348888 text1 text2 text3 value0 value2
b 3 982793749 text1 text4 text3 value1 value11
b 2 982793749 text1 text4 text3 value1 value11
desired output
a 1 12343442 text1 text2 text3 value0 value2 name1
a 1 12343442 text1 text2 text3 value2 value3 name1
b 3 982793749 text1 text4 text3 value1 value11 name3
I have tried doing this using awk. Something like:
awk 'BEGIN { FS = "\t" } NR==FNR { a[$1]=$2 FS $3; next} ('$2 FS $3' in a) {print $0, a[$1]}' file1 file2
But it doesn't work. Even if I try to match only the third column, it does not work. The files are pretty big (>500 MB), so I would like to read them only once. Any ideas? Thank you!
This one-liner should work:
awk -F'\t' -v OFS='\t' 'NR==FNR{a[$2FS$3]=$1;next}$2FS$3 in a{print $0,a[$2FS$3]}' file1 file2
In your code, a[$1]=$2 FS $3;next mixes up the key and the value: you want $2 FS $3 to be the key and $1 to be the value. ('$2 FS $3' in a) is not correct either. Remove the single quotes, because they end the shell's quoting of the awk program, so the shell, not awk, expands $2 and $3.
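For readers who want to try the fix on a small input, here is an annotated sketch of the same approach. It assumes a POSIX awk and a shell that provides mktemp. The array name `name`, the temporary directory, and the shortened sample rows are illustrative choices, not part of the original post.

```shell
# Build small tab-separated sample files (rows shortened from the question).
tmpdir=$(mktemp -d)
printf 'name1\t1\t12343442\nname2\t2\t32434242\nname3\t3\t982793749\n' > "$tmpdir/file1"
printf 'a\t1\t12343442\ttext1\nb\t3\t982793749\ttext1\nb\t2\t982793749\ttext1\n' > "$tmpdir/file2"

awk -F'\t' -v OFS='\t' '
    # First file only (NR == FNR): store column 1 under the key "col2<TAB>col3".
    NR == FNR { name[$2 FS $3] = $1; next }
    # Second file: when the same key exists, print the line plus the stored name.
    ($2 FS $3) in name { print $0, name[$2 FS $3] }
' "$tmpdir/file1" "$tmpdir/file2" > "$tmpdir/out"

cat "$tmpdir/out"
```

With this input, the first two rows of file2 should be printed with name1 and name3 appended, and the row with the pair (2, 982793749) should be dropped. Each file is read once: file1 is held in memory as the lookup table and file2 is streamed line by line, so the smaller file should be given first.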