
Compare two differently ordered columns from two different files in Linux

I have:

cat file1.txt
Id     val      len   val2
256171 343143 561565 55165
448188 565165 451554 68168
233143 234123 262126 62124
777615 886188 874164 68148
667176 233143 634163 45134

cat file2.txt
Val2    Id
251671 343143 
767167 342134 
233143 234123 
667176 233143 

I want to compare file1 and file2 but the order of the columns is different in each file.

I saw a lot of answers using awk when the order of the columns between the two files is identical, but how do I specify which columns to compare explicitly?

In this case I need to compare column 1 in file1 with column 2 in file2, and column 4 in file1 with column 1 in file2.

My goal is to output all lines that have the same value in both columns that are being compared.

You can join the two files line by line with paste and then use awk to compare the columns you want on each joined line. Like below:

# match values of Id columns
paste file1.txt file2.txt | awk  -F" " 'NR>1{print ($1==$6?$1:_)}'

# output (the empty lines printed for non-matching rows are omitted)
233143

# match values of val2 columns
paste file1.txt file2.txt | awk  -F" " 'NR>1{print ($4==$5?$4:_)}'

# output (the empty lines printed for non-matching rows are omitted)
68148

# where file1.txt and file2.txt are (note that this file2.txt differs from the one in the question):
cat file1.txt
Id     val      len   val2
256171 343143 561565 55165
448188 565165 451554 68168
233143 234123 262126 62124
777615 886188 874164 68148
667176 233143 634163 45134

cat file2.txt
Val2    Id
251671 343143
767167 342134
233143 233143
68148  233143
667176 45134

Description:

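  • NR>1 skips the first line (the column headers)
  • -F" " sets the field separator to a single space; this is awk's default and splits on any run of spaces or tabs, so the tab that paste inserts between the two files is handled as well
  • {print ($1==$6?$1:_)} -> prints the value when columns 1 and 6 are equal (the Id columns after applying paste to the files); otherwise it prints the unset variable _, which is an empty line
  • {print ($4==$5?$4:_)} -> prints the value when columns 4 and 5 are equal (the val2 columns after applying paste to the files); otherwise it prints an empty line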
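
If the empty lines are unwanted, the comparison can be moved into the awk pattern so that only matching rows print anything. The sketch below is one way to do that, not the only one. It recreates the sample files from this answer in a scratch directory (the workdir variable and the variable names holding the results are just illustrative). The last command is an assumption about a different reading of the question: if matching rows do not have to sit on the same line number in both files, paste is not enough, and the usual two-file awk idiom (NR==FNR) can look up values across the whole file instead.

```shell
# Scratch directory so no existing files are overwritten (illustrative name)
workdir=$(mktemp -d)

# Recreate the sample files used in this answer
cat > "$workdir/file1.txt" <<'EOF'
Id     val      len   val2
256171 343143 561565 55165
448188 565165 451554 68168
233143 234123 262126 62124
777615 886188 874164 68148
667176 233143 634163 45134
EOF

cat > "$workdir/file2.txt" <<'EOF'
Val2    Id
251671 343143
767167 342134
233143 233143
68148  233143
667176 45134
EOF

# Same-row comparison: the condition is the awk pattern, so rows that
# do not match produce no output at all (no empty lines)
id_matches=$(paste "$workdir/file1.txt" "$workdir/file2.txt" |
    awk 'NR>1 && $1==$6 {print $1}')
val2_matches=$(paste "$workdir/file1.txt" "$workdir/file2.txt" |
    awk 'NR>1 && $4==$5 {print $4}')

# Assumption: rows need not line up. First pass (NR==FNR) stores every Id
# from column 2 of file2; second pass prints the file1 rows whose
# column 1 is one of those Ids. FNR>1 skips each file's header.
anywhere=$(awk 'NR==FNR { if (FNR>1) ids[$2]; next }
                FNR>1 && ($1 in ids)' "$workdir/file2.txt" "$workdir/file1.txt")

printf 'Id matches (same row):   %s\n' "$id_matches"
printf 'val2 matches (same row): %s\n' "$val2_matches"
printf 'file1 rows whose Id occurs anywhere in file2:\n%s\n' "$anywhere"
```

With the sample data above, the same-row commands give 233143 and 68148, and the lookup prints the full file1 row starting with 233143. To compare other columns, change the field numbers: after paste, the fields of file2 are numbered after the four fields of file1, so file2's column 1 is $5 and its column 2 is $6.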
