
Compare two files with different numbers of columns and print the matching rows to a new file

I have two files with more than 10000 rows:

File1 has 1 col      File2 has 4 col     
23                   23 88 90 0
34                   43 74 58 5
43                   54 87 52 3
54                   73 52 35 4 
.                    .
.                    .

I want to compare each value in File1 with the first column of File2. If the value exists there, print it along with the other three values from File2. In this example the output will be:

 23 88 90 0
 43 74 58 5
 54 87 52 3
 .
 .

I have written the following script, but it takes too long to run.

s1=1; s2=$(wc -l < File1.txt)
while [ $s1 -le $s2 ]
do n=$(awk 'NR=="$s1" {print $1}' File1.txt)
   p1=1; p2=$(wc -l < File2.txt)
   while [ $p1 -le $p2 ]
   do awk '{if ($1==$n) printf ("%s %s %s %s\n", $1, $2, $3, $4);}'> ofile.txt
   (( p1++ ))
   done
(( s1++ ))
done

Is there a shorter or easier way to do it?

You can do it concisely with awk:

awk 'FNR==NR{found[$1]++; next} $1 in found' file1 file2

Test

>>> cat file1
23
34
43
54

>>> cat file2
23 88 90 0
43 74 58 5
54 87 52 3
73 52 35 4

>>> awk 'FNR==NR{found[$1]++; next} $1 in found' file1 file2
23 88 90 0
43 74 58 5
54 87 52 3

What does it do?

  • FNR==NR checks whether FNR, the record number within the current file, equals NR, the total number of records read so far. The two are equal only while awk reads the first file, file1, because FNR is reset to 1 each time awk starts a new file while NR keeps counting.

    • {found[$1]++; next} If the check is true, awk stores $1, the first column of file1, as an index in the associative array found; next then skips the remaining rules and moves on to the next line.
  • $1 in found This check is reached only for the second file, file2. If the column 1 value, $1, is an index in the associative array found, awk prints the entire line (the print is not written out because it is the default action).
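
The steps above can be sketched as a self-contained shell snippet with the same awk program spread over several commented lines. The temporary directory and the variable names tmpdir and result are assumptions made for illustration; they are not part of the original answer.

```shell
#!/bin/sh
# Illustrative sketch: tmpdir and result are hypothetical names.
tmpdir=$(mktemp -d)

# Sample data taken from the question.
printf '%s\n' 23 34 43 54 > "$tmpdir/file1"
printf '%s\n' '23 88 90 0' '43 74 58 5' '54 87 52 3' '73 52 35 4' > "$tmpdir/file2"

result=$(awk '
    # First file only: FNR (per-file record number) equals NR (overall record number).
    FNR == NR {
        found[$1]++   # remember the key read from file1
        next          # skip the remaining rules for this line
    }
    # Second file: a pattern with no action prints the line when it matches.
    $1 in found
' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the sample data this prints the three rows whose first column appears in file1. Each file is read only once, which is why this tends to be much faster than re-running awk inside nested shell loops.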
