How to compare the lines of two files by their ID columns?

I have two files of variable length (from 1 to 30 lines) containing data like this:

[File1] 
Time | Name | Name | ID1 | ID2 
10:50 | Volume | Xxx | 55 | 65 
12:50 | Kate | Uh | 35 | 62 
15:50 | Maria | Zzz | 38 | 67 
15:50 | Alex | Web | 38 | 5 
... 

[File2] 
Time | Name | Name | ID1 | ID2 
10:50 | Les | Xxx | 31 | 75 
15:50 | Alex | Web | 38 | 5 
... 

How can I compare the two files, [File1] and [File2], using only the ID1 and ID2 columns, so that every line of [File1] is checked against all lines of [File2]? If a line's ID pair exists in both files, it should be saved to [File3] marked with a * character. The other lines from [File1] should also be written to [File3].

Result:

[File3] 
Time | Name | Name | ID1 | ID2 
15:50 | Alex | Web | * 38 | 5 
10:50 | Volume | Xxx | 55 | 65 
12:50 | Kate | Uh | 35 | 62 
15:50 | Maria | Zzz | 38 | 67 

Using awk

awk  'BEGIN{t="Time | Name | Name | ID1 | ID2"}
FNR==1{next}
NR==FNR{a[$4 FS $5];next}
{ if ($4 FS $5 in a)
       {$4="*"$4;t=t RS $0}
  else{s=s==""?$0:s RS $0}
}
END{print t RS s}' FS=\| OFS=\| file2 file1

Time | Name | Name | ID1 | ID2
15:50 | Alex | Web |* 38 | 5
10:50 | Volume | Xxx | 55 | 65
12:50 | Kate | Uh | 35 | 62
15:50 | Maria | Zzz | 38 | 67

Explanation

BEGIN{t="Time | Name | Name | ID1 | ID2"}   # set the header line
FNR==1{next}                                # skip the header of each file; FNR is the record number within the current file
NR==FNR{a[$4 FS $5];next}                   # first file on the command line (file2): store $4 and $5 as a key in associative array a
{ if ($4 FS $5 in a)                        # second file (file1): is this ID pair also present in file2?
{$4="*"$4;t=t RS $0}                        # if found, prefix $4 with "*" and append the line to var t
else{s=s==""?$0:s RS $0}                    # if not found, append the line to var s
}
END{print t RS s}                           # print the header and matched lines (t), then the remaining lines (s)
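
Because FS is a bare "|", the keys built from $4 and $5 include the surrounding spaces, so the match above only succeeds when both files are padded identically (for example, no stray trailing blanks after ID2 in just one of them). The following is a minimal sketch of a whitespace-tolerant variant, not a replacement for the answer above. It assumes the files are named file1 and file2, that both are non-empty and start with the same header line, and that the result should go to file3; the helper name key and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sample data copied from the question (file names are assumptions).
cat > file1 <<'EOF'
Time | Name | Name | ID1 | ID2
10:50 | Volume | Xxx | 55 | 65
12:50 | Kate | Uh | 35 | 62
15:50 | Maria | Zzz | 38 | 67
15:50 | Alex | Web | 38 | 5
EOF
cat > file2 <<'EOF'
Time | Name | Name | ID1 | ID2
10:50 | Les | Xxx | 31 | 75
15:50 | Alex | Web | 38 | 5
EOF

awk '
BEGIN { FS = OFS = "|" }
# Hypothetical helper: build an "ID1|ID2" key with blanks and tabs removed.
# a and b are local variables (awk idiom: extra parameters).
function key(   a, b) {
    a = $4; b = $5
    gsub(/[ \t]/, "", a)
    gsub(/[ \t]/, "", b)
    return a "|" b
}
FNR == 1  { if (NR == 1) header = $0; next }   # keep one header, skip both
NR == FNR { seen[key()]; next }                # first file (file2): remember ID pairs
{                                              # second file (file1)
    if (key() in seen) { $4 = " *" $4; matched = matched $0 "\n" }
    else               { rest = rest $0 "\n" }
}
END { printf "%s\n%s%s", header, matched, rest }  # header, matches, then the rest
' file2 file1 > file3

cat file3
```

With the sample data, the matched line in file3 reads "15:50 | Alex | Web | * 38 | 5", followed by the three unmatched lines of file1 in their original order. Accumulating each group with its own trailing newline also avoids the empty line that t RS s would print when no unmatched lines exist.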
