awk: compare 7 files and print matches and differences based on fields

I need to compare 7 files, Ref.txt and Jan.txt through Jun.txt, and report both matches and non-matches. For each line of Ref.txt (the master dump), I want to look up its second field among the first fields of each of Jan.txt to Jun.txt. If a match is found, print all the fields of the Ref.txt line followed by the entire matching line from that month's file. If no match is found in a given month's file, print "Notfound" in its place.

Ref.txt

abc 10  xxyyzz
bdc 20  xxyyzz
edf 30  xxyyzz
ghi 40  xxyyzz
ofg 50  xxyyzz
mgf 60  xxyyzz

Jan.txt

10  Jan 100
30  Jan 300
50  Jan 500

Feb.txt

10  Feb 200
20  Feb 400
40  Feb 800
60  Feb 1200

Mar.txt

20  Mar 600
50  Mar 1500

Apr.txt

10  Apr 100
30  Apr 300
50  Apr 500

May.txt

10  May 200
20  May 400
40  May 800
60  May 1200

Jun.txt

20  Jun 600
50  Jun 1500

Desired Output:

Ref.txt Ref.txt Ref.txt Jan.txt Jan.txt Jan.txt Feb.txt Feb.txt Feb.txt Mar.txt Mar.txt Mar.txt Apr.txt Apr.txt Apr.txt May.txt May.txt May.txt Jun.txt Jun.txt Jun.txt
abc 10  xxyyzz  10  Jan 100 10  Feb 200 Notfound    Notfound    Notfound    10  Apr 100 10  May 200 Notfound    Notfound    Notfound
bdc 20  xxyyzz  Notfound    Notfound    Notfound    20  Feb 400 20  Mar 600 Notfound    Notfound    Notfound    20  May 400 20  Jun 600
edf 30  xxyyzz  30  Jan 300 Notfound    Notfound    Notfound    Notfound    Notfound    Notfound    30  Apr 300 Notfound    Notfound    Notfound    Notfound    Notfound    Notfound
ghi 40  xxyyzz  Notfound    Notfound    Notfound    40  Feb 800 Notfound    Notfound    Notfound    Notfound    Notfound    Notfound    40  May 800 Notfound    Notfound    Notfound
ofg 50  xxyyzz  50  Jan 500 Notfound    Notfound    Notfound    50  Mar 1500    50  Apr 500 Notfound    Notfound    Notfound    50  Jun 1500
mgf 60  xxyyzz  Notfound    Notfound    Notfound    60  Feb 1200    Notfound    Notfound    Notfound    Notfound    Notfound    Notfound    60  May 1200    Notfound    Notfound    Notfound

Thanks in advance for your reply

Here's a gift; please ask about anything you don't understand.

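awk '
    # At the first line of every file, print its name three times (one per column).
    # For every file after the first, also remember the name in command-line order.
    FNR == 1 {
        printf "%s %s %s\t", FILENAME, FILENAME, FILENAME
        if (NR > FNR) file[++num_files] = FILENAME
    }
    # First file (Ref.txt): store the lookup key (field 2) and the whole line.
    NR == FNR {
        id[NR] = $2
        ref[NR] = $0
        num_ids++
        next
    }
    # Monthly files: index each line by (file name, field 1).
    { value[FILENAME,$1] = $0 }
    END {
        print ""    # terminate the header line
        for (row=1; row<=num_ids; row++) {
            printf "%s\t", ref[row]
            for (f=1; f<=num_files; f++) {
                # Same composite key that value[FILENAME,$1] built above.
                key = file[f] SUBSEP id[row]
                printf "%s\t", (key in value ? value[key] : "Notfound")
            }
            print ""
        }
    }
' {Ref,Jan,Feb,Mar,Apr,May,Jun}.txt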

Output:

Ref.txt Ref.txt Ref.txt Jan.txt Jan.txt Jan.txt Feb.txt Feb.txt Feb.txt Mar.txt Mar.txt Mar.txt Apr.txt Apr.txt Apr.txt May.txt May.txt May.txt Jun.txt Jun.txt Jun.txt
abc 10  xxyyzz  10  Jan 100 10  Feb 200 Notfound    10  Apr 100 10  May 200 Notfound    
bdc 20  xxyyzz  Notfound    20  Feb 400 20  Mar 600 Notfound    20  May 400 20  Jun 600 
edf 30  xxyyzz  30  Jan 300 Notfound    Notfound    30  Apr 300 Notfound    Notfound    
ghi 40  xxyyzz  Notfound    40  Feb 800 Notfound    Notfound    40  May 800 Notfound    
ofg 50  xxyyzz  50  Jan 500 Notfound    50  Mar 1500    50  Apr 500 Notfound    50  Jun 1500    
mgf 60  xxyyzz  Notfound    60  Feb 1200    Notfound    Notfound    60  May 1200    Notfound    
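
Note that the output above prints a single Notfound per missing file, while the desired output in the question shows three (one per column). Below is a minimal sketch of a variant that pads each miss to three columns and separates every column with a tab. It is not the script from the answer above: it assumes every input line has exactly three whitespace-separated fields, and it reads the file names from ARGV (so an empty monthly file would still get its header columns). The scratch directory and the `missing` variable are only there for illustration.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'abc 10 xxyyzz' 'bdc 20 xxyyzz' 'edf 30 xxyyzz' \
              'ghi 40 xxyyzz' 'ofg 50 xxyyzz' 'mgf 60 xxyyzz' > Ref.txt
printf '%s\n' '10 Jan 100' '30 Jan 300' '50 Jan 500'               > Jan.txt
printf '%s\n' '10 Feb 200' '20 Feb 400' '40 Feb 800' '60 Feb 1200' > Feb.txt
printf '%s\n' '20 Mar 600' '50 Mar 1500'                           > Mar.txt
printf '%s\n' '10 Apr 100' '30 Apr 300' '50 Apr 500'               > Apr.txt
printf '%s\n' '10 May 200' '20 May 400' '40 May 800' '60 May 1200' > May.txt
printf '%s\n' '20 Jun 600' '50 Jun 1500'                           > Jun.txt

out=$(awk '
    BEGIN {
        OFS = "\t"
        # Assumption: three fields per line, so a miss fills three columns.
        missing = "Notfound" OFS "Notfound" OFS "Notfound"
    }
    # First file (Ref.txt): keep the lookup key (field 2) and the tab-joined line.
    NR == FNR { id[++n] = $2; $1 = $1; ref[n] = $0; next }
    # Monthly files: index the tab-joined line by (file name, field 1).
    { $1 = $1; value[FILENAME, $1] = $0 }
    END {
        # Header: each file name three times, in command-line order.
        for (f = 1; f < ARGC; f++)
            printf "%s%s", (f > 1 ? OFS : ""), ARGV[f] OFS ARGV[f] OFS ARGV[f]
        print ""
        for (row = 1; row <= n; row++) {
            line = ref[row]
            for (f = 2; f < ARGC; f++) {
                key = ARGV[f] SUBSEP id[row]
                line = line OFS ((key in value) ? value[key] : missing)
            }
            print line
        }
    }
' Ref.txt Jan.txt Feb.txt Mar.txt Apr.txt May.txt Jun.txt)

printf '%s\n' "$out"

# Remove the scratch directory again.
cd "$OLDPWD" && rm -rf "$dir"
```

With the sample data, every line of the result, header included, has 21 tab-separated columns (3 for Ref.txt plus 3 for each of the six monthly files), which is the layout shown under "Desired Output". If the real files carry a different number of fields, `missing` would need to be adjusted to match.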
