How to take value from 1 txt file and append to another one?

I have 2 text files. "A.txt" contains

A 1 AB ... 1 5 -3 4.5 (contains 11 columns. So "4.5" is in the 11th column)
A 2 BC ... -2 3 8 9.2
A 3 WE ... 2 3 8 5.2
A 4 RT ...  23 2 24 4.1 
...
END

"B.txt" is similar, except that its final 2 columns differ from those of "A.txt". Another difference is that "B.txt" contains some additional lines that are not in "A.txt". For example, the third line, A 3 QEW ... 5 23 34 5, is in "B.txt" but not in "A.txt":

A 1 AB ... 1 5 4 9
A 2 BC ... -2 3 1 0
A 3 QEW ... 5 23 34 5
A 4 WE ... 2 3 -7 56
A 5 RT ...  23 2 -5 14 
...
END

What I want to do is extract the value of the last column in each line of "A.txt" and append it to the corresponding line in "B.txt". And for each line in "B.txt" that is not in "A.txt", I want to append the value 1 if the 3rd column element begins with the letter "Q" (for example, QEW) and the value 2 otherwise. So the output should look like

A 1 AB ... 1 5 4 9 4.5
A 2 BC ... -2 3 1 0 9.2
A 3 QEW ... 5 23 34 5 1 
A 4 WE ... 2 3 -7 56 5.2
A 5 RT ...  23 2 -5 14 4.1
...
END

I tried the code below but it generated no output. Am I doing something wrong?

def main():
        #enter python code.py A.txt B.txt in command line
        A = open(sys.argv[1])

        AAlist = []
        TE = []
        i=1
        for line in A:
            linestr = ' '.join(line.split())   
            if linestr[1]==i:
                AAlist.append(linestr[2])
                TE.append(linestr[10])
            i+=1

        BAlist = []
        i=0
        j=0
        with open(sys.argv[2]) as B, open('outputpy.txt', 'w') as out_file:
            for line in B:
                linestr = ' '.join(line.split())   
                if linestr[1]==j:
                    at = linestr[2]
                    BAlist.append(atm)
                    if at!=AAlist[i]:
                        if at[0]=='Q':
                            out_file.write(1)
                        else:
                            out_file.write(2)              

                    #print >> outfile
                    out_file.write(TE[i])
                    i+=1
                    j+=1
        print "finished"

Is there a way to do the manipulation I want using Linux commands? Is it any easier than the Python code?

EDIT: I showed what the output should look like

If I understood you correctly, the following awk script should do what you want:

NR==FNR{
    # first file (A.txt): remember the 11th field, keyed by the 3rd field
    arr[$3] = $11
    next
}
{
    # second file (B.txt): append the stored value, or 1/2 if there is no match
    if ($3 in arr){
        print($0, arr[$3])
    }else if ($3 ~ /^Q/){
        print($0, "1")
    }else{
        print($0, "2")
    }
}

Run it with

awk -f script.awk A.txt B.txt

NR==FNR is true only while awk is reading the first file (A.txt here), so the first block just fills an array, called arr in this case. Each entry uses the 3rd field as its key and the 11th field as its value. You can replace $3 with e.g. $1$2$3 in both blocks if the match in the second file should be done on the first three fields (see also my comment under your question).

For the second file, if the key is found in the array, we append the stored value. If not, we check whether the 3rd field begins with a Q, using the regular-expression match $3 ~ /^Q/ (a plain == comparison against "^Q" would only match the literal string ^Q). If it does, we append a 1. Otherwise we append a 2.
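Since the question started out in Python, the same lookup-then-append idea can be sketched there as well. This is only a minimal sketch under a few assumptions: lines are matched on the 3rd column alone, the value taken from A.txt is its last column, lines with too few fields (such as END) are copied unchanged, and unmatched lines follow the rule stated in the question (1 if the 3rd column starts with Q, 2 otherwise). The function names build_lookup and append_values are made up for illustration.

```python
import sys


def build_lookup(a_lines, key_col=2):
    """Map the key column (0-based) of each data line in A to its last column."""
    lookup = {}
    for line in a_lines:
        fields = line.split()
        if len(fields) <= key_col:
            continue  # skips short lines such as "END" or blank lines
        lookup[fields[key_col]] = fields[-1]
    return lookup


def append_values(b_lines, lookup, key_col=2):
    """Return the lines of B with one extra column appended to each data line."""
    result = []
    for line in b_lines:
        fields = line.split()
        if len(fields) <= key_col:
            result.append(line.rstrip("\n"))  # e.g. "END": copy unchanged
            continue
        key = fields[key_col]
        if key in lookup:
            extra = lookup[key]      # value taken from A.txt
        elif key.startswith("Q"):
            extra = "1"              # not in A.txt, key begins with Q
        else:
            extra = "2"              # not in A.txt, any other key
        result.append(line.rstrip() + " " + extra)
    return result


# Usage (assumed): python code.py A.txt B.txt > outputpy.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as a_file:
        table = build_lookup(a_file)
    with open(sys.argv[2]) as b_file:
        for out_line in append_values(b_file, table):
            print(out_line)
```

Note that line.split() returns a list of fields, whereas ' '.join(line.split()) turns that list back into a single string, so indexing it yields single characters rather than columns.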
