How do I copy rows that share a first-column value between two txt files into a new txt file?

Question

I need to extract over 200,000 annotations from the original annotation file (B.txt) by matching on the string in the first column.

For example:

A.txt looks like this:

00001.jpg
00002.jpg
00004.jpg
...

B.txt looks like this:

00001.jpg 12 3 1 33
00002.jpg 32 4 2 2
00003.jpg 23 4 5 1
00004.jpg 3 5 3 1
00005.jpg 2 4 1 1
...

I want to get a C.txt like this:

00001.jpg 12 3 1 33
00002.jpg 32 4 2 2
00004.jpg 3 5 3 1
...

The code I wrote doesn't seem to write any lines to C.txt:

alines = open('A.txt', 'r').readlines() 
blines = open('B.txt', 'r').readlines()
fw = open('C.txt', 'w')
for al in alines:
    for bl in blines:
        if str(al) in str(bl):
            fw.write(bl)
fw.close()

Answer 1

Your code doesn't work because readlines() keeps the trailing '\n' on every line. Each element of alines is therefore something like '00001.jpg\n', and that string never occurs inside a line of B.txt, where the file name is followed by a space, so the comparison always fails.
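To see the failure in isolation, here is a minimal sketch using two hard-coded sample lines (hypothetical stand-ins for what readlines() would return from A.txt and B.txt):

```python
# Hypothetical sample lines, shaped like the output of readlines()
al = '00001.jpg\n'
bl = '00001.jpg 12 3 1 33\n'

# The search string includes the trailing newline, but in bl the
# file name is followed by a space, so the substring is not found.
raw_match = al in bl

# Once the newline is stripped, the file name is found.
stripped_match = al.strip() in bl

print(raw_match, stripped_match)
```

Note that a substring test can still over-match after stripping (for example, '1.jpg' is contained in '00001.jpg 12 3 1 33'), so comparing against the first column exactly is the safer choice.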

The following code strips the '\n' characters and also eliminates the second "for" loop:

with open('A.txt', 'r') as fh:
    # Splitlines gets rid of the '\n' endlines
    alines = fh.read().splitlines()
with open('B.txt', 'r') as fh:
    # Splitlines gets rid of the '\n' endlines
    blines = fh.read().splitlines()
with open('C.txt', 'w') as fh:
    for line in blines:
        # Split the file name
        parts = line.split(' ', 1)
        # Look up the filename
        if parts[0] in alines:
            fh.write(line + '\n')
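With more than 200,000 lines, the `in alines` test above scans a list on every iteration, which can be slow. A possible variant, sketched here under the assumption that columns are separated by whitespace, stores the names from A.txt in a set so each lookup takes constant time on average. The function name `filter_annotations` and its parameters are made up for this illustration:

```python
def filter_annotations(names_path, annotations_path, output_path):
    """Copy the lines of annotations_path whose first column is listed in names_path."""
    with open(names_path, 'r') as fh:
        # Build a set of names; blank lines are skipped
        wanted = {line.strip() for line in fh if line.strip()}

    with open(annotations_path, 'r') as src, open(output_path, 'w') as dst:
        for line in src:
            # Split once on any whitespace to isolate the first column
            parts = line.split(None, 1)
            if parts and parts[0] in wanted:
                # Make sure every written line ends with a newline
                dst.write(line if line.endswith('\n') else line + '\n')
```

Calling `filter_annotations('A.txt', 'B.txt', 'C.txt')` should produce the same C.txt as the code above, with the matching lines in the order they appear in B.txt.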

Solution 1
0 2019-11-06 08:21:25
