How do I copy the rows of one txt file whose first column appears in another txt file into a new txt file?
I need to extract over 200,000 annotations from the original annotation file (B.txt) by comparing the string in the first column.
For example:
A.txt looks like
00001.jpg
00002.jpg
00004.jpg
...
B.txt looks like
00001.jpg 12 3 1 33
00002.jpg 32 4 2 2
00003.jpg 23 4 5 1
00004.jpg 3 5 3 1
00005.jpg 2 4 1 1
...
I want to get a C.txt like
00001.jpg 12 3 1 33
00002.jpg 32 4 2 2
00004.jpg 3 5 3 1
...
The code I wrote doesn't seem to write any lines to C.txt:
alines = open('A.txt', 'r').readlines()
blines = open('B.txt', 'r').readlines()
fw = open('C.txt', 'w')
for al in alines:
    for bl in blines:
        if str(al) in str(bl):
            fw.write(bl)
fw.close()
Your code doesn't work because the lines in the alines and blines lists still end with the '\n' character. An entry of alines is therefore something like '00001.jpg\n', which is not a substring of '00001.jpg 12 3 1 33\n', so the test fails for every line that ends with a newline.
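The failing comparison can be reproduced in isolation. This is only a small illustration using the first sample row from the question; the variable names are chosen for the example.

```python
# One entry as readlines() would return it from A.txt and from B.txt.
al = "00001.jpg\n"
bl = "00001.jpg 12 3 1 33\n"

# The newline is part of al, and in bl the name is followed by a space.
with_newline = al in bl
# Once the newline is stripped, the name is found.
without_newline = al.rstrip("\n") in bl

print(with_newline)     # False
print(without_newline)  # True
```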
The following code strips the '\n' characters and also eliminates the second "for" loop:
with open('A.txt', 'r') as fh:
    # splitlines() gets rid of the '\n' line endings
    alines = fh.read().splitlines()
with open('B.txt', 'r') as fh:
    # splitlines() gets rid of the '\n' line endings
    blines = fh.read().splitlines()
with open('C.txt', 'w') as fh:
    for line in blines:
        # Split off the file name (the first column)
        parts = line.split(' ', 1)
        # Look up the file name
        if parts[0] in alines:
            fh.write(line + '\n')
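With more than 200,000 rows, the "in alines" test above still scans a list for every line of B.txt. A possible variant, sketched here under the assumption that the file names in A.txt are unique keys and that columns are separated by whitespace, stores the names in a set so each lookup takes roughly constant time. The function name filter_annotations and its parameters are made up for this example.

```python
def filter_annotations(names_path, annotations_path, output_path):
    """Copy annotation lines whose first column is listed in names_path.

    Returns the number of lines written. Hypothetical helper, not part
    of the original answer.
    """
    with open(names_path, 'r') as fh:
        # A set gives average O(1) membership tests; blank lines are skipped.
        wanted = {line.strip() for line in fh if line.strip()}

    kept = 0
    with open(annotations_path, 'r') as src, open(output_path, 'w') as dst:
        for line in src:
            # First column only, split on any run of whitespace.
            parts = line.split(None, 1)
            if parts and parts[0] in wanted:
                # Make sure every written line ends with a newline.
                dst.write(line if line.endswith('\n') else line + '\n')
                kept += 1
    return kept
```

Calling filter_annotations('A.txt', 'B.txt', 'C.txt') should then write the matching rows to C.txt in the order they appear in B.txt. Because it compares the whole first column rather than testing for a substring, a name such as 1.jpg would not accidentally match 00001.jpg.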