How to order the content of a set of files based on one file with Python
I'm stuck on a problem regarding the order of strings inside 3 text files. Here is an example:
file_A.txt
ID value
AAA 1
BBB 2
CCC 3
file_B.txt
ID value
BBB 2
AAA 1
CCC 3
file_C.txt
ID value
CCC 3
AAA 1
BBB 2
As you can see, all 3 files contain the same strings, but in a different order in each file. I'd like to use the order of the strings in file_A.txt as a template for the other files, so that file_B.txt and file_C.txt are modified like this:
file_A.txt
ID value
AAA 1
BBB 2
CCC 3
file_B.txt
ID value
AAA 1
BBB 2
CCC 3
file_C.txt
ID value
AAA 1
BBB 2
CCC 3
Thanks for any tips and help.
Given these files:
$ head *.txt
==> file_a.txt <==
ID value
AAA 1
CCC 3
BBB 2
==> file_b.txt <==
ID value
BBB 2
AAA 1
CCC 3
==> file_c.txt <==
ID value
CCC 3
AAA 1
BBB 2
You could do something like this:
from pathlib import Path

p = Path('path_to_files')  # directory that contains the files
id_file = p / 'file_a.txt'

# create an index from file_a: ID -> position (the header line is skipped)
with open(id_file) as f:
    next(f)
    idx = {line.split()[0]: i for i, line in enumerate(f)}

# for the other files, use that index to sort
for fn in p.glob('file_[bc].txt'):  # glob could be 'file_[!a].txt' too
    with open(fn, 'r+') as f:
        dat = f.readlines()
        f.seek(0)
        # keep the header first; IDs missing from file_a sort to the end
        for line in [dat[0]] + sorted(dat[1:],
                key=lambda l: idx.get(l.split()[0], len(idx))):
            f.write(line)
        f.truncate()  # drop leftover bytes if the new content is shorter
Result:
$ head *.txt
==> file_a.txt <==
ID value
AAA 1
CCC 3
BBB 2
==> file_b.txt <==
ID value
AAA 1
CCC 3
BBB 2
==> file_c.txt <==
ID value
AAA 1
CCC 3
BBB 2
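The same idea can also be wrapped in a small function, which makes it easier to try out on throwaway files. The following is only a sketch under a few assumptions: the name reorder_like is made up for illustration, every file is assumed to start with a header line, the ID is assumed to be the first whitespace-separated column, and IDs that do not appear in the reference file are placed at the end. Unlike the in-place 'r+' approach above, it rewrites the whole file with write_text.

```python
from pathlib import Path
import tempfile


def reorder_like(reference: Path, target: Path) -> None:
    """Rewrite `target` so its data rows follow the ID order of `reference`."""
    ref_lines = reference.read_text().splitlines()
    # Map each ID (first column) to its position, skipping the header and blank lines.
    rows = [ln for ln in ref_lines[1:] if ln.strip()]
    order = {ln.split()[0]: i for i, ln in enumerate(rows)}

    lines = [ln for ln in target.read_text().splitlines() if ln.strip()]
    header, body = lines[0], lines[1:]
    # IDs absent from the reference go last; the sort is stable, so they keep their order.
    body.sort(key=lambda ln: order.get(ln.split()[0], len(order)))
    target.write_text("\n".join([header, *body]) + "\n")


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "file_a.txt").write_text("ID value\nAAA 1\nCCC 3\nBBB 2\n")
    (d / "file_b.txt").write_text("ID value\nBBB 2\nAAA 1\nCCC 3\n")
    # file_c has an extra ID (DDD) that file_a does not know about.
    (d / "file_c.txt").write_text("ID value\nCCC 3\nAAA 1\nDDD 4\nBBB 2\n")
    for path in sorted(d.glob("file_[!a].txt")):
        reorder_like(d / "file_a.txt", path)
    result_b = (d / "file_b.txt").read_text()
    result_c = (d / "file_c.txt").read_text()
    print(result_b)
    print(result_c)
```

Here file_b.txt ends up in the order AAA, CCC, BBB, and file_c.txt gets the same order followed by the unknown DDD row.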