How to join multiple sorted files in Python alphabetically?
How can I read multiple CSV input files line by line, compare the current line from each file, write the line that comes first alphabetically to an output file, and then advance the pointer of the file that line came from, repeating the comparison across all files until the end of every input file is reached?

Here's some rough planning toward a solution.
buffer = []
for inFile in inFiles:
    f = open(inFile, "r")
    line = next(f)  # first line of this file
    buffer.append([line, inFile])
# find minimum value in buffer alphabetically...
# write it to an output file...
# how do I advance one line in the file with the min value?
# and then continue the line-by-line comparisons in input files?
You can use heapq.merge:
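import heapq
import contextlib

# Python 2 only: contextlib.nested was removed in Python 3.
# heapq.merge assumes each input file is already sorted.
files = [open(fn) for fn in inFiles]
with contextlib.nested(*files):
    with open('output', 'w') as f:
        f.writelines(heapq.merge(*files))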
In Python 3.x (3.3+):
import heapq
import contextlib

with contextlib.ExitStack() as stack:
    files = [stack.enter_context(open(fn)) for fn in inFiles]
    with open('output', 'w') as f:
        f.writelines(heapq.merge(*files))
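For comparison, the manual loop the question asks about (take the smallest pending line, write it, advance only the file it came from) can be sketched with a heap directly. This is a minimal illustration, not what heapq.merge does internally line for line. The names merge_sorted_files, in_paths and out_path are hypothetical, and the sketch assumes each input file is already sorted and that every line, including the last, ends with a newline.

```python
import heapq
from contextlib import ExitStack


def merge_sorted_files(in_paths, out_path):
    """Merge already-sorted text files line by line into out_path.

    Hypothetical helper for illustration; assumes every line ends
    with a newline and each input file is sorted.
    """
    with ExitStack() as stack:
        files = [stack.enter_context(open(p)) for p in in_paths]
        out = stack.enter_context(open(out_path, "w"))

        # Seed the heap with the first line of each non-empty file.
        # The file index breaks ties, so file objects are never compared.
        heap = []
        for idx, f in enumerate(files):
            line = next(f, None)
            if line is not None:
                heap.append((line, idx))
        heapq.heapify(heap)

        while heap:
            # Pop the alphabetically smallest pending line.
            line, idx = heapq.heappop(heap)
            out.write(line)
            # Advance only the file that line came from.
            nxt = next(files[idx], None)
            if nxt is not None:
                heapq.heappush(heap, (nxt, idx))
```

Calling merge_sorted_files(inFiles, 'output') should produce the same file as the heapq.merge versions above under those assumptions. In practice heapq.merge is the shorter choice; the explicit loop is mainly useful if you need to hook extra logic into each step, such as skipping duplicate lines.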