
How to join multiple sorted files in Python alphabetically?

How can I read multiple sorted CSV input files line by line, compare the current line from each file, write the line that comes first alphabetically to an output file, and then advance the pointer in that line's file, repeating the comparison across all files until the end of every input file is reached? Here's some rough planning toward a solution.

buffer = []

for inFile in inFiles:
    f = open(inFile, "r")
    line = next(f)  # first line of this file; next(f) works in Python 2.6+ and 3
    buffer.append([line, inFile])

# find minimum value in buffer alphabetically...
# write it to an output file...

# how do I advance one line in the file with the min value?
# and then continue the line-by-line comparisons in input files?

You can use heapq.merge, which lazily merges already-sorted iterables into a single sorted stream. In Python 2:

import heapq
import contextlib

files = [open(fn) for fn in inFiles]
with contextlib.nested(*files):
    with open('output', 'w') as f:
        f.writelines(heapq.merge(*files))

In Python 3.3+, where contextlib.nested no longer exists, use contextlib.ExitStack instead:

import heapq
import contextlib

with contextlib.ExitStack() as stack:
    files = [stack.enter_context(open(fn)) for fn in inFiles]
    with open('output', 'w') as f:
        f.writelines(heapq.merge(*files))
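
To answer the "how do I advance one line in the file with the min value?" part of the question directly, here is a minimal sketch of roughly what heapq.merge does internally, written out by hand. The function name merge_sorted_files and its parameters are hypothetical, not part of any library, and the sketch assumes each input file is already sorted and that every line ends with a newline.

```python
import heapq
from contextlib import ExitStack


def merge_sorted_files(in_paths, out_path):
    """Merge already-sorted text files into out_path, line by line.

    Hypothetical helper for illustration; assumes every line ends in a newline.
    """
    with ExitStack() as stack:
        files = [stack.enter_context(open(p)) for p in in_paths]
        out = stack.enter_context(open(out_path, "w"))

        # Seed the heap with the first line of each non-empty file.
        # The index i breaks ties and records which file the line came from.
        heap = []
        for i, f in enumerate(files):
            line = next(f, None)
            if line is not None:
                heap.append((line, i))
        heapq.heapify(heap)

        while heap:
            line, i = heapq.heappop(heap)  # smallest current line across all files
            out.write(line)
            nxt = next(files[i], None)     # advance only the file that supplied it
            if nxt is not None:
                heapq.heappush(heap, (nxt, i))
```

Calling merge_sorted_files(inFiles, 'output') should produce the same file as the heapq.merge versions above; in practice heapq.merge is the shorter choice, and the manual loop is mainly useful if you need extra per-line logic such as skipping duplicates.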
