
Concatenate a large number of files line for line in Python

Thanks for lending your eyes here.

I'm processing some spectral data that is in the form of several hundred text files (1.txt, 2.txt, 3.txt, ...), all formatted with exactly the same number of lines. For clarity:

1.txt:             2.txt:            3.txt:
1,5                1,4               1,7
2,8                2,9               2,14
3,10               3,2               3,5
4,13               4,17              4,9
<...>              <...>             <...>
4096,1             4096,7            4096,18

I'm attempting to concatenate them line by line so that I walk away with one output file like:

5,4,7
8,9,14
10,2,5
13,17,9
<...>
1,7,18

I'm very new to Python, and I'd really appreciate some help here. I've attempted this mess:

howmanyfiles=8
output=open('output.txt','w+')
for j in range(howmanyfiles):
    fp=open(str(j+1) + '.txt','r')
    if j==0:
        for i, line in enumerate(fp):
            splitline=line.split(",")
            output.write(splitline[1])
    else:
        output.close()
        output=open('output.txt','r+')
        for i, line in enumerate(fp):
            splitline=line.split(",")
            output.write(output.readline(i)[:-1]+","+splitline[1])
    fp.close()
output.close()

My line of thinking in the above is that I need to place the cursor back at the beginning of the document for each file, but it's really blowing up in my face.

Thanks dearly.

-matt

I think you can get a lot of mileage out of the zip built-in function, which will let you iterate over all the input files at the same time:

from contextlib import ExitStack

num_files = 8
with open("output.txt", "w") as output, ExitStack() as stack:
    # ExitStack closes every input file when the with block exits
    files = [stack.enter_context(open("{}.txt".format(i + 1)))
             for i in range(num_files)]
    for lines in zip(*files):  # lines is a tuple with one line from each file
        # strip the newline, then keep only the value after the first comma
        values = [line.rstrip("\n").partition(",")[2] for line in lines]
        output.write(",".join(values) + "\n")
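
To see how zip lines the inputs up, here is a small self-contained sketch that swaps the real files for in-memory io.StringIO objects. The sample values are the first four rows shown in the question; the variable names are only illustrative.

```python
import io

# In-memory stand-ins for 1.txt, 2.txt and 3.txt (first four rows from the question).
files = [
    io.StringIO("1,5\n2,8\n3,10\n4,13\n"),
    io.StringIO("1,4\n2,9\n3,2\n4,17\n"),
    io.StringIO("1,7\n2,14\n3,5\n4,9\n"),
]

rows = []
for lines in zip(*files):  # one line from each "file" per iteration
    # Drop the newline, then keep only the text after the first comma.
    values = [line.rstrip("\n").partition(",")[2] for line in lines]
    rows.append(",".join(values))

print("\n".join(rows))
```

Because zip stops at the shortest input, this relies on every file having the same number of lines, which the question says is the case.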

Here's a fun way to do it with generators:

import sys

files     = sys.argv[1:]
handles   = (open(f) for f in files)
readers   = ((line.strip() for line in h) for h in handles)
splitters = ((line.split(',')[1] for line in r) for r in readers)
joiners   = (",".join(row) for row in zip(*splitters))  # one row per line number

for j in joiners:
    print(j)

You might also look into the Unix paste command.
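
For comparison, a rough Python analogue of what `paste -d,` does might look like the sketch below. The function name `paste` and its parameters are made up for illustration. Like the Unix tool, it joins whole lines and pads inputs that run out early with empty fields, so for the data in the question the index column would still have to be dropped separately.

```python
from itertools import zip_longest


def paste(line_sources, delimiter=","):
    """Yield corresponding lines from several iterables, joined by delimiter."""
    # zip_longest pads exhausted inputs with "" instead of stopping early.
    for row in zip_longest(*line_sources, fillvalue=""):
        yield delimiter.join(line.rstrip("\n") for line in row)


# Lists of lines stand in for open file objects here.
first = ["1,5\n", "2,8\n"]
second = ["1,4\n", "2,9\n"]
pasted = list(paste([first, second]))
print(pasted)
```

Open file objects can be passed in place of the lists, since iterating over a file yields its lines.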


 