How do I concatenate multiple CSV files row-wise using Python?

I have a dataset of about 10 CSV files. I want to combine those files row-wise into a single CSV file.

What I tried:

import csv

fout = open("claaassA.csv", "a")
writer = csv.writer(fout)
# first file:
for line in open("a01.ihr.60.ann.csv"):
    print(line)
    writer.writerow(line)
# now the rest:
for num in range(2, 10):
    print(num)
    f = open("a0" + str(num) + ".ihr.60.ann.csv")
    # next(f)  # skip the header
    for line in f:
        print(line)
        writer.writerow(line)
    # f.close()  # not really needed
fout.close()

The question definitely needs more details (ideally examples of the inputs and the expected output).

Given the little information provided, I will assume that all files are valid CSV and that they all have the same number of lines (rows). I'll also assume that memory is not a concern (i.e. they are "small" files that fit together in memory). Furthermore, I assume that line endings are newlines (\n).
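The equal-row-count assumption can be checked before merging. This is only a minimal sketch; the helper names `count_lines` and `same_line_count` are hypothetical and not part of the question.

```python
def count_lines(path):
    """Return the number of lines in a text file."""
    with open(path, "r") as fh:
        # Iterating the handle reads one line at a time,
        # so the whole file is never held in memory.
        return sum(1 for _ in fh)


def same_line_count(paths):
    """Return True if every file in `paths` has the same number of lines."""
    counts = {count_lines(p) for p in paths}
    return len(counts) <= 1
```

For example, `same_line_count(['file1.csv', 'file2.csv', 'file3.csv'])` should be true before the files are joined line by line.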

If all these assumptions are valid, then you can do something like this:

input_files = ['file1.csv', 'file2.csv', 'file3.csv']
output_file = 'output.csv'

output = None
for infile in input_files:
    with open(infile, 'r') as fh:
        if output:
            for i, l in enumerate(fh.readlines()):
                output[i] = "{},{}".format(output[i].rstrip('\n'), l)
        else:
            output = fh.readlines()

with open(output_file, 'w') as fh:
    for line in output:
        fh.write(line) 

There are probably more efficient ways, but this is a quick and dirty way to achieve what I think you are asking for.
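One possibly more memory-friendly variant, sketched under the same assumptions (equal line counts, `\n` line endings): open all inputs at once and let `zip()` pull one line from each per iteration. The function name `paste_files` is hypothetical.

```python
from contextlib import ExitStack


def paste_files(input_files, output_file):
    """Join line i of every input file with commas into line i of the output."""
    with ExitStack() as stack:
        # ExitStack closes every opened file when the block exits.
        handles = [stack.enter_context(open(p, "r")) for p in input_files]
        out = stack.enter_context(open(output_file, "w"))
        # zip() yields one line per file at a time and stops at the
        # shortest file, so extra lines in longer files are dropped.
        for lines in zip(*handles):
            out.write(",".join(line.rstrip("\n") for line in lines) + "\n")
```

Calling `paste_files(['file1.csv', 'file2.csv', 'file3.csv'], 'output.csv')` should produce the same result as the loop above, without keeping whole files in memory.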


The previous answer implicitly assumes we need to do this in Python. If bash is an option, then you could use the paste command. For example:

paste -d, file1.csv file2.csv file3.csv > output.csv

I don't fully understand why you use the csv library. Actually, it's enough to fill the output file with the lines from the given files (if they have the same column names and order).

input_path_list = [
    "a01.ihr.60.ann.csv",
    "a02.ihr.60.ann.csv",
    "a03.ihr.60.ann.csv",
    "a04.ihr.60.ann.csv",
    "a05.ihr.60.ann.csv",
    "a06.ihr.60.ann.csv",
    "a07.ihr.60.ann.csv",
    "a08.ihr.60.ann.csv",
    "a09.ihr.60.ann.csv",
]
output_path = "claaassA.csv"

with open(output_path, "w") as fout:
    header_written = False

    for input_path in input_path_list:
        with open(input_path) as fin:
            header = next(fin)  # works in Python 2.6+ and Python 3

            # it adds the header at the beginning and skips other headers
            if not header_written:
                fout.write(header)
                header_written = True

            # it adds all rows
            for line in fin:
                fout.write(line)
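If it is not certain that every file has the same header, or a file might lack a trailing newline on its last line, the csv module can do the same job while checking that condition. This is a sketch, not the only way to do it; the function name `concat_csv` is hypothetical.

```python
import csv


def concat_csv(input_paths, output_path):
    """Append the rows of every input CSV to one output CSV with a single header."""
    expected_header = None
    with open(output_path, "w", newline="") as fout:
        writer = csv.writer(fout)
        for path in input_paths:
            with open(path, newline="") as fin:
                reader = csv.reader(fin)
                header = next(reader, None)  # None means the file is empty
                if header is None:
                    continue
                if expected_header is None:
                    # The first non-empty file defines the header.
                    expected_header = header
                    writer.writerow(header)
                elif header != expected_header:
                    raise ValueError(
                        "{} has a different header: {}".format(path, header)
                    )
                # Copy the remaining data rows unchanged.
                writer.writerows(reader)
```

With the file list above, this would be called as `concat_csv(input_path_list, output_path)`.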
