Using CSV module to append multiple files while removing appended headers

I would like to use the Python csv module to open a CSV file for appending. Then, from a list of CSV files, I would like to read each file and write its rows to that output file. My script works, except that I cannot find a way to drop the header from every input file but the first. I am fairly sure my else block is not executing properly. Perhaps the syntax of my if/else code is the problem? Any thoughts would be appreciated.

writeFile = open(append_file,'a+b')
writer = csv.writer(writeFile,dialect='excel')
    for files in lstFiles:
        readFile = open(input_file,'rU')
        reader = csv.reader(readFile,dialect='excel')
        for i in range(0,len(lstFiles)):
            if i == 0:
                oldHeader = readFile.readline() 
                newHeader = writeFile.write(oldHeader) 
                for row in reader: 
                    writer.writerow(row)
            else:
                reader.next()
                for row in reader:
                    row = readFile.readlines()
                    writer.writerow(row)
        readFile.close()
writeFile.close() 

You're effectively iterating over lstFiles twice. For each file in your list, you're running your inner for loop up from 0, so the i == 0 branch fires for every file. You want something like:

import csv

# Python 2 code, as in the question (binary append mode, reader.next()).
writeFile = open(append_file,'a+b')
writer = csv.writer(writeFile,dialect='excel')
headers_needed = True
for input_file in lstFiles:
    readFile = open(input_file,'rU')
    reader = csv.reader(readFile,dialect='excel')
    # Always consume the header row so it is not copied with the data rows.
    oldHeader = reader.next()
    if headers_needed:
        # Write the header once, from the first file only.
        writer.writerow(oldHeader)
        headers_needed = False
    for row in reader:
        writer.writerow(row)
    readFile.close()
writeFile.close()

You could also use enumerate over lstFiles to iterate over tuples containing the iteration count and the filename, but I think the boolean shows the logic more clearly.
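For comparison, here is a minimal sketch of the enumerate variant, written for Python 3 rather than the Python 2 of the question. The function name append_csvs and its parameters are hypothetical, not from the original post. It assumes every input file starts with a header row.

```python
import csv


def append_csvs(append_path, input_paths):
    """Append the rows of each input CSV to append_path, writing only the first header."""
    # newline="" lets the csv module manage line endings itself (Python 3).
    with open(append_path, "a", newline="") as out_file:
        writer = csv.writer(out_file, dialect="excel")
        for index, path in enumerate(input_paths):
            with open(path, newline="") as in_file:
                reader = csv.reader(in_file, dialect="excel")
                # Consume the header of every file; None means the file was empty.
                header = next(reader, None)
                if index == 0 and header is not None:
                    writer.writerow(header)
                # Copy the remaining data rows unchanged.
                for row in reader:
                    writer.writerow(row)
```

One difference from the boolean version: if the first file happens to be empty, this sketch never writes a header, whereas a flag that is only cleared after a successful write would take the header from the next file.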

You probably do not want to mix iterating over the csv reader with calling readline directly on the underlying file.
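One concrete way the two can disagree: a quoted CSV field may contain a newline, so one logical record can span several physical lines. The small Python 3 illustration below uses made-up sample data (not from the question) to compare what readline and the csv reader each treat as the header.

```python
import csv
import io

# Hypothetical sample: the second header field contains an embedded newline.
data = 'id,"first\nsecond"\n1,2\n'

# readline stops at the first physical newline, in the middle of the quoted field.
raw_header = io.StringIO(data).readline()

# The csv reader keeps reading until the quoted field is closed.
reader = csv.reader(io.StringIO(data))
parsed_header = next(reader)
remaining_rows = list(reader)

print(repr(raw_header))
print(parsed_header)
print(remaining_rows)
```

Here readline returns only part of the header record, while next(reader) returns both fields. If the header is taken with readline and the rest is handed to the reader, the leftover half of the header is parsed as data.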

I think you're iterating too many times (over various things: both your list of files and the files themselves). You've definitely got some consistency problems; it's a little hard to be sure since we can't see your variable initializations. This is what I think you want:

with open(append_file,'a+b') as writeFile:
    need_headers = True
    for input_file in lstFiles:
        with open(input_file,'rU') as readFile:
            headers = readFile.readline()
            if need_headers:
                # Write the headers only if we need them
                writeFile.write(headers)
                need_headers = False
            # Now write the rest of the input file.
            for line in readFile:
                writeFile.write(line)

I took out all the csv-specific stuff since there's no reason to use it for this operation. I also cleaned the code up considerably to make it easier to follow, using the files as context managers and a well-named boolean instead of the "magic" i == 0 check. The result is a much nicer block of code that (hopefully) won't have you jumping through hoops to understand what's going on.
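The code above is Python 2; the 'rU' mode was removed in Python 3.11, and writing text to a file opened with 'a+b' fails on Python 3. A rough Python 3 sketch of the same line-based approach follows. The function name concat_keeping_first_header is hypothetical. It also adds something the original does not do: it assumes an input file may lack a final newline, and adds "\n" in that case so the next file's first row is not glued onto the previous line.

```python
def concat_keeping_first_header(append_path, input_paths):
    """Append input files line by line, keeping only the first header line."""
    need_header = True
    # newline="" disables newline translation so existing line endings are preserved.
    with open(append_path, "a", newline="") as out_file:
        for path in input_paths:
            with open(path, newline="") as in_file:
                header = in_file.readline()  # "" if the file is empty
                if need_header and header:
                    out_file.write(_terminated(header))
                    need_header = False
                # Copy the rest of the input file.
                for line in in_file:
                    out_file.write(_terminated(line))


def _terminated(line):
    """Return line unchanged if it ends with a line break, else add one (assumed '\\n')."""
    return line if line.endswith(("\n", "\r")) else line + "\n"
```

This treats the header as a single physical line, so it only suits files whose header row contains no embedded newlines.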


 