Read two files and filter the second file based on a column of the first

Question

I have an input file containing keywords, and a CSV file that needs to be filtered by those keywords.

This is my attempt to automate the task with Python.

import csv
with open('Input.txt', 'rb') as InputFile:
    with open('28JUL2017.csv', 'rb') as CM_File:
        read_Input=csv.reader(InputFile)
        for row1 in csv.reader(InputFile):
            #print row1

            read_CM=csv.reader(CM_File)
            next(read_CM, None)
            for row2 in csv.reader(CM_File):
                #print row2
                if row1[0] == row2[0] :

                    Output= row2[0]+","+row2[1]+","+row2[5]+","+row2[6]
                    print Output

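I only get matches for the first row of the file that is to be filtered. I have tried various things but cannot work out where I am going wrong. Please point out my mistake.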

Answer 1

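read_Input and read_CM are essentially iterators. Once you have iterated over them, you are done: you cannot iterate over them a second time. If you insist on doing it this way, you have to rewind to the beginning of the file every time you want to start a new loop and "re-read" the CSV file. A fix: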

import csv
with open('file1.csv', 'rb') as InputFile:
    with open('file2.csv', 'rb') as CM_File:
        read_Input = csv.reader(InputFile)
        for row1 in read_Input:
            CM_File.seek(0)  # rewind to the beginning of the file
            read_CM = csv.reader(CM_File)
            next(read_CM, None)  # skip the first line
            for row2 in read_CM:
                if row1[0] == row2[0]:
                    Output = row2[0]+","+row2[1]+","+row2[5]+","+row2[6]
                    print Output
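
To make the exhaustion behaviour concrete, here is a minimal Python 3 sketch. It assumes an in-memory `io.StringIO` object as a stand-in for a real file (the sample rows are made up), and shows that a second pass over the same file object yields nothing until the file is rewound with `seek(0)`:

```python
import csv
import io

# Hypothetical in-memory stand-in for a CSV file on disk.
data = io.StringIO("a,1\nb,2\n")

first_pass = list(csv.reader(data))   # consumes the underlying file object
second_pass = list(csv.reader(data))  # nothing left to read, so this is empty

data.seek(0)                          # rewind to the beginning of the file
third_pass = list(csv.reader(data))   # the rows are available again

print(first_pass)
print(second_pass)
print(third_pass)
```

Creating a new `csv.reader` does not help on its own, because the reader only wraps the file object, and it is the file object's position that has reached the end.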

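Instead, I would suggest iterating over rows that have already been read rather than re-reading the file. Also, rather than using nested loops, build a list of "keywords" and simply check whether row2[0] is in that list: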

import csv
with open('file1.csv', 'rb') as InputFile:
    with open('file2.csv', 'rb') as CM_File:
        read_Input = csv.reader(InputFile) # read file only once
        keywords = [rec[0] for rec in read_Input]
        read_CM = csv.reader(CM_File) # read file only once
        next(read_CM, None) # not sure why you do this? to skip first line?
        for row2 in read_CM:
            if row2[0] in keywords:
                Output = row2[0]+","+row2[1]+","+row2[5]+","+row2[6]
                print("Output: {}".format(Output))
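
As a variation on the same idea, the sketch below targets Python 3 and stores the keywords in a set, which makes each membership test constant-time on average. The function name `filter_rows`, the `columns` parameter, and the sample data (including the header names) are hypothetical and only serve the illustration; the column indices 0, 1, 5 and 6 come from the code above.

```python
import csv
import io

def filter_rows(keyword_file, data_file, columns=(0, 1, 5, 6)):
    """Yield the selected columns of each data row whose first field is a keyword."""
    # Read the keyword file once; skip blank lines, which csv.reader returns as [].
    keywords = {row[0] for row in csv.reader(keyword_file) if row}
    reader = csv.reader(data_file)
    next(reader, None)  # skip the first line, as the code above does
    for row in reader:
        if row and row[0] in keywords:
            yield [row[i] for i in columns]

# Hypothetical sample data standing in for the two files.
keywords_src = io.StringIO("ABC\nXYZ\n")
data_src = io.StringIO(
    "SYMBOL,SERIES,OPEN,HIGH,LOW,CLOSE,LAST\n"
    "ABC,EQ,1,2,3,4,5\n"
    "DEF,EQ,6,7,8,9,10\n"
    "XYZ,EQ,11,12,13,14,15\n"
)

result = list(filter_rows(keywords_src, data_src))
for out_row in result:
    print(",".join(out_row))
```

With real files on Python 3, the csv module documentation recommends opening them in text mode with `newline=''`, for example `open('file1.csv', newline='')`, instead of the `'rb'` mode used in the Python 2 code.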

Solution 1
1 Accepted 2017-07-31 18:34:35
