
Read two files and filter the second file based on a column of the first file

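I have an input file containing keywords, and a CSV file that needs to be filtered based on those keywords.

This is my attempt to automate the task with Python.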

import csv
with open('Input.txt', 'rb') as InputFile:
    with open('28JUL2017.csv', 'rb') as CM_File:
        read_Input=csv.reader(InputFile)
        for row1 in csv.reader(InputFile):
            #print row1

            read_CM=csv.reader(CM_File)
            next(read_CM, None)
            for row2 in csv.reader(CM_File):
                #print row2
                if row1[0] == row2[0] :

                    Output= row2[0]+","+row2[1]+","+row2[5]+","+row2[6]
                    print Output

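I am only getting results from the first row of the file to be filtered. I have tried various things but cannot work out where I am going wrong. Please point out the mistake here.

read_Input and read_CM are essentially iterators. Once you have iterated over them, you are done: you cannot iterate over them a second time. If you insist on doing it this way, you must rewind to the beginning of the file each time you want to start a new loop and "re-read" the CSV file. A fix: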

import csv
# Python 3: open in text mode with newline='' (the Python 2 original used 'rb')
with open('file1.csv', newline='') as InputFile:
    with open('file2.csv', newline='') as CM_File:
        read_Input = csv.reader(InputFile)
        for row1 in read_Input:
            CM_File.seek(0)  # rewind to the beginning of the file
            read_CM = csv.reader(CM_File)  # fresh reader after each rewind
            next(read_CM, None)  # skip the first line
            for row2 in read_CM:
                if row1[0] == row2[0]:
                    Output = row2[0]+","+row2[1]+","+row2[5]+","+row2[6]
                    print(Output)
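
To see the one-pass behavior and the effect of rewinding in isolation, here is a minimal Python 3 sketch. It uses made-up in-memory data via io.StringIO instead of the files from the question, so the contents and variable names are illustrative assumptions only.

```python
import csv
import io

# Hypothetical in-memory data standing in for a CSV file on disk.
data = io.StringIO("a,1\nb,2\n")

reader = csv.reader(data)
first_pass = list(reader)    # consumes every row
second_pass = list(reader)   # the reader is exhausted, so this is empty

data.seek(0)                 # rewind the underlying file object
third_pass = list(csv.reader(data))  # a fresh reader sees all rows again

print(first_pass)
print(second_pass)
print(third_pass)
```

The second pass yields nothing because the underlying file position is already at the end; only after seek(0) does a new reader produce rows again.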

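Instead, I would suggest iterating over rows that have already been read rather than re-reading the file. Also, rather than using nested loops, build a list of "keywords" and simply check whether row2[0] is in that list: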

import csv
# Python 3: open in text mode with newline='' (the Python 2 original used 'rb')
with open('file1.csv', newline='') as InputFile:
    with open('file2.csv', newline='') as CM_File:
        read_Input = csv.reader(InputFile)  # read file only once
        keywords = [rec[0] for rec in read_Input]
        read_CM = csv.reader(CM_File)  # read file only once
        next(read_CM, None)  # not sure why you do this? to skip first line?
        for row2 in read_CM:
            if row2[0] in keywords:
                Output = row2[0]+","+row2[1]+","+row2[5]+","+row2[6]
                print("Output: {}".format(Output))
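
As a possible variation on the keyword-list approach, the lookup can be done against a set, which makes each membership test constant time on average instead of a scan through a list. The sketch below is Python 3 and rests on assumptions: the function name filter_rows, its columns parameter, and the sample data are all hypothetical, and it skips the first line of the data file as a header, as the code above does.

```python
import csv
import io

def filter_rows(keyword_lines, data_lines, columns=(0, 1, 5, 6)):
    """Yield the selected columns of data rows whose first field is a keyword."""
    # Collect keywords into a set; "if rec" skips blank lines.
    keywords = {rec[0] for rec in csv.reader(keyword_lines) if rec}
    reader = csv.reader(data_lines)
    next(reader, None)  # skip the first line (assumed to be a header)
    for row in reader:
        if row and row[0] in keywords:
            yield [row[i] for i in columns]

# Made-up sample data; real code would pass open file objects instead.
keyword_file = io.StringIO("AAA\nCCC\n")
data_file = io.StringIO(
    "sym,c1,c2,c3,c4,c5,c6\n"
    "AAA,1,2,3,4,5,6\n"
    "BBB,7,8,9,10,11,12\n"
    "CCC,13,14,15,16,17,18\n"
)

matches = list(filter_rows(keyword_file, data_file))
for match in matches:
    print(",".join(match))
```

Each file is read exactly once, so no rewinding is needed. With real files, the two io.StringIO objects would be replaced by files opened with newline=''.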


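Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.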

 