
Python search csv file from input text file

I'm new to Python and I'm struggling with this code. I have two files: the first is a text file containing email addresses (one per line), and the second is a CSV file with 5-6 columns. The script should take each search term from file 1 and look it up in file 2, and the matching rows should be written to another CSV file (first three columns only); see the example below. I have also included the script I was working on. If there is a better or more efficient approach, please let me know. Thank you, I appreciate your help.

File1 (output.txt)
rrr@company.com
eee@company.com
ccc@company.com

File2 (final.csv)
Sam,Smith,sss@company.com,admin
Eric,Smith,eee@company.com,finance
Joe,Doe,jjj@company.com,telcom
Chase,Li,ccc@company.com,IT

output (out_name_email.csv)
Eric,Smith,eee@company.com
Chase,Li,ccc@company.com

Here is the script

import csv
outputfile = 'C:\\Python27\\scripts\\out_name_email.csv'
inputfile = 'C:\\Python27\\scripts\\output.txt'
datafile = 'C:\\Python27\\scripts\\final.csv'

names=[]

with open(inputfile) as f:
    for line in f:
        names.append(line)

with open(datafile, 'rb') as fd, open(outputfile, 'wb') as fp_out1:
    writer = csv.writer(fp_out1, delimiter=",")
    reader = csv.reader(fd, delimiter=",")
    headers = next(reader)
    for row in fd:
        for name in names:
            if name in line:
                writer.writerow(row)

Load the emails into a set for O(1) lookup:

with open(inputfile) as fin:
    emails = set(line.strip() for line in fin)
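
To illustrate why the strip() call matters, here is a small sketch. It uses io.StringIO as a stand-in for the input file (an assumption made only so the snippet runs without files on disk) and also skips blank lines, which the snippet above does not do:

```python
import io

# Stand-in for output.txt; io.StringIO is used only so the sketch runs without files.
fin = io.StringIO("rrr@company.com\neee@company.com\nccc@company.com\n")

# Lines read from a file keep their trailing newline.
raw_lines = list(fin)

# Stripping each line gives clean addresses; blank lines are skipped (an addition of this sketch).
emails = set(line.strip() for line in raw_lines if line.strip())

print("eee@company.com" in raw_lines)  # False: the stored value is "eee@company.com\n"
print("eee@company.com" in emails)     # True
```

Membership tests on a set take constant time on average, so the cost of each lookup does not grow with the number of addresses.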

Then loop over the rows once and check whether each row's email is in emails; there is no need to loop over every possible match for each row:

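# ...
for row in reader:
    # in the sample data the email address is the third column (index 2)
    if row[2] in emails:
        writer.writerow(row[:3])  # keep only the first three columns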

If you're not doing anything else, then you can make it:

writer.writerows(row[:3] for row in reader if row[2] in emails)

A couple of notes: in your original code you're not using the csv.reader object reader (you're looping over fd instead), and you appear to have some naming issues with names, line and row ...
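
Putting the pieces together, a complete version might look like the sketch below. It assumes Python 3 (the question's 'rb'/'wb' modes suggest Python 2.7, where the open() calls would differ), assumes the email is in the third column as in the sample data, and assumes the data file has no header row. The function name filter_rows and its parameters are hypothetical, introduced only for illustration:

```python
import csv


def filter_rows(input_path, data_path, output_path, email_col=2, keep_cols=3):
    """Copy rows of data_path whose email appears in input_path to output_path.

    email_col and keep_cols are hypothetical parameters; the defaults match the
    sample data in the question (email in the third column, keep three columns).
    """
    # Load the addresses to search for, one per line, ignoring blank lines.
    with open(input_path) as fin:
        emails = {line.strip() for line in fin if line.strip()}

    # newline="" is what the csv module documentation recommends in Python 3.
    with open(data_path, newline="") as fd, \
            open(output_path, "w", newline="") as fout:
        reader = csv.reader(fd)
        writer = csv.writer(fout)
        writer.writerows(
            row[:keep_cols]
            for row in reader
            # skip rows too short to contain an email column
            if len(row) > email_col and row[email_col] in emails
        )
```

It could be called as filter_rows(inputfile, datafile, outputfile) with the three paths from the question. If final.csv does start with a header row, calling next(reader) before writerows, as the original script does, would skip it.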
