
Filter CSV File Program using Pandas and Python

I currently have a task that involves downloading a CSV master file, removing any rows where Column A - Column B <= 0, and keeping only the rows where Column C equals a given phrase. I'm looking to create a program that will:

  • Import a CSV File
  • Remove all lines where Column A - Column B <= 0
  • Ask for input to filter on Column C for one or more phrases
  • Export the CSV into a new file

So far, I have determined that the best way to do this is to use Pandas' dataframe functionality, as I've used it previously to perform other operations on CSV files:

import pandas as pd

file = pd.read_csv("sourcefile.csv")
# Keep only rows where A - B is positive
file['NewColumn'] = file['A'] - file['B']
file = file[file.NewColumn > 0]
# Drop the columns that should not appear in the export
columns = ['ColumnsIWantToRemove']
file.drop(columns, inplace=True, axis=1)
phrases = input('What phrases are you filtering for? ')
# This comparison only matches a single phrase
file = file[file.C == phrases]
file.to_csv('export.csv')

My question is, how do I filter Column C for multiple phrases? I want the program to take one or more phrases and only show rows where Column C's value equals one of those values. Any guidance would be amazing. Thank you!!

I would just ask for the input to be comma-separated, then split it into a list and filter with isin:

phrases = phrases.split(",")
file = file[file.C.isin(phrases)]
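For reference, here is a minimal sketch of how the split-and-isin idea could fit together with the A - B condition. The helper name filter_rows and the sample data are made up for illustration, and the whitespace stripping is an added assumption (so that "red, blue" matches "blue" rather than " blue"):

```python
import io

import pandas as pd


def filter_rows(df, raw_phrases):
    """Keep rows where A - B > 0 and C equals one of the comma-separated phrases."""
    # Split the raw input on commas and trim whitespace around each phrase
    phrases = [p.strip() for p in raw_phrases.split(",") if p.strip()]
    # Boolean masks: positive difference, and membership in the phrase list
    positive = (df["A"] - df["B"]) > 0
    matches = df["C"].isin(phrases)
    return df[positive & matches]


# Small in-memory stand-in for sourcefile.csv (made-up data)
sample = io.StringIO("A,B,C\n5,1,red\n2,3,red\n9,4,blue\n7,2,green\n")
df = pd.read_csv(sample)

result = filter_rows(df, "red, blue")
print(result.to_csv(index=False))
```

With a real file, the DataFrame would come from pd.read_csv("sourcefile.csv"), the phrases from input(...), and the result could be written with result.to_csv('export.csv', index=False) to leave out the index column.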

Maybe this can help you:

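import csv

# Phrases that column C must NOT contain (placeholder values, add as many as needed)
excluded = {'phrase1', 'phrase2'}

# In Python 3, open CSV files in text mode with newline=''
# ('out_sourcefile.csv' is an assumed output name)
with open('sourcefile.csv', newline='') as infile, \
        open('out_sourcefile.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    for row in csv.reader(infile):
        # Assumes column C is the third field (index 2)
        if row[2] in excluded:
            continue
        writer.writerow(row)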
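Note that the loop above skips rows that match the listed phrases. If the goal is instead to keep only the matching rows, as the question describes, the same csv-module approach might be sketched like this. The function name keep_matching, the column name "C", and the sample data are assumptions for illustration:

```python
import csv
import io


def keep_matching(reader, writer, wanted, column="C"):
    """Copy rows whose `column` value is in `wanted`; return how many were written."""
    writer.writeheader()
    kept = 0
    for row in reader:
        if row[column] in wanted:
            writer.writerow(row)
            kept += 1
    return kept


# In-memory stand-ins for the input and output files (made-up data)
source = io.StringIO("A,B,C\n5,1,red\n9,4,blue\n7,2,green\n")
target = io.StringIO()

reader = csv.DictReader(source)
writer = csv.DictWriter(target, fieldnames=reader.fieldnames, lineterminator="\n")
kept = keep_matching(reader, writer, {"red", "green"})
print(target.getvalue())
```

Using DictReader lets the filter refer to the column by its header name instead of a positional index, which may be more robust if the column order changes.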


 