Filter CSV File Program using Pandas and Python
I currently have a task that involves downloading a CSV master file, removing any lines where Column A - Column B <= 0, and where Column C equals a given phrase. I'm looking to create a program that will do this.
So far, I have determined that the best way to do this is to use Pandas' DataFrame functionality, as I've used it previously to perform other operations on CSV files:
import pandas as pd

file = pd.read_csv("sourcefile.csv")
# Keep only rows where A - B is positive
file['NewColumn'] = file['A'] - file['B']
file = file[file.NewColumn > 0]
# Drop the columns that are not needed
columns = ['ColumnsIWantToRemove']
file.drop(columns, inplace=True, axis=1)
# Filter column C for the phrase entered by the user
phrases = input('What phrases are you filtering for? ')
file = file[file.C == phrases]
file.to_csv('export.csv')
My question is, how do I filter Column C for multiple phrases? I want the program to take one or more phrases and only show rows where Column C's value equals one of those values. Any guidance would be amazing. Thank you!!
I would just ask for the input to be comma separated:
phrases = phrases.split(",")
file = file[file.C.isin(phrases)]
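One thing to watch with a plain split(","): input typed as "red, blue" leaves a leading space on " blue", so isin would not match it. Here is a minimal sketch of the same idea with the phrases trimmed first. The sample data and the helper names parse_phrases and filter_rows are made up for illustration; only the column names A, B and C come from the question.

```python
import pandas as pd


def parse_phrases(raw):
    """Split a comma-separated string into trimmed, non-empty phrases."""
    return [p.strip() for p in raw.split(",") if p.strip()]


def filter_rows(df, phrases):
    """Keep rows where A - B > 0 and column C equals one of the phrases."""
    positive = (df["A"] - df["B"]) > 0
    matches = df["C"].isin(phrases)
    return df[positive & matches]


# Made-up sample data standing in for sourcefile.csv.
sample = pd.DataFrame(
    {
        "A": [5, 1, 9, 4],
        "B": [2, 3, 1, 4],
        "C": ["red", "red", "blue", "green"],
    }
)

phrases = parse_phrases("red, blue")  # note the space after the comma
result = filter_rows(sample, phrases)
print(result)
```

In the real program the string would come from input(...) and the frame from pd.read_csv("sourcefile.csv"). Combining both conditions with & also avoids creating the temporary NewColumn.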
Maybe this can help you:
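import csv

# Placeholder values: replace with the phrases to exclude and the real output path.
excluded_phrases = {"phrase one", "phrase two"}
out_sourcefile = "out_sourcefile.csv"

# Python 3: open CSV files in text mode with newline="" rather than 'rb'/'wb'.
with open("sourcefile.csv", newline="") as infile, open(out_sourcefile, "w", newline="") as outfile:
    writer = csv.writer(outfile)
    for row in csv.reader(infile):
        # Skip rows whose column C (assumed to be the third field) is an excluded phrase.
        if len(row) > 2 and row[2] in excluded_phrases:
            continue
        writer.writerow(row)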
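If the file has a header row, a variation on the same loop can look the column up by name instead of by position. This is only a sketch under that assumption: the function name filter_csv and the in-memory sample data are hypothetical, and io.StringIO stands in for real files so the example runs on its own.

```python
import csv
import io


def filter_csv(infile, outfile, column, excluded):
    """Copy rows from infile to outfile, skipping rows whose `column` value is in `excluded`.

    Returns the number of data rows written.
    """
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:
        if row[column] in excluded:
            continue
        writer.writerow(row)
        kept += 1
    return kept


# Made-up sample data; in practice these would be files opened with newline="".
source = io.StringIO("A,B,C\n5,2,red\n9,1,blue\n4,3,green\n")
target = io.StringIO()
kept = filter_csv(source, target, "C", {"blue"})
print(target.getvalue())
```

To keep only the matching rows, as the question asks, change the condition to `if row[column] not in excluded` and pass the phrases you want to keep.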