
How can I filter a CSV file based on its columns in Python?

I have a CSV file with more than 5,000,000 rows of data that looks like this (except that it is in Persian):

Contract Code,Contract Type,State,City,Property Type,Region,Usage Type,Area,Percentage,Price,Price per m2,Age,Frame Type,Contract Date,Postal Code
765720,Mobayee,East Azar,Kish,Apartment,,Residential,96,100,570000,5937.5,36,Metal,13890107,5169614658
766134,Mobayee,East Azar,Qeshm,Apartment,,Residential,144.5,100,1070000,7404.84,5,Concrete,13890108,5166884645
766140,Mobayee,East Azar,Tabriz,Apartment,,Residential,144.5,100,1050000,7266.44,5,Concrete,13890108,5166884645
766146,Mobayee,East Azar,Tabriz,Apartment,,Residential,144.5,100,700000,4844.29,5,Concrete,13890108,5166884645
766147,Mobayee,East Azar,Kish,Apartment,,Residential,144.5,100,1625000,11245.67,5,Concrete,13890108,5166884645
770822,Mobayee,East Azar,Tabriz,Apartment,,Residential,144.5,50,500000,1730.1,5,Concrete,13890114,5166884645

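I want to write code that treats the first row as the header, then extracts the rows for two specific cities (Kish and Qeshm) and saves them to a new CSV file. Something like this: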

Contract Code,Contract Type,State,City,Property Type,Region,Usage Type,Area,Percentage,Price,Price per m2,Age,Frame Type,Contract Date,Postal Code
765720,Mobayee,East Azar,Kish,Apartment,,Residential,96,100,570000,5937.5,36,Metal,13890107,5169614658
766134,Mobayee,East Azar,Qeshm,Apartment,,Residential,144.5,100,1070000,7404.84,5,Concrete,13890108,5166884645
766147,Mobayee,East Azar,Kish,Apartment,,Residential,144.5,100,1625000,11245.67,5,Concrete,13890108,5166884645

It is worth mentioning that I am new to Python. I have written the following block to define the header, but that is as far as I have gotten.

import pandas as pd

path = '/Users/Desktop/sample.csv'

df = pd.read_csv(path , header=[0])
df.head = ()

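You don't need header=... because read_csv treats the first row as the header by default. Also, df.head = () overwrites the head method instead of calling it; use df.head() if you want to preview the first rows. So: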

df = pd.read_csv(path)

Then keep only the rows that match the condition:

df2 = df[df['City'].isin(['Kish', 'Qeshm'])]
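To see the isin filter in isolation, here is a minimal sketch that runs on a trimmed copy of the sample rows from the question. The in-memory string and the reduced set of columns are assumptions made only so the example is self-contained; the real file would be read from disk as above.

```python
import io

import pandas as pd

# Trimmed sample based on the rows in the question (only three columns kept).
sample = io.StringIO(
    "Contract Code,City,Price\n"
    "765720,Kish,570000\n"
    "766134,Qeshm,1070000\n"
    "766140,Tabriz,1050000\n"
    "766147,Kish,1625000\n"
)

df = pd.read_csv(sample)  # the first row becomes the header by default

# isin() returns a boolean Series: True where City is one of the listed values.
mask = df["City"].isin(["Kish", "Qeshm"])

# Indexing with the mask keeps only the rows where it is True.
df2 = df[mask]
print(df2)
```

The Tabriz row is dropped and the original row labels of the kept rows are preserved, which is why the index matters when the result is written out.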

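You can then save it. Passing index=False keeps pandas from writing the row index as an extra first column, so the output has the same columns as the input:

df2.to_csv(another_path, index=False)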
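With more than 5,000,000 rows, loading the whole file at once may use a lot of memory. One possible alternative, sketched below, is to read the file in chunks and append the matching rows to the output as you go. The function name filter_csv_by_city, its parameters, and the chunk size of 100,000 are hypothetical choices for illustration and are not part of the original answer. Reading with dtype=str is also an assumption: it keeps every value as text so that numbers such as 96 are written back unchanged.

```python
import pandas as pd


def filter_csv_by_city(src_path, dst_path, cities, chunksize=100_000):
    """Copy rows whose City is in `cities` from src_path to dst_path.

    The source is read in chunks so the whole file never has to fit in memory.
    Returns the number of rows written.
    """
    kept = 0
    first = True
    # dtype=str keeps values as text, so nothing is reformatted on the way out.
    for chunk in pd.read_csv(src_path, chunksize=chunksize, dtype=str):
        matched = chunk[chunk["City"].isin(cities)]
        matched.to_csv(
            dst_path,
            mode="w" if first else "a",  # create the file once, then append
            header=first,                # write the header only with the first chunk
            index=False,                 # do not add the row index as a column
        )
        first = False
        kept += len(matched)
    return kept
```

A call might look like filter_csv_by_city('/Users/Desktop/sample.csv', another_path, ['Kish', 'Qeshm']), where another_path is the output location used above.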

