
Filter CSV data using Python without pandas

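I want to filter by flat type and take the average of the resale price (i.e. room5), so I am trying to filter on the resale price values of 5-room flats. However, when I try to print room5, it is an empty list. Which part am I missing?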

[![Code with printed data][1]][1]

Get the 5-room rows:

room5 = roomprice2013[ roomprice2013['flat_type'] == '5-room' ]
print(room5)

Get the sum:

room5_sum = roomprice2013[ roomprice2013['flat_type'] == '5-room' ]['price'].sum()
print(room5_sum)

Working example.

I use io.StringIO to simulate a file, so that everyone can copy and run the example without downloading a file with the data.

import numpy as np
import io

s = io.StringIO('''quarter,town,flat_type,price
2013-Q2,Bedok,1-room,na
2013-Q2,Bedok,2-room,-
2013-Q2,Bedok,3-room,172000
2013-Q2,Bedok,4-room,224500
2013-Q2,Bedok,5-room,332000
2013-Q2,Bedok,Executive,420000''')

# Placeholder prices such as 'na' and '-' cannot be parsed as integers and are replaced with 0.
data = np.genfromtxt(s, skip_header=1, dtype=[('quarter','U10'), ('town','U20'), ('flat_type','U10'), ('price','i8')], delimiter=',', missing_values=['na','-'], filling_values=0)

# Keep only the rows from 2013; np.isin takes the array and the values to test against.
data_2013 = data[np.isin(data['quarter'], ['2013-Q1','2013-Q2','2013-Q3','2013-Q4'])]
print(data_2013)

roomprice2013 = data_2013[['flat_type','price']]
print(roomprice2013)

room5 = roomprice2013[roomprice2013['flat_type'] == '5-room']
print(room5)

room5_sum = roomprice2013[roomprice2013['flat_type'] == '5-room']['price'].sum()
print(room5_sum)
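
Since the question asks for a solution without pandas, the same filter can also be sketched with only the standard library. This is an alternative to the NumPy approach above, not a replacement for it. The helper `to_price` is a hypothetical name introduced here, and the sketch assumes that any non-numeric price (such as `na` or `-`) should be skipped.

```python
import csv
import io

# Same sample data as in the working example.
s = io.StringIO('''quarter,town,flat_type,price
2013-Q2,Bedok,1-room,na
2013-Q2,Bedok,2-room,-
2013-Q2,Bedok,3-room,172000
2013-Q2,Bedok,4-room,224500
2013-Q2,Bedok,5-room,332000
2013-Q2,Bedok,Executive,420000''')

def to_price(text):
    """Hypothetical helper: return the price as int, or None for 'na' / '-'."""
    return int(text) if text.isdigit() else None

# Keep only the 5-room rows from 2013 that have a usable price.
room5_prices = [
    to_price(row['price'])
    for row in csv.DictReader(s)
    if row['quarter'].startswith('2013')
    and row['flat_type'] == '5-room'
    and to_price(row['price']) is not None
]

room5_sum = sum(room5_prices)
print(room5_prices)  # [332000]
print(room5_sum)     # 332000
```

csv.DictReader uses the header row as keys, so every value arrives as a string and has to be converted explicitly.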

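EDIT: The inner expression roomprice2013['flat_type'] == '5-room' only produces an array of True/False values, which you can use (even multiple times) to keep only the rows you need.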

mask = (roomprice2013['flat_type'] == '5-room')  # it works without () but it is more readable with ()
print(mask) 
# [False False False False  True False]

room5 = roomprice2013[mask]
print(room5)

room5_sum = roomprice2013[mask]['price'].sum()
print(room5_sum)
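
Because the mask is just a boolean array, two masks can be combined before indexing. A minimal sketch of computing the average price the question asks about, assuming that a price of 0 marks a missing value (as with the zero fill above). The rows below are hypothetical sample data, not taken from the original file.

```python
import numpy as np

# Hypothetical sample rows; a price of 0 stands for a missing value.
roomprice = np.array(
    [('4-room', 224500),
     ('5-room', 332000),
     ('5-room', 0),
     ('5-room', 350000)],
    dtype=[('flat_type', 'U10'), ('price', 'i8')],
)

is_room5 = (roomprice['flat_type'] == '5-room')
has_price = (roomprice['price'] > 0)

# Combine the two boolean masks element-wise with &.
mask = is_room5 & has_price

room5_mean = roomprice[mask]['price'].mean()
print(room5_mean)  # 341000.0
```

Without the has_price mask, the zero-filled row would be included and pull the average down.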

