Working with data from CSV with Python without using Pandas
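I want to filter by flat type and the average resale price (i.e. room5). I am trying to filter on the resale price values of 5-room flats. However, when I try to print room5, it is an empty list. Which part am I missing?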
[Screenshot: code with the printed data]
Get the 5-room rows:
room5 = roomprice2013[ roomprice2013['flat_type'] == '5-room' ]
print(room5)
Get the sum:
room5_sum = roomprice2013[ roomprice2013['flat_type'] == '5-room' ]['price'].sum()
print(room5_sum)
Working example.
I use `io.StringIO` to simulate a file, so that anyone can copy and run the example without downloading a file with the data.
import numpy as np
import io
s = io.StringIO('''quarter,town,flat_type,price
2013-Q2,Bedok,1-room,na
2013-Q2,Bedok,2-room,-
2013-Q2,Bedok,3-room,172000
2013-Q2,Bedok,4-room,224500
2013-Q2,Bedok,5-room,332000
2013-Q2,Bedok,Executive,420000''')
# Prices that cannot be parsed as integers ('na', '-') are filled with 0
data = np.genfromtxt(s, skip_header=1, delimiter=',',
                     dtype=[('quarter', 'U10'), ('town', 'U20'), ('flat_type', 'U10'), ('price', 'i8')],
                     missing_values=['na', '-'], filling_values=0)
# Keep only the rows whose quarter falls in 2013
data_2013 = data[np.isin(data['quarter'], ['2013-Q1', '2013-Q2', '2013-Q3', '2013-Q4'])]
print(data_2013)
roomprice2013 = data_2013[['flat_type','price']]
print(roomprice2013)
room5 = roomprice2013[roomprice2013['flat_type'] == '5-room']
print(room5)
room5_sum = roomprice2013[roomprice2013['flat_type'] == '5-room']['price'].sum()
print(room5_sum)
EDIT: The inner expression `roomprice2013['flat_type'] == '5-room'` only gives an array of True/False values, which you can use (even several times) to keep only the rows you need.
mask = (roomprice2013['flat_type'] == '5-room') # it works without () but it is more readable with ()
print(mask)
# [False False False False True False]
room5 = roomprice2013[mask]
print(room5)
room5_sum = roomprice2013[mask]['price'].sum()
print(room5_sum)