Group by in Pandas to find Median

I have a property dataset. I would like to know the median price according to several attributes: suburb, rooms, bathrooms, type, car, and age. Then I want to add a new boolean column that states whether each property is overpriced.

Sample of my dataframe (the original dataframe has 180 suburbs):

import pandas as pd

house = pd.DataFrame({'subrub': ['BALWYN NORTH', 'ARMADALE', 'ARMADALE', 'PASCOE VALE'],
                      'price': [1350000.0, 800000.0, 1250000.0, 680000.0],
                      'rooms': [3, 4, 7, 2],
                      'bathroom': [1.0, 2.0, 4.0, 1.0],
                      'type': ['h', 't', 't', 'u'],
                      'car': ['2.0', '1.0', '4.0', '1.0'],
                      'age': [59.0, 69.0, 12.0, 14.0]})

So far I have grouped by suburb. I know I can use median to find the median, but I am not sure how to bring in the other attributes. Any tip would be helpful. Thank you.

Like this:

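import numpy as np

def over_price(elements):
    # Flag each price that is above the median price of its own group
    median = np.median(elements)
    return elements > median

# Group on every attribute, then compare each price with its group's median
house["OverPrice"] = house.groupby(["subrub","rooms","bathroom","type","car","age"])["price"].transform(over_price)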
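
If you also want to see the median itself, a variation is to let transform compute the group median and do the comparison as a separate step. This is only a sketch built on the sample data from the question: the column names MedianPrice and OverPriceBySuburbType and the coarser key list are made up for illustration. Note that in the four-row sample every full-attribute group holds a single row, so nothing can be above its own group median there; the coarser grouping is included only to show a flag that actually fires.

```python
import pandas as pd

# Sample data copied from the question (column names kept as posted)
house = pd.DataFrame({
    "subrub": ["BALWYN NORTH", "ARMADALE", "ARMADALE", "PASCOE VALE"],
    "price": [1350000.0, 800000.0, 1250000.0, 680000.0],
    "rooms": [3, 4, 7, 2],
    "bathroom": [1.0, 2.0, 4.0, 1.0],
    "type": ["h", "t", "t", "u"],
    "car": ["2.0", "1.0", "4.0", "1.0"],
    "age": [59.0, 69.0, 12.0, 14.0],
})

# Median price of each group, broadcast back onto every row of that group
keys = ["subrub", "rooms", "bathroom", "type", "car", "age"]
house["MedianPrice"] = house.groupby(keys)["price"].transform("median")
house["OverPrice"] = house["price"] > house["MedianPrice"]

# Hypothetical coarser grouping (suburb and type only), for illustration
coarse_median = house.groupby(["subrub", "type"])["price"].transform("median")
house["OverPriceBySuburbType"] = house["price"] > coarse_median

print(house[["subrub", "price", "MedianPrice", "OverPrice", "OverPriceBySuburbType"]])
```

On the real data with 180 suburbs the full key list should produce groups with more than one row, but very fine-grained keys such as age may still leave many single-row groups, which can never be flagged as overpriced.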
