Pandas dataframe: Group by two columns and then average over another column
I have a dataframe that looks like the example below:
year x y
2016 o 227
2018 o 214
2016 o 56
2018 o 62
2018 o 87
2019 o 40
2017 r 15
2016 i 14
2016 o 88
2014 o 48
I would like an output in which the mean of y is computed with a groupby on year, and then further on x. Like this:
year x y
2016 o (227 + 56 + 88)/3 = 123.66 = 124 (Need just the final value)
2018 o (214 + 62 + 87)/3 = 121 (Need just the final value)
2019 o 40
2017 r 15
2016 i 14
2014 o 48
I think I found a way to do it (though I may be wrong), but the result does not come out as a dataframe:
print(part_b[['year', 'x', 'y']].groupby(['year', 'x']).mean())
Output produced (the results below come from my full data):
                  y
year x
2014 o    48.000000
2016 i    14.000000
     o   117.000000
2017 o    71.000000
     r    27.500000
2018 i    23.000000
     o    97.428571
2019 i    11.000000
     o   115.500000
Whereas I would like to have this:
```
year x y
2014 o 48
2016 i 14
2016 o 117
2017 o 71
2017 r 28
2018 i 23
2018 o 97
2019 i 11
2019 o 116
```
Given this:
year category amount
0 2015 A 200
1 2015 B 1000
2 2015 A 300
3 2016 C 1200
4 2016 A 800
5 2016 A 2500
6 2016 B 100
Doing this:
df.groupby(['year','category'])['amount'].mean()
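will give you a Series indexed by both grouping columns (note that the mean of an integer column is returned as float):
year  category
2015  A            250.0
      B           1000.0
2016  A           1650.0
      B            100.0
      C           1200.0
Name: amount, dtype: float64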
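For reference, the example can be reproduced with the minimal sketch below. The frame is rebuilt by hand from the table above, and the name `means` is introduced here only for illustration. The checks simply confirm that the result is a Series carrying a two-level (year, category) index rather than a DataFrame, which is why it prints differently from what the question asks for.

```python
import pandas as pd

# Sample data copied from the table above
df = pd.DataFrame({
    "year": [2015, 2015, 2015, 2016, 2016, 2016, 2016],
    "category": ["A", "B", "A", "C", "A", "A", "B"],
    "amount": [200, 1000, 300, 1200, 800, 2500, 100],
})

# Selecting a single column after groupby yields a Series
means = df.groupby(["year", "category"])["amount"].mean()

print(type(means).__name__)        # Series
print(list(means.index.names))     # ['year', 'category']
print(means.loc[(2016, "A")])      # 1650.0
```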
To get what you need, simply call reset_index() on the result, which turns the group keys back into ordinary columns:
df.groupby(['year','category'])['amount'].mean().reset_index()
year category amount
0 2015 A 250.0
1 2015 B 1000.0
2 2016 A 1650.0
3 2016 B 100.0
4 2016 C 1200.0
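Applied to the sample from the question, the same idea might look like the sketch below. The name `part_b` comes from the question; `result` is introduced here for illustration. The rounding and integer cast are an assumption based on the whole-number output the question asks for, and `as_index=False` is used in place of `.reset_index()`; both give a flat DataFrame. Note that pandas rounds exact halves to the nearest even integer.

```python
import pandas as pd

# Sample rows copied from the question
part_b = pd.DataFrame({
    "year": [2016, 2018, 2016, 2018, 2018, 2019, 2017, 2016, 2016, 2014],
    "x": ["o", "o", "o", "o", "o", "o", "r", "i", "o", "o"],
    "y": [227, 214, 56, 62, 87, 40, 15, 14, 88, 48],
})

result = (
    part_b.groupby(["year", "x"], as_index=False)["y"]  # keep year and x as columns
    .mean()                                             # average y within each (year, x) group
    .round({"y": 0})                                    # assumption: whole numbers are wanted
    .astype({"y": int})                                 # drop the trailing .0
)

print(result)
```

With this sample, the (2016, o) group averages 227, 56 and 88 and comes out as 124, and the (2018, o) group comes out as 121, matching the values worked out in the question.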