pandas: How to add rows with average grouped columns
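I have a DataFrame like the one below.
I want to add one row per fruit, in which price is set to the mean price of that fruit's existing rows, resource is always 'all', and ftype is always 'avg'. I know how to generate a new column showing the average price per fruit, but I don't know how to add a row holding that average.
Can you help me?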
import numpy as np
import pandas as pd
fruit = ['apple','apple','banana','banana','kiwi','kiwi','grape','grape']
ftype = ['one','two','one','two','three','one','one','two']
resource = ['us','us','us','us','us','us','us','us']
price = [100,150,200,300,120,300,400,500]
df = pd.DataFrame({'fruit':fruit,'ftype':ftype,'resource':resource,'price':price})
print(df)
Original DataFrame:
fruit ftype price resource
0 apple one 100 us
1 apple two 150 us
2 banana one 200 us
3 banana two 300 us
4 kiwi three 120 us
5 kiwi one 300 us
6 grape one 400 us
7 grape two 500 us
What I want to produce:
fruit ftype price resource
0 apple one 100 us
1 apple two 150 us
apple avg 125 all
2 banana one 200 us
3 banana two 300 us
banana avg 250 all
4 kiwi three 120 us
5 kiwi one 300 us
kiwi avg 210 all
6 grape one 400 us
7 grape two 500 us
grape avg 450 all
You can aggregate price with mean per fruit and add the new constant columns with DataFrame.assign:
df1 = df.groupby('fruit', as_index=False)['price'].mean().assign(resource='all',ftype='avg')
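Then use concat and sort_values. Sorting resource in descending order places each fruit's 'us' rows before its 'all' row; note that the fruits themselves end up in alphabetical order: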
df = (pd.concat([df, df1], sort=True)
.sort_values(['fruit','resource'], ascending=[True, False])
.reset_index(drop=True))
print (df)
fruit ftype price resource
0 apple one 100 us
1 apple two 150 us
2 apple avg 125 all
3 banana one 200 us
4 banana two 300 us
5 banana avg 250 all
6 grape one 400 us
7 grape two 500 us
8 grape avg 450 all
9 kiwi three 120 us
10 kiwi one 300 us
11 kiwi avg 210 all
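
The output above lists the fruits alphabetically (grape before kiwi), whereas the desired output in the question keeps them in their original order. The following is a minimal sketch of one way to keep that order, assuming the same column names as the question. The helper name add_group_average_rows and the temporary _order column are made up for this illustration and are not part of pandas.

```python
import pandas as pd

def add_group_average_rows(df):
    """Append one 'avg' row per fruit, keeping fruits in first-appearance order.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Mean price per fruit; sort=False keeps groups in first-appearance order.
    avg = (df.groupby('fruit', as_index=False, sort=False)['price']
             .mean()
             .assign(ftype='avg', resource='all'))

    # Original rows first, average rows last.
    out = pd.concat([df, avg], ignore_index=True)

    # Integer rank of each fruit by first appearance in the original frame.
    order = pd.Categorical(out['fruit'],
                           categories=df['fruit'].unique(),
                           ordered=True)

    # A stable sort keeps each fruit's original rows ahead of its average row.
    return (out.assign(_order=order.codes)
               .sort_values('_order', kind='mergesort')
               .drop(columns='_order')
               .reset_index(drop=True))

df = pd.DataFrame({
    'fruit': ['apple', 'apple', 'banana', 'banana', 'kiwi', 'kiwi', 'grape', 'grape'],
    'ftype': ['one', 'two', 'one', 'two', 'three', 'one', 'one', 'two'],
    'resource': ['us'] * 8,
    'price': [100, 150, 200, 300, 120, 300, 400, 500],
})

result = add_group_average_rows(df)
print(result)
```

Because the average rows are concatenated after the original rows and the sort is stable, no secondary sort key on resource is needed here. The price column becomes float after concatenation, since the means are floats.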