How to find percentage of total in groupby in pandas
I have the following dataframe in pandas:
Date tank hose quantity count set flow
01-01-2018 1 1 20 100 211 12.32
01-01-2018 1 2 20 200 111 22.32
01-01-2018 1 3 20 200 123 42.32
02-01-2018 1 1 10 100 211 12.32
02-01-2018 1 2 10 200 111 22.32
02-01-2018 1 3 10 200 123 42.32
I want to calculate the percentage of quantity and count within each group, grouping by Date and tank.
My desired dataframe:
Date tank hose quantity count set flow perc_quant perc_count
01-01-2018 1 1 20 100 211 12.32 33.33 20
01-01-2018 1 2 20 200 111 22.32 33.33 40
01-01-2018 1 3 20 200 123 42.32 33.33 40
02-01-2018 1 1 10 100 211 12.32 25 20
02-01-2018 1 2 20 200 111 22.32 50 40
02-01-2018 1 3 10 200 123 42.32 25 40
I am doing the following to achieve this:
test = df.groupby(['Date','tank']).apply(lambda x:
100 * x / float(x.sum()))
Use GroupBy.transform with a lambda function, then add_prefix and join the result to the original:
f = lambda x: 100 * x / float(x.sum())
# Select the columns with a list (double brackets); tuple-style selection fails on recent pandas
df = df.join(df.groupby(['Date','tank'])[['quantity','count']].transform(f).add_prefix('perc_'))
Or specify the new column names:
df[['perc_quantity','perc_count']] = (df.groupby(['Date','tank'])[['quantity','count']]
                                        .transform(f))
print (df)
Date tank hose quantity count set flow perc_quantity \
0 01-01-2018 1 1 20 100 211 12.32 33.333333
1 01-01-2018 1 2 20 200 111 22.32 33.333333
2 01-01-2018 1 3 20 200 123 42.32 33.333333
3 02-01-2018 1 1 10 100 211 12.32 33.333333
4 02-01-2018 1 2 10 200 111 22.32 33.333333
5 02-01-2018 1 3 10 200 123 42.32 33.333333
perc_count
0 20.0
1 40.0
2 40.0
3 20.0
4 40.0
5 40.0
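For reference, here is a self-contained sketch of the same idea that can be run as-is. It rebuilds the sample data from the question (only the columns needed here) and, as a variation on the lambda above, uses transform("sum") to get the per-group totals and then divides. The names cols, totals, perc and result are just illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question (the set and flow columns are omitted).
df = pd.DataFrame({
    "Date": ["01-01-2018"] * 3 + ["02-01-2018"] * 3,
    "tank": [1] * 6,
    "hose": [1, 2, 3, 1, 2, 3],
    "quantity": [20, 20, 20, 10, 10, 10],
    "count": [100, 200, 200, 100, 200, 200],
})

cols = ["quantity", "count"]

# Per-group totals, broadcast back onto the original row index.
totals = df.groupby(["Date", "tank"])[cols].transform("sum")

# Each row's share of its group total, in percent.
perc = (100 * df[cols] / totals).add_prefix("perc_")

result = df.join(perc)
print(result.round(2))
```

Because transform returns a frame aligned with the original index, the division and the join line up row by row without any merge step.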