Group by multiple columns and pivot and count values from other column in pandas
I have a dataframe:
city  skills  priority  acknowledge  id_count  acknowledge_count
ABC   XXX     High      Yes          11        2
ABC   XXX     High      No           10        3
ABC   XXX     Med       Yes          5         1
ABC   YYY     Low       No           1         5
I want to group by city and skills and get total_id_count from the id_count column, split into three separate columns by priority (High, Med, Low). Similarly, for total_acknowledge_count, I want the sum of acknowledge_count split by acknowledge (Yes, No).
Output required:
              total_id_count     total_acknowledge_count
city,skills   High  Med  Low     Yes  No
ABC,XXX       21    5    0       3    3     # 21 = 11 + 10, 3 = 2 + 1
ABC,YYY       0     0    1       0    5
I have tried different methods such as pivot_table and groupby with stack, but it seems very difficult.
Is there any way to achieve this result?
You'll need to pivot separately for total_id_count and total_acknowledge_count here, since you have two separate column / value schemes for the aggregation:
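import pandas as pd

# Pivot id_count by priority, summing within each (city, skills) group
piv1 = df.pivot_table(index=['city', 'skills'], columns='priority',
                      values='id_count', aggfunc='sum', fill_value=0)
# Pivot acknowledge_count by acknowledge value in the same way
piv2 = df.pivot_table(index=['city', 'skills'], columns='acknowledge',
                      values='acknowledge_count', aggfunc='sum', fill_value=0)

# Add a top column level so the two blocks stay distinguishable after concat
piv1.columns = pd.MultiIndex.from_product([['id_count'], piv1.columns])
piv2.columns = pd.MultiIndex.from_product([['acknowledge_count'], piv2.columns])

output = pd.concat([piv1, piv2], axis=1)
print(output)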
            id_count         acknowledge_count
                High Low Med                No Yes
city skills
ABC  XXX          21   0   5                 3   3
     YYY           0   1   0                 5   0
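Note that the pivoted columns come out in alphabetical order (High, Low, Med and No, Yes) rather than the order asked for in the question. Below is a minimal end-to-end sketch of one way to get the requested layout, assuming the sample data from the question. The reindex calls and the top-level names total_id_count and total_acknowledge_count are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "city": ["ABC", "ABC", "ABC", "ABC"],
    "skills": ["XXX", "XXX", "XXX", "YYY"],
    "priority": ["High", "High", "Med", "Low"],
    "acknowledge": ["Yes", "No", "Yes", "No"],
    "id_count": [11, 10, 5, 1],
    "acknowledge_count": [2, 3, 1, 5],
})

keys = ["city", "skills"]

# One pivot per column/value scheme, as in the answer above
piv1 = df.pivot_table(index=keys, columns="priority",
                      values="id_count", aggfunc="sum", fill_value=0)
piv2 = df.pivot_table(index=keys, columns="acknowledge",
                      values="acknowledge_count", aggfunc="sum", fill_value=0)

# Assumption: force the column order requested in the question;
# fill_value=0 also covers a category that is absent from the data
piv1 = piv1.reindex(columns=["High", "Med", "Low"], fill_value=0)
piv2 = piv2.reindex(columns=["Yes", "No"], fill_value=0)

# Passing a dict to concat uses its keys as the top column level
output = pd.concat(
    {"total_id_count": piv1, "total_acknowledge_count": piv2}, axis=1
)
print(output)
```

Passing a dict to pd.concat is just a shorter alternative to assigning a MultiIndex to each pivot by hand; both give a two-level column header.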