Group by and count the values of other columns in pandas
I have a pandas DataFrame:
age gender criticality acknowledged
10 Male High Yes
10 Male High Yes
10 Male High Yes
10 Male Low Yes
11 Female Medium No
I want to group by age and gender, then turn the values of 'criticality' and 'acknowledged' into new columns holding their counts.
For example, the output I want is:
criticality acknowledged
age gender High Medium Low Yes No
10 Male 3 0 1 4 0
11 Female 0 1 0 0 1
I thought of using df.groupby(['age','gender'])['criticality','acknowledged'].stack()
But it's not working.
Is there a better way to get results in this format?
Since you are counting the two columns separately, a concat would be an easy solution:
In [13]: pd.concat([df.pivot_table(index=['age', 'gender'], columns=col, aggfunc=len)
    ...:            for col in ['criticality', 'acknowledged']], axis=1).fillna(0)
Out[13]:
acknowledged criticality
criticality High Low Medium No Yes
age gender
10 Male 3.0 1.0 0.0 0.0 4.0
11 Female 0.0 0.0 1.0 1.0 0.0
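In the output above, the top header level shows the column that was counted rather than the column whose values were spread out, and fillna(0) leaves the counts as floats. If that matters, the following is a minimal sketch of a variant using pd.crosstab, with the dict keys passed to pd.concat serving as the top header level. It is not the answerer's code; the DataFrame is rebuilt from the sample in the question, and the names `cols` and `counts` are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "age": [10, 10, 10, 10, 11],
    "gender": ["Male", "Male", "Male", "Male", "Female"],
    "criticality": ["High", "High", "High", "Low", "Medium"],
    "acknowledged": ["Yes", "Yes", "Yes", "Yes", "No"],
})

cols = ["criticality", "acknowledged"]

# One crosstab per column: rows are (age, gender), columns are that column's values.
# crosstab fills combinations that never occur with 0, so the counts stay integers.
# Passing a dict to concat makes each key the top level of the column MultiIndex.
counts = pd.concat(
    {col: pd.crosstab([df["age"], df["gender"]], df[col]) for col in cols},
    axis=1,
)
print(counts)
```

Each count can then be looked up with a (row, column) pair of tuples, for example counts.loc[(10, "Male"), ("criticality", "High")].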
Another way is to use get_dummies() with groupby() after assign(), and finally split the column names with expand=True to get a MultiIndex:
l=['criticality','acknowledged']
final=df[['age','gender']].assign(**pd.get_dummies(df[l])).groupby(['age','gender']).sum()
final.columns=final.columns.str.split('_',expand=True)
print(final)
criticality acknowledged
High Low Medium No Yes
age gender
10 Male 3 1 0 0 4
11 Female 0 0 1 1 0
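One caveat with splitting on '_': if a column name or value itself contains an underscore, the split yields extra pieces. A hedged variant of the same idea is sketched below. It assumes the separator "|" never occurs in the data (an arbitrary choice for illustration), requests integer dummies explicitly, and splits each name only once. The sample DataFrame is rebuilt from the question.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "age": [10, 10, 10, 10, 11],
    "gender": ["Male", "Male", "Male", "Male", "Female"],
    "criticality": ["High", "High", "High", "Low", "Medium"],
    "acknowledged": ["Yes", "Yes", "Yes", "Yes", "No"],
})

cols = ["criticality", "acknowledged"]

# "|" is an assumed separator that must not appear in column names or values.
# dtype=int makes the indicator columns integers regardless of pandas version.
dummies = pd.get_dummies(df[cols], prefix_sep="|", dtype=int)

# Attach the grouping keys, then sum the 0/1 indicators within each group.
final = (
    pd.concat([df[["age", "gender"]], dummies], axis=1)
    .groupby(["age", "gender"])
    .sum()
)

# Split each "column|value" name at the first separator only.
final.columns = pd.MultiIndex.from_tuples(
    [tuple(name.split("|", 1)) for name in final.columns]
)
print(final)
```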