Group by multiple columns to find the unique count of one column using Python pandas
I have a DataFrame like this:
column1 column2 column3
ram tall good
rohan short fine
ajay tall best
alia tall good
aman medium fine
john short good
jack short fine
Now I need output like this:
Unique count of good in tall, short, medium, based on column1:
tall=2 , short=1 , medium=0
Unique count of fine in tall, short, medium, based on column1:
tall=0 , short=2 , medium=1
Unique count of best in tall, short, medium, based on column1:
tall=1 , short=0 , medium=0
I am a beginner in pandas. Thanks in advance.
Let's try pd.crosstab:
pd.crosstab(df['column3'], df['column2'])
column2 medium short tall
column3
best 0 0 1
fine 1 2 0
good 0 1 2
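For anyone who wants to run this end to end, here is a minimal, self-contained sketch. The DataFrame is rebuilt from the rows shown in the question; the variable name `table` is only an illustrative choice.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "column1": ["ram", "rohan", "ajay", "alia", "aman", "john", "jack"],
    "column2": ["tall", "short", "tall", "tall", "medium", "short", "short"],
    "column3": ["good", "fine", "best", "good", "fine", "good", "fine"],
})

# Rows are the column3 values, columns are the column2 values,
# and each cell is the number of rows with that combination.
table = pd.crosstab(df["column3"], df["column2"])
print(table)
```

Combinations that never occur (for example best/medium) show up as 0 rather than being dropped.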
Use value_counts + unstack:
res = df[['column3', 'column2']].value_counts().unstack('column2', fill_value=0)
print(res)
Output:
column2 medium short tall
column3
best 0 0 1
fine 1 2 0
good 0 1 2
As an alternative, use groupby + unstack:
res = df.groupby(['column3', 'column2']).count().unstack('column2', fill_value=0)
print(res)
Output (groupby):
column1
column2 medium short tall
column3
best 0 0 1
fine 1 2 0
good 0 1 2
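Note that count() counts every non-null column1 value, so a name that appears twice in the same group would be counted twice. If a strictly unique count is what you need, a possible variant (a sketch, not the only way) is to select column1 and use nunique(). Selecting the single column also avoids the extra column1 level in the header. The DataFrame below is rebuilt from the question's rows.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "column1": ["ram", "rohan", "ajay", "alia", "aman", "john", "jack"],
    "column2": ["tall", "short", "tall", "tall", "medium", "short", "short"],
    "column3": ["good", "fine", "best", "good", "fine", "good", "fine"],
})

# nunique() counts distinct column1 values per (column3, column2) group;
# unstack() then moves column2 into the columns, filling gaps with 0.
res = (
    df.groupby(["column3", "column2"])["column1"]
      .nunique()
      .unstack("column2", fill_value=0)
)
print(res)
```

With the sample data every name is unique, so the numbers match the count() result above.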
The idea behind both approaches is to build a (column3, column2) index of counts and then unstack column2 into columns. If you want to match the order specified in your question, convert column2 to a Categorical first:
df['column2'] = pd.Categorical(df['column2'], categories=['tall', 'short', 'medium'], ordered=True)
res = df[['column3', 'column2']].value_counts().unstack('column2', fill_value=0)
print(res)
Output:
column2 tall short medium
column3
best 1 0 0
fine 0 2 1
good 2 1 0
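If you want the result printed in the "tall=2 , short=1 , medium=0" form from the question, one possible sketch is to loop over the rows of the count table. The names `order` and `lines` are illustrative, and the column order is fixed here with reindex instead of a Categorical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "column1": ["ram", "rohan", "ajay", "alia", "aman", "john", "jack"],
    "column2": ["tall", "short", "tall", "tall", "medium", "short", "short"],
    "column3": ["good", "fine", "best", "good", "fine", "good", "fine"],
})

order = ["tall", "short", "medium"]  # column order used in the question

# Count table with the columns arranged in the requested order.
res = pd.crosstab(df["column3"], df["column2"]).reindex(columns=order, fill_value=0)

# Build one "label: tall=.. , short=.. , medium=.." line per column3 value.
lines = []
for label in ["good", "fine", "best"]:
    counts = " , ".join(f"{col}={res.loc[label, col]}" for col in order)
    lines.append(f"{label}: {counts}")

print("\n".join(lines))
```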