Count the occurrence of unique values based on two columns

I'm trying to count the number of times each number (Knumber) occurs within each category (category). My sample data is below.

Knumber  category
K9       red
K1       white
K1       white
K9       white
K6       blue

I'm attempting to turn it into the following using pandas.

Knumber  category  count
K9       red       1
K1       white     2
K9       white     1
K6       blue      1

I've fiddled around with value_counts using df['Knumber'].value_counts(), but obviously that only counts Knumbers. Can you please help me bring my other column, 'category', into the equation?

Use pandas groupby with the size aggregation to get the count. Passing a keyword argument to agg (named aggregation) lets you name the aggregated column.

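# Group by both columns, count the rows in each group, and name the result "count".
(df
 .groupby(['Knumber', 'category'])
 .agg(count=('category', 'size'))
 .reset_index()  # turn the group keys back into ordinary columns
)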


  Knumber category  count
0      K1    white      2
1      K6     blue      1
2      K9      red      1
3      K9    white      1
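
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and runs the named aggregation from the answer. The variable names counts and counts_alt are only illustrative, and the second form (size() followed by reset_index(name=...)) is an alternative I believe gives the same table, not something the answer itself proposed.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Knumber": ["K9", "K1", "K1", "K9", "K6"],
    "category": ["red", "white", "white", "white", "blue"],
})

# Named aggregation, as in the answer: count the rows in each
# (Knumber, category) group and call the resulting column "count".
counts = (
    df.groupby(["Knumber", "category"])
      .agg(count=("category", "size"))
      .reset_index()
)

# Alternative (an assumption, not from the answer): size() returns a Series
# of group sizes, and reset_index(name="count") names that column while
# turning the group keys back into ordinary columns.
counts_alt = (
    df.groupby(["Knumber", "category"])
      .size()
      .reset_index(name="count")
)

print(counts)
```

Note that groupby sorts the group keys by default, so the rows come back in key order rather than in the order they first appear in the data; pass sort=False to groupby if the original order matters.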
