How to get the highest value row after grouping two columns and getting value counts in Pandas Dataframe?
I am using the following line of code to group by two columns:
df.groupby('topic')['category'].value_counts()
I get the following output:
topic   category
topic1  Entertainment    1303
        Science           462
        Sports            351
        Economy           270
        Business          161
        Technology         92
        Education          40
        Politics           18
        Environment         5
topic2  Politics          134
        Economy           133
        Entertainment     110
        Sports             69
        Business           68
        Science            45
        Technology         22
        Education           7
        Environment         2
topic3  Entertainment    1370
        Sports            533
        Economy           485
        Science           335
        Business          207
        Politics          180
        Education         108
        Technology         97
        Environment        12
I want to get the top row for each topic (that is, its most frequent category), like this:
topic   category
topic1  Entertainment    1303
topic2  Politics          134
topic3  Entertainment    1370
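In pandas, value_counts sorts its result by count in descending order by default, so all you need to do is take the top entry from each group and return it. This is easily done by applying a function: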
def top_value_count(x):
    # value_counts() is sorted descending, so head(1) is the most frequent value
    return x.value_counts().head(1)

df.groupby('topic')['category'].apply(top_value_count)
Change 1 to another number to return more values per topic.
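For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample rows are made up for illustration (only the column names topic and category come from the question), and the n parameter and the name top_value_counts are my own additions so the number of rows per topic can be passed in rather than hard-coded. It also shows a possible alternative that starts from the grouped counts the question already computes, assuming the default sort=True so each topic's rows are already in descending order.

```python
import pandas as pd

# Hypothetical sample data: column names mirror the question,
# the rows themselves are invented for this sketch.
df = pd.DataFrame({
    "topic": ["topic1", "topic1", "topic1",
              "topic2", "topic2", "topic2", "topic2"],
    "category": ["Sports", "Science", "Sports",
                 "Politics", "Economy", "Politics", "Politics"],
})


def top_value_counts(x, n=1):
    # value_counts() sorts by count in descending order by default,
    # so the first n entries are the n most frequent values.
    return x.value_counts().head(n)


# Extra keyword arguments to apply() are forwarded to the function.
top = df.groupby("topic")["category"].apply(top_value_counts, n=1)
print(top)

# Alternative sketch: reuse the counts from the question and keep the
# first row of each topic (level 0 of the resulting MultiIndex).
counts = df.groupby("topic")["category"].value_counts()
top_alt = counts.groupby(level=0).head(1)
print(top_alt)
```

Both results are Series indexed by (topic, category). Passing n=2 to the first version, or calling head(2) in the second, returns the two most frequent categories per topic.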