How to get the highest value row after grouping two columns and getting value counts in Pandas DataFrame?
I use the following line of code to group by two columns:
df.groupby('topic')['category'].value_counts()
I get the following output:
topic   category
topic1  Entertainment    1303
        Science           462
        Sports            351
        Economy           270
        Business          161
        Technology         92
        Education          40
        Politics           18
        Environment         5
topic2  Politics          134
        Economy           133
        Entertainment     110
        Sports             69
        Business           68
        Science            45
        Technology         22
        Education           7
        Environment         2
topic3  Entertainment    1370
        Sports            533
        Economy           485
        Science           335
        Business          207
        Politics          180
        Education         108
        Technology         97
        Environment        12
I want to get only the top row for each topic (that is, its most frequent category), like this:
topic   category
topic1  Entertainment    1303
topic2  Politics          134
topic3  Entertainment    1370
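In pandas, value_counts sorts the counts in descending order, so all you need to do is take the top value from each group and return it. This is easily done by applying a function: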
def top_value_count(x):
    # value_counts() returns counts in descending order; head(1) keeps the most frequent one
    return x.value_counts().head(1)

df.groupby('topic')['category'].apply(top_value_count)
Change the 1 to another number to return more values per topic.
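To try this out, here is a minimal self-contained sketch. The toy DataFrame is made up for illustration, since the question does not show the original data, and the `n` parameter is a hypothetical addition that makes the "change the 1" step explicit. The last two lines show a different route that is not part of the answer above: take the grouped value counts from the question and keep the first row of each topic with `groupby(level=...).head(1)`.

```python
import pandas as pd

# Hypothetical toy data; the real DataFrame from the question is not shown.
df = pd.DataFrame({
    "topic": ["topic1", "topic1", "topic1",
              "topic2", "topic2", "topic2", "topic2"],
    "category": ["Entertainment", "Entertainment", "Science",
                 "Politics", "Politics", "Economy", "Politics"],
})

def top_value_count(x, n=1):
    # value_counts() sorts counts in descending order,
    # so head(n) keeps the n most frequent categories.
    return x.value_counts().head(n)

# Approach from the answer: apply the helper to each topic group.
top = df.groupby("topic")["category"].apply(top_value_count)
print(top)

# Alternative (an assumption, not from the answer): reuse the grouped
# value counts and keep the first row within each topic.
counts = df.groupby("topic")["category"].value_counts()
alt = counts.groupby(level="topic").head(1)
print(alt)
```

Both results are indexed by (topic, category). If two categories tie for the highest count within a topic, this sketch does not control which of them is returned.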