Count occurrences of values based on another column

Question

I have a question about creating a pandas DataFrame based on the totals of another column.

For example, I have this DataFrame

 Country    |    Accident
 England           Car
 England           Car
 England           Car
  USA              Car
  USA              Bike
  USA              Plane
 Germany           Car
 Thailand          Plane
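
For anyone who wants to reproduce this, the table above can be built roughly as follows. This is a minimal sketch: the variable name `df` is an assumption, and the column names are taken from the table header.

```python
import pandas as pd

# Sample data copied from the table above; the variable name `df` is assumed.
df = pd.DataFrame({
    "Country": ["England", "England", "England", "USA", "USA", "USA",
                "Germany", "Thailand"],
    "Accident": ["Car", "Car", "Car", "Car", "Bike", "Plane", "Car", "Plane"],
})
print(df)
```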

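I want to build another DataFrame that holds the total number of accidents per country. The type of accident is ignored; the accidents are simply totaled by country.

My desired DataFrame looks like this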

  Country    |    Sum of Accidents
  England              3
    USA                3
  Germany              1
  Thailand             1

Answer 1

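Option 1
Use value_counts. Naming the index with rename_axis keeps the Country column label the same on both older and newer pandas versions.

df['Country'].value_counts().rename_axis('Country').reset_index(name='Sum of Accidents')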

Option 2
Use groupby, then size

df.groupby('Country').size().sort_values(ascending=False) \
  .reset_index(name='Sum of Accidents')
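
As a quick check, the two options can be run side by side on the sample data from the question. This is a sketch under stated assumptions: the frame is rebuilt by hand here, and the names `by_value_counts`, `by_size`, `counts_1` and `counts_2` are illustrative only.

```python
import pandas as pd

# Sample data from the question (rebuilt here so the example is self-contained).
df = pd.DataFrame({
    "Country": ["England", "England", "England", "USA", "USA", "USA",
                "Germany", "Thailand"],
    "Accident": ["Car", "Car", "Car", "Car", "Bike", "Plane", "Car", "Plane"],
})

# Option 1: value_counts, with the index named so it becomes the Country column.
by_value_counts = (
    df["Country"].value_counts()
    .rename_axis("Country")
    .reset_index(name="Sum of Accidents")
)

# Option 2: groupby + size, sorted so the largest counts come first.
by_size = (
    df.groupby("Country").size()
    .sort_values(ascending=False)
    .reset_index(name="Sum of Accidents")
)

# Compare as dicts, because rows with tied counts may come out in a different order.
counts_1 = dict(zip(by_value_counts["Country"], by_value_counts["Sum of Accidents"]))
counts_2 = dict(zip(by_size["Country"], by_size["Sum of Accidents"]))
print(counts_1 == counts_2)
```

Both results have the columns Country and Sum of Accidents; only the order of countries with equal counts may differ.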

Answer 2

You can use the groupby method.

Example (this frame uses lowercase column names, country and accident) -

In [36]: df.groupby(["country"]).count().sort_values(["accident"], ascending=False).rename(columns={"accident" : "Sum of accidents"}).reset_index()
Out[36]:
    country  Sum of accidents
0   England                 3
1       USA                 3
2   Germany                 1
3  Thailand                 1

Explanation -

(
    df.groupby(["country"])                              # Group the rows by country
    .count()                                             # Count the non-null values in each remaining column, per group
    .sort_values(["accident"], ascending=False)          # Sort by the count, largest first
    .rename(columns={"accident": "Sum of accidents"})    # Rename the count column
    .reset_index()                                       # Turn the country index back into a regular column
)
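
One detail worth knowing when choosing between count() here and size() in the other answer: count() skips missing values, while size() counts every row. The sketch below uses hypothetical data (one USA row with no accident type recorded) purely to illustrate the difference; it is not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second USA row has no recorded accident type.
df = pd.DataFrame({
    "Country": ["England", "USA", "USA"],
    "Accident": ["Car", np.nan, "Plane"],
})

counted = df.groupby("Country")["Accident"].count()  # non-null values only
sized = df.groupby("Country").size()                 # every row in the group

print(counted.to_dict())
print(sized.to_dict())
```

Here USA gets 1 from count() and 2 from size(). With the data in the question there are no missing values, so both approaches give the same totals.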

Answer 1
6, accepted, 2016-09-21 05:28:59

Answer 2
4, 2016-09-21 05:02:10
