Pandas GroupBy: frequency of values

Question

I have this sample data:

STATE   CAPSULES     LIQUID         TABLETS  
Alabama NaN          Prescription   OTC
Georgia Prescription NaN            OTC
Texas   OTC          OTC            NaN
Texas   Prescription NaN            NaN
Florida NaN          Prescription   OTC
Georgia OTC          Prescription   Prescription
Texas   Prescription NaN            OTC
Alabama NaN          OTC            OTC
Georgia OTC          NaN            NaN

I have tried several groupby configurations to get the following desired result:

State   capsules_OTC    capsules_prescription   liquid_OTC  liquid_prescription tablets_OTC tablets_prescription
Alabama    0             0                         0              0               0           0
Florida    0             0                         0              0               0           0
Georgia    1             1                         1              1               1           1
Texas      1             2                         2              2               2           2

For example, I tried this:

df.groupby(['STATE','CAPSULES'])

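to at least get the first column wrangled, with no luck. Maybe there isn't a simple answer, but I suspect I'm missing a simple groupby with count() or some other apply function?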

Answer 1

Use pd.get_dummies with groupby and sum:

pd.get_dummies(df, columns=['CAPSULES', 'LIQUID', 'TABLETS'])\
  .groupby('STATE', as_index=False).sum()

Output:

     STATE  CAPSULES_OTC  CAPSULES_Prescription  LIQUID_OTC  LIQUID_Prescription  TABLETS_OTC  TABLETS_Prescription
0  Alabama             0                      0           1                    1            2                     0
1  Florida             0                      0           0                    1            1                     0
2  Georgia             2                      1           0                    1            1                     1
3    Texas             1                      2           1                    0            1                     0
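
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question; `dtype=int` is added so the indicator columns are integers regardless of pandas version, and the `tidy` helper is a hypothetical addition (not part of the original answer) that renames the columns to the `capsules_OTC` / `capsules_prescription` style the question asked for.

```python
import numpy as np
import pandas as pd

# Sample data from the question, rebuilt by hand.
df = pd.DataFrame({
    "STATE": ["Alabama", "Georgia", "Texas", "Texas", "Florida",
              "Georgia", "Texas", "Alabama", "Georgia"],
    "CAPSULES": [np.nan, "Prescription", "OTC", "Prescription", np.nan,
                 "OTC", "Prescription", np.nan, "OTC"],
    "LIQUID": ["Prescription", np.nan, "OTC", np.nan, "Prescription",
               "Prescription", np.nan, "OTC", np.nan],
    "TABLETS": ["OTC", "OTC", np.nan, np.nan, "OTC",
                "Prescription", "OTC", "OTC", np.nan],
})

value_cols = ["CAPSULES", "LIQUID", "TABLETS"]

# One 0/1 indicator column per (column, value) pair; NaN cells get no indicator.
dummies = pd.get_dummies(df, columns=value_cols, dtype=int)

# Summing the indicators per state gives the frequency of each value.
counts = dummies.groupby("STATE", as_index=False).sum()
print(counts)


def tidy(name):
    """Hypothetical helper: CAPSULES_Prescription -> capsules_prescription."""
    if name == "STATE":
        return "State"
    col, value = name.split("_", 1)
    # Keep the acronym OTC in upper case, lower-case everything else.
    return f"{col.lower()}_{value if value == 'OTC' else value.lower()}"


renamed = counts.rename(columns=tidy)
print(renamed.columns.tolist())
```

The reason this works where `df.groupby(['STATE','CAPSULES'])` alone did not: a bare groupby only builds a lazy GroupBy object, and grouping on one value column at a time cannot give all three columns' counts side by side. Turning each value into its own indicator column first reduces the problem to a single sum per state.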

Answer 1: 4 (accepted), 2020-10-28 00:44:31
