How to add multiple columns with groupby in pandas (Python)

Question

Suppose I have a DataFrame:

date | brand | color
--------------------
2017 | BMW   | red
2017 | GM    | blue
2017 | BMW   | blue
2017 | BMW   | red
2018 | BMW   | green
2018 | GM    | blue
2018 | GM    | blue
2018 | GM    | red

As a result, I want something like this:

date | brand | red | blue | green
---------------------------------
2017 | BMW   |  2  |  1   |   0
     |  GM   |  0  |  1   |   0
2018 | BMW   |  0  |  0   |   1
     |  GM   |  1  |  2   |   0

I found that I need to use groupby + size, for example:

df[df['color'] == 'red'].groupby([df['date'], df['brand']]).size()

But this gives me a Series for a single color only, whereas I want a complete DataFrame like the one shown above.

Answer 1

It is as simple as it looks. Here are three options.

Option 1: crosstab

pd.crosstab([df['date'],df['brand']], df['color'])
Out[30]: 
 color          blue   green   red
date   brand                      
2017   BMW         1       0     2
       GM          1       0     0
2018   BMW         0       1     0
       GM          2       0     1
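
For anyone who wants to run Option 1 end to end, here is a minimal sketch. It rebuilds the frame from the question by hand (the literal values are copied from the table in the question) and reorders the color columns to match the requested layout; the variable name counts is only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "date":  [2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018],
    "brand": ["BMW", "GM", "BMW", "BMW", "BMW", "GM", "GM", "GM"],
    "color": ["red", "blue", "blue", "red", "green", "blue", "blue", "red"],
})

# Count rows per (date, brand) pair and color; absent combinations become 0.
counts = pd.crosstab([df["date"], df["brand"]], df["color"])

# Reorder the color columns to match the layout asked for in the question.
counts = counts[["red", "blue", "green"]]
print(counts)
```

crosstab sorts the column labels alphabetically, which is why the explicit column selection is needed to get red, blue, green.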

Option 2: groupby and unstack

df.groupby(['date','brand','color'])['color'].count().unstack(-1).fillna(0)
Out[40]: 
 color          blue   green   red
date   brand                      
2017   BMW       1.0     0.0   2.0
       GM        1.0     0.0   0.0
2018   BMW       0.0     1.0   0.0
       GM        2.0     0.0   1.0
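
The output above is float because unstack first produces NaN for the missing combinations and fillna(0) replaces them afterwards. A possible variant, closer to the groupby + size attempt in the question, passes fill_value=0 to unstack so the counts should stay integers. This is a sketch on a hand-built copy of the example frame, not the answerer's code.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "date":  [2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018],
    "brand": ["BMW", "GM", "BMW", "BMW", "BMW", "GM", "GM", "GM"],
    "color": ["red", "blue", "blue", "red", "green", "blue", "blue", "red"],
})

# size() counts the rows in each (date, brand, color) group.
sizes = df.groupby(["date", "brand", "color"]).size()

# Move the innermost level (color) into columns; fill_value=0 fills the
# missing combinations directly, so no NaN (and no float cast) is introduced.
counts = sizes.unstack(fill_value=0)
print(counts)
```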

Option 3: pivot_table

pd.pivot_table(df.reset_index(),index=['date','brand'],columns='color',values='index',aggfunc='count').fillna(0)
Out[57]: 
color          blue   green   red
date brand                       
2017  BMW       1.0     0.0   2.0
      GM        1.0     0.0   0.0
2018  BMW       0.0     1.0   0.0
      GM        2.0     0.0   1.0
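
Option 3 counts the values of the index column created by reset_index. An alternative sketch, assuming a helper column is acceptable, adds a column of ones (the name n is made up for this example) and sums it, using the fill_value parameter of pivot_table instead of a separate fillna call.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "date":  [2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018],
    "brand": ["BMW", "GM", "BMW", "BMW", "BMW", "GM", "GM", "GM"],
    "color": ["red", "blue", "blue", "red", "green", "blue", "blue", "red"],
})

# "n" is a hypothetical helper column: one unit per row, so summing it
# per (date, brand, color) cell yields the row count for that cell.
counts = df.assign(n=1).pivot_table(
    index=["date", "brand"],
    columns="color",
    values="n",
    aggfunc="sum",
    fill_value=0,  # cells with no rows become 0 instead of NaN
)
print(counts)
```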

Answer 2

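df.groupby(['date','brand'])[['red','blue','green']].count()

Or...

df.groupby(['date','brand']).agg('count')

Note: the df in the question has no red, blue, or green columns, so the first line raises a KeyError on it, and the second only counts the non-null color values in each (date, brand) group without splitting them by color.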
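
If the idea here is to group by date and brand and aggregate one column per color, those columns have to be created first. One way this could be done, sketched below on a hand-built copy of the example frame, is to one-hot encode color with pd.get_dummies and then sum (not count) the indicator columns. The names dummies and counts are illustrative only.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "date":  [2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018],
    "brand": ["BMW", "GM", "BMW", "BMW", "BMW", "GM", "GM", "GM"],
    "color": ["red", "blue", "blue", "red", "green", "blue", "blue", "red"],
})

# One 0/1 indicator column per color (blue, green, red).
dummies = pd.get_dummies(df["color"], dtype=int)

# Attach the indicators to the grouping keys and sum them per group.
# sum() adds up the ones; count() would just return the group size
# in every color column, because 0 is still a non-null value.
counts = (
    pd.concat([df[["date", "brand"]], dummies], axis=1)
    .groupby(["date", "brand"])[["red", "blue", "green"]]
    .sum()
)
print(counts)
```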

Answer 1: 5 votes, accepted, 2017-09-30 22:21:57

Answer 2: 0 votes, 2017-09-30 21:44:59
