Python pandas: group by two columns and create new variables

Question

I have the following DataFrame describing the percentage of shares held by each type of investor in a company:

    company  investor   pct 
       1       A         1
       1       A         2
       1       B         4
       2       A         2
       2       A         4
       2       A         6 
       2       C         10
       2       C         8

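For each investor type, I want to create a new column containing the mean percentage of shares that type holds in each company. I also need the dataset to keep the same length, for example by using transform.

This is the result I want: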

     company  investor   pct   pct_mean_A   pct_mean_B   pct_mean_C
       1       A         1        1.5          4            0
       1       A         2        1.5          4            0
       1       B         4        1.5          4            0
       2       A         2        4.0          0            9
       2       A         4        4.0          0            9
       2       A         6        4.0          0            9
       2       C         10       4.0          0            9
       2       C         8        4.0          0            9

Thank you very much for your help!

Answer 1

Use groupby with the mean aggregation, reshape the result with unstack into a helper DataFrame, and join that helper back to the original df:

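# Mean pct per (company, investor), with investor types spread into columns
s = (df.groupby(['company','investor'])['pct']
       .mean()
       .unstack(fill_value=0)
       .add_prefix('pct_mean_'))

# Left-join on 'company' so df keeps its original length
df = df.join(s, 'company')
print(df)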
   company investor  pct  pct_mean_A  pct_mean_B  pct_mean_C
0        1        A    1         1.5         4.0         0.0
1        1        A    2         1.5         4.0         0.0
2        1        B    4         1.5         4.0         0.0
3        2        A    2         4.0         0.0         9.0
4        2        A    4         4.0         0.0         9.0
5        2        A    6         4.0         0.0         9.0
6        2        C   10         4.0         0.0         9.0
7        2        C    8         4.0         0.0         9.0
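
For anyone who wants to run this from scratch, here is a minimal self-contained sketch of the same groupby / unstack / join approach. The construction of df is reconstructed from the table in the question, and the names `means` and `result` are only illustrative; they are not part of the original answer.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed from the table shown there).
df = pd.DataFrame({
    "company":  [1, 1, 1, 2, 2, 2, 2, 2],
    "investor": ["A", "A", "B", "A", "A", "A", "C", "C"],
    "pct":      [1, 2, 4, 2, 4, 6, 10, 8],
})

# Mean pct per (company, investor), then move investor types into columns.
# Combinations that never occur (e.g. company 1 / investor C) are filled with 0.
means = (df.groupby(["company", "investor"])["pct"]
           .mean()
           .unstack(fill_value=0)
           .add_prefix("pct_mean_"))

# Match df's 'company' column against the index of `means`.
# join defaults to a left join, so every original row is kept.
result = df.join(means, on="company")
print(result)
```

Because `means` has one row per company, the join broadcasts each company's means onto all of that company's rows, which is what keeps the length unchanged.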

Alternatively, starting again from the original df, use pivot_table, whose default aggregation function is mean:

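# pivot_table aggregates with mean by default; missing combinations become 0
s = df.pivot_table(index='company',
                   columns='investor',
                   values='pct',
                   fill_value=0).add_prefix('pct_mean_')
# Left-join on 'company' so df keeps its original length
df = df.join(s, 'company')
print(df)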
   company investor  pct  pct_mean_A  pct_mean_B  pct_mean_C
0        1        A    1         1.5           4           0
1        1        A    2         1.5           4           0
2        1        B    4         1.5           4           0
3        2        A    2         4.0           0           9
4        2        A    4         4.0           0           9
5        2        A    6         4.0           0           9
6        2        C   10         4.0           0           9
7        2        C    8         4.0           0           9
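
The pivot_table variant can be sketched the same way. This version passes `aggfunc="mean"` explicitly for readability (it is the default, so the behaviour should match the snippet above). The frame construction and the names `wide` and `result` are assumptions made for illustration.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed from the table shown there).
df = pd.DataFrame({
    "company":  [1, 1, 1, 2, 2, 2, 2, 2],
    "investor": ["A", "A", "B", "A", "A", "A", "C", "C"],
    "pct":      [1, 2, 4, 2, 4, 6, 10, 8],
})

# One row per company, one column per investor type, each cell the mean pct.
wide = df.pivot_table(index="company",
                      columns="investor",
                      values="pct",
                      aggfunc="mean",
                      fill_value=0).add_prefix("pct_mean_")

# Left join on 'company': the per-company means are repeated on every row.
result = df.join(wide, on="company")
print(result)
```

The exact dtype of the filled columns (for example 4 versus 4.0) may differ between pandas versions, but the values should agree with the groupby-based approach.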

Solution 1
1 accepted 2018-08-23 11:24:10