Pandas: create new column with group means conditional on another column

Question

I am trying to create a new column containing group means conditional on the values of another column. This is best explained by example:

import pandas as pd

df = pd.DataFrame({'A': [59000000, 65000000, 434000, 434000, 434000, 337000, 11300, 11300, 11300],
                   'B': [1, 1, 0, 1, 0, 0, 1, 1, 0],
                   'group': ["IT", "IT", "IT", "MV", "MV", "MV", "IT", "MV", "MV"]})

df

          A  B group
0  59000000  1    IT
1  65000000  1    IT
2    434000  0    IT
3    434000  1    MV
4    434000  0    MV
5    337000  0    MV
6     11300  1    IT
7     11300  1    MV
8     11300  0    MV

I've managed to solve the problem, but I am looking for something with fewer lines of code that is possibly more efficient:

x = df.loc[df['B']==1].groupby('group', as_index=False)['A'].mean()
x.rename(columns = {'A':'a'}, inplace = True)
df = pd.merge(df, x, how='left', on='group')

          A  B group         a
0  59000000  1    IT  41337100
1  65000000  1    IT  41337100
2    434000  0    IT  41337100
3    434000  1    MV    222650
4    434000  0    MV    222650
5    337000  0    MV    222650
6     11300  1    IT  41337100
7     11300  1    MV    222650
8     11300  0    MV    222650

I've tried using the transform function, but it's not working for me:

df.loc[: , 'a'] = df.groupby('group').transform(lambda x: x[x['B']==1]['A'].mean())

Answer 1

Use Series.where to keep only the values of column A you need (rows where B equals 1; all other rows become NaN), then groupby and transform:

df['a'] = df['A'].where(df['B'].eq(1)).groupby(df['group']).transform('mean')

[out]

          A  B group           a
0  59000000  1    IT  41337100.0
1  65000000  1    IT  41337100.0
2    434000  0    IT  41337100.0
3    434000  1    MV    222650.0
4    434000  0    MV    222650.0
5    337000  0    MV    222650.0
6     11300  1    IT  41337100.0
7     11300  1    MV    222650.0
8     11300  0    MV    222650.0
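
To make the one-liner easier to follow, here is a minimal sketch that splits it into its two steps and, for comparison, adds a variant that computes one mean per group and maps it back onto the group column. The names masked, means and a_map are only illustrative and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({'A': [59000000, 65000000, 434000, 434000, 434000, 337000, 11300, 11300, 11300],
                   'B': [1, 1, 0, 1, 0, 0, 1, 1, 0],
                   'group': ["IT", "IT", "IT", "MV", "MV", "MV", "IT", "MV", "MV"]})

# Step 1: keep A where B == 1; every other row becomes NaN.
masked = df['A'].where(df['B'].eq(1))

# Step 2: group the masked values by the group labels. 'mean' skips NaN,
# and transform broadcasts each group's mean back to all rows of that group.
df['a'] = masked.groupby(df['group']).transform('mean')

# Illustrative alternative: one mean per group, then map it onto 'group'.
means = df.loc[df['B'].eq(1)].groupby('group')['A'].mean()
df['a_map'] = df['group'].map(means)
```

Both columns should hold the same values. In either variant, a group that has no rows with B equal to 1 ends up with NaN, which matches what the left merge in the question would produce.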

solution1
4 ACCEPTED 2020-03-10 10:20:36
