
Set new column values in pandas DataFrame1 where DF2 column values match DF1 index

I'd like to set a new column in a pandas dataframe with values calculated using a groupby on dataframe2.

DF1:

     col1    col2
id    
 1    'a'
 2    'b'
 3    'c'

DF2:

          id    col2
 index    
     1     1      11
     1     1      22
     1     1      12
     1     1      45
     3     3      83
     3     3      11
     3     3      35
     3     3      54

I want to group DF2 by 'id', and then apply a function on 'col2' to put the result into the corresponding index in DF1. If there is no group for that particular index, then I want to fill with NaN...

ret_val = DF2.groupby('id').apply(lambda x: my_func(x['col2']))

     col1    col2
id    
 1    'a'    ret_val
 2    'b'    NaN
 3    'c'    ret_val

... I can't quite figure out how to achieve this though

Use map on the index of df1, converted to a Series with to_series().

In [5327]: df1['col2'] = df1.index.to_series().map(df2.groupby('id')
                                                      .apply(lambda x: my_func(x['col2'])))

In [5328]: df1
Out[5328]:
   col1   col2
id
1     a  360.0
2     b    NaN
3     c  536.0

Details

In [5322]: def my_func(x):
      ...:     return x.sum()
      ...:

In [5323]: df2.groupby('id').apply(lambda x: my_func(x['col2']))
Out[5323]:
id
1    360.0
3    536.0
dtype: float64

In [5324]: df1.index.to_series().map(df2.groupby('id').apply(lambda x: my_func(x['col2'])))
Out[5324]:
id
1    360.0
2      NaN
3    536.0
Name: id, dtype: float64
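
For reference, here is a self-contained sketch of the map approach, built from the sample tables in the question. The frame construction is an assumption based on those tables, and my_func is a stand-in (a plain sum). With this input the sums come to 90 and 183; the figures in the transcript above presumably come from a different test frame.

```python
import pandas as pd

# DF1 from the question: 'id' is the index, col2 does not exist yet
df1 = pd.DataFrame({'col1': ['a', 'b', 'c']},
                   index=pd.Index([1, 2, 3], name='id'))

# DF2 from the question: several rows per id, none for id 2
df2 = pd.DataFrame({'id': [1, 1, 1, 1, 3, 3, 3, 3],
                    'col2': [11, 22, 12, 45, 83, 11, 35, 54]})

def my_func(values):
    # placeholder for the asker's function; here just a sum
    return values.sum()

# one value per id, indexed by id
per_id = df2.groupby('id')['col2'].apply(my_func)

# look up each df1 index value in per_id; ids without a group become NaN
df1['col2'] = df1.index.to_series().map(per_id)
print(df1)
```

Selecting ['col2'] before apply passes only that column to my_func, which avoids the lambda wrapper used in the transcript.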

Apply the function to col2 of df2 first, then use pd.concat, dropping col2 from df1 since it is empty.

x = df2.groupby('id')['col2'].apply(sum)  # instead of sum, use your own function
ndf = pd.concat([df1.drop(columns='col2'), x], axis=1)
ndf
   col1   col2
id            
1   'a'   90.0
2   'b'    NaN
3   'c'  183.0
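
A runnable sketch of the concat approach, again using the question's sample data and a plain sum as the stand-in function (the frame construction is an assumption based on the question's tables). One thing to watch: pd.concat with axis=1 does an outer join on the index, so an id that appears only in df2 would add a row. df1.join(x) is shown as an alternative that keeps only the rows of df1.

```python
import pandas as pd

# DF1 with an empty col2, as in the question
df1 = pd.DataFrame({'col1': ['a', 'b', 'c'], 'col2': [float('nan')] * 3},
                   index=pd.Index([1, 2, 3], name='id'))
df2 = pd.DataFrame({'id': [1, 1, 1, 1, 3, 3, 3, 3],
                    'col2': [11, 22, 12, 45, 83, 11, 35, 54]})

# aggregate col2 per id; swap .sum() for your own function via .apply(...)
x = df2.groupby('id')['col2'].sum()

# outer join on the index: ids found only in df2 would also appear
ndf = pd.concat([df1.drop(columns='col2'), x], axis=1)

# left join on the index: only ids present in df1 are kept
joined = df1.drop(columns='col2').join(x)

print(ndf)
print(joined)
```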

A simpler, more direct version, suggested by @Zero:

df1['col2'] = df2.groupby('id')['col2'].apply(sum)

You can replace sum with your own function, e.g. .apply(lambda x: your_func(x)).
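
The direct assignment works because pandas aligns the assigned Series with df1 on the index: ids with no group in df2 simply get NaN. A small sketch with a custom function, assuming the question's sample data; value_range is a hypothetical example function, not something from the original post.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['a', 'b', 'c']},
                   index=pd.Index([1, 2, 3], name='id'))
df2 = pd.DataFrame({'id': [1, 1, 1, 1, 3, 3, 3, 3],
                    'col2': [11, 22, 12, 45, 83, 11, 35, 54]})

def value_range(values):
    # hypothetical custom function: spread between largest and smallest value
    return values.max() - values.min()

# the result is indexed by id, so assignment aligns it with df1's index
df1['col2'] = df2.groupby('id')['col2'].apply(value_range)
print(df1)
```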

df1['col2'] = df2.set_index('id').groupby(level='id')['col2'].sum()
df1
Out[975]: 
   col1   col2
id            
1   'a'   90.0
2   'b'    NaN
3   'c'  183.0
