How do I create a new column (col3) in a DataFrame that checks whether multiple values are present in col1 and also checks the values in col2?

Existing DataFrame and desired result (either pandas or NumPy is fine). Columns: contactid, bonustype, bonusreceived, NEW_COLUMN

contactid     bonustype     bonusreceived      NEW_COLUMN
100           a             yes                ab
100           b             no                 NULL
200           a             no                 NULL             
200           b             yes                abc
200           c             yes                abc

For each contactid, I have to check bonustype: if both values (a, b) are present and bonusreceived is 'yes', then return 'ab' in NEW_COLUMN. If all three bonustype values (a, b, c) are present and bonusreceived is 'yes', then return 'abc' in NEW_COLUMN.

I have tried several approaches but have not been able to get the desired result above. Any help would be highly appreciated.

Thanks

Given the clarified requirements that

  1. for every contactid, each bonustype should appear only once in the aggregated text in NEW_COLUMN
  2. for bonusreceived == 'no', the corresponding NEW_COLUMN should be NULL

we can use .groupby() with .transform() to join the unique bonustype values for each contactid. Then np.where() keeps the aggregated text only on rows where bonusreceived == 'yes' and sets NaN otherwise.

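import numpy as np

# Join each contactid's unique bonustype values and broadcast the result to
# every row of the group; keep it only where bonusreceived == 'yes'.
df['NEW_COLUMN'] = np.where(df['bonusreceived'] == 'yes',
                            df.groupby('contactid')['bonustype'].transform(lambda x: ''.join(x.unique())),
                            np.nan)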
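
To make the two steps easier to follow, here is a minimal sketch that computes the grouped text first and applies the mask afterwards. It rebuilds the sample frame from the Data Input section so that it runs on its own; the intermediate names `joined` and `received` are only illustrative and are not part of the answer's code.

```python
import numpy as np
import pandas as pd

# Sample data copied from the "Data Input" section.
df = pd.DataFrame({
    'contactid': [100, 100, 200, 200, 200, 100, 200],
    'bonustype': ['a', 'b', 'a', 'b', 'c', 'a', 'a'],
    'bonusreceived': ['yes', 'no', 'no', 'yes', 'yes', 'no', 'yes'],
})

# Step 1: per contactid, join the unique bonustype values (in order of first
# appearance) and broadcast that string back onto every row of the group.
joined = df.groupby('contactid')['bonustype'].transform(lambda x: ''.join(x.unique()))

# Step 2: keep the joined text only on rows where the bonus was received.
received = df['bonusreceived'] == 'yes'
df['NEW_COLUMN'] = np.where(received, joined, np.nan)
```

After step 1, `joined` holds the same string on every row of a contactid, regardless of bonusreceived. The 'no' rows are blanked out only in step 2.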

Data Input

print(df)

   contactid bonustype bonusreceived
0        100         a           yes
1        100         b            no
2        200         a            no
3        200         b           yes
4        200         c           yes
5        100         a            no
6        200         a           yes

Result:

print(df)

   contactid bonustype bonusreceived NEW_COLUMN
0        100         a           yes         ab
1        100         b            no        NaN
2        200         a            no        NaN
3        200         b           yes        abc
4        200         c           yes        abc
5        100         a            no        NaN
6        200         a           yes        abc
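
As a possible variation (not part of the answer above), the same masking can be sketched with pandas' Series.where(), which keeps values where the condition is True and fills NaN elsewhere, so NumPy is not needed. The sample data is rebuilt here so the snippet runs on its own, and the name `joined` is only illustrative.

```python
import pandas as pd

# Sample data copied from the "Data Input" section.
df = pd.DataFrame({
    'contactid': [100, 100, 200, 200, 200, 100, 200],
    'bonustype': ['a', 'b', 'a', 'b', 'c', 'a', 'a'],
    'bonusreceived': ['yes', 'no', 'no', 'yes', 'yes', 'no', 'yes'],
})

# Unique bonustype values per contactid, joined and broadcast to each row.
joined = df.groupby('contactid')['bonustype'].transform(lambda x: ''.join(x.unique()))

# Series.where keeps the value where the condition holds and uses NaN otherwise.
df['NEW_COLUMN'] = joined.where(df['bonusreceived'] == 'yes')

print(df)
```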
