
pandas df apply condition on multiple columns

I have a df that looks like this:


I want to create a new column named promoted from the columns master_feature, epic, and feature:

The value of promoted will be:

  • master feature if the adjacent master_feature column value is not null.
  • feature if adjacent feature column value is not null, and likewise for epic.

something like:

df.promoted = 'master feature' if not pd.isnull(df.master_feature) 
               elseif 'feature' if not pd.isnull(df.feature) 
               elseif 'epic' pd.isnull(df.epic) 
               else 'Na' 

How can I achieve this using df.apply? Would it be much more efficient to use np.select?

np.select is the way to go. Try the code below; I think I got the logic right based on your question. There is one discrepancy in your logic, though: "feature if adjacent feature column value is not null, and likewise for epic" is not the same as "elseif 'epic' pd.isnull(df.epic)". So I went with: if df['epic'] is not null, then 'epic'. Let me know if that is correct.

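import numpy as np

cond = [~df['master_feature'].isna(), # if master_feature is not null then 'master feature'
        ~df['feature'].isna(), # if feature is not null then 'feature'
        ~df['epic'].isna()] # if epic is not null then 'epic'

choice = ['master feature', 'feature', 'epic'] # one label per condition, in the same order

# 'Na' is the fallback from the question; a float default such as np.nan
# may fail to combine with string choices on recent NumPy versions
df['promoted'] = np.select(cond, choice, default='Na')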

  master_feature feature epic        promoted
0             ab     NaN  NaN  master feature
1            NaN     NaN   fg            epic
2            NaN      pq  NaN         feature
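
The question also asks how this would look with df.apply. Below is a minimal sketch for comparison. It assumes a sample frame reconstructed from the output table above, plus one all-null row added here to exercise the fallback; promoted_label is a hypothetical helper name, not something from the original post.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the output table above; the last
# (all-null) row is an added assumption to show the 'Na' fallback.
df = pd.DataFrame({
    "master_feature": ["ab", np.nan, np.nan, np.nan],
    "feature": [np.nan, np.nan, "pq", np.nan],
    "epic": [np.nan, "fg", np.nan, np.nan],
})

def promoted_label(row):
    """Return the label of the first non-null column, in priority order."""
    if pd.notna(row["master_feature"]):
        return "master feature"
    if pd.notna(row["feature"]):
        return "feature"
    if pd.notna(row["epic"]):
        return "epic"
    return "Na"

# axis=1 passes each row to the function as a Series
df["promoted"] = df.apply(promoted_label, axis=1)
print(df)
```

apply with axis=1 calls the Python function once per row, so on large frames the vectorized np.select version is generally the faster of the two.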

It can also be done with a combination of apply() and NumPy's argmin(), which picks the first non-null value in each row (the cell value itself, not a label):

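# the feature and epic columns are reconstructed from the output table above (an assumption)
df = pd.DataFrame.from_dict({'master_feature':['ab',float('NaN'),float('NaN')],
                             'feature':[float('NaN'),float('NaN'),'pq'],
                             'epic':[float('NaN'),'fg',float('NaN')]})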

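# argmin over the null mask gives the position of the first non-null column; .iloc selects by position
df.assign(promoted=lambda dfa: dfa.apply(lambda r: r.iloc[np.argmin(r.isna().to_numpy())], axis=1))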
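
Note that the apply/argmin approach yields the cell contents ('ab', 'fg', 'pq'), whereas the question asks for a label. One vectorized way to get the label instead is sketched below. It assumes the same reconstructed sample data with an extra all-null row, and it derives the label from the column name by replacing the underscore, which is a choice made here for illustration.

```python
import numpy as np
import pandas as pd

# Reconstructed sample data; the all-null last row is an added assumption.
df = pd.DataFrame({
    "master_feature": ["ab", np.nan, np.nan, np.nan],
    "feature": [np.nan, np.nan, "pq", np.nan],
    "epic": [np.nan, "fg", np.nan, np.nan],
})

cols = ["master_feature", "feature", "epic"]  # priority order
not_null = df[cols].notna()

# idxmax(axis=1) returns the name of the first column that is True in each row.
first_col = not_null.idxmax(axis=1)

# Rows with no non-null value at all get the fallback 'Na' instead.
labels = first_col.where(not_null.any(axis=1), "Na")

# Turn the column name 'master_feature' into the label 'master feature'.
result = df.assign(promoted=labels.str.replace("_", " "))
print(result)
```

Because the priority comes from the order of cols, reordering that list changes which column wins when several are non-null.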

