
How to fill a column based on a condition in a DataFrame?

I am trying to fill one column based on a condition on other columns, but I am not getting the expected result. Can you please help me do this?

Example:

df:

applied_sql_function1     and_or_not_oprtor_pre    comb_fld_order_1
 CASE WHEN                                    
 WHEN                      AND                     
 WHEN                      AND                          
 WHEN                      
 WHEN                      AND
 WHEN                      OR                      
 WHEN  
 WHEN                                                 dummy
 WHEN                                                 dummy
 WHEN

Expected Output:

applied_sql_function1     and_or_not_oprtor_pre    comb_fld_order_1     new
 CASE WHEN                                                              CASE WHEN
 WHEN                      AND                                      
 WHEN                      AND                          
 WHEN                                                                   WHEN
 WHEN                      AND
 WHEN                      OR                      
 WHEN                                                                   WHEN
 WHEN                                                 dummy
 WHEN                                                 dummy
 WHEN                                                                   WHEN

I have written some logic for this but it is not working:

    df_main1['new'] = ''
    for index, row in df_main1.iterrows():
        new = ''
        if((str(row['applied_sql_function1']) != '') and (str(row['and_or_not_oprtor_pre']) == '') and (str(row['comb_fld_order_1']) == '')):
            new += str(row['applied_sql_function1'])
            print(new)

        if(str(row['applied_sql_function1']) != '') and (str(row['and_or_not_oprtor_pre']) != ''):
            new += ''
            print(new)

        else:
            new += ''

        row['new'] = new

    print(df_main1['new'])
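
A likely reason the loop above has no visible effect: per the pandas documentation, `iterrows()` may hand back a copy of each row rather than a view, so `row['new'] = new` is not guaranteed to reach the DataFrame. The following is a minimal sketch of the same loop that writes through `df_main1.at[index, 'new']` instead. The sample rows are made up for illustration, and blank cells are assumed to be empty strings, not NaN.

```python
import pandas as pd

# Hypothetical sample data; blank cells are assumed to be empty strings.
df_main1 = pd.DataFrame({
    "applied_sql_function1": ["CASE WHEN", "WHEN", "WHEN", "WHEN"],
    "and_or_not_oprtor_pre": ["", "AND", "", ""],
    "comb_fld_order_1": ["", "", "", "dummy"],
})
df_main1["new"] = ""

for index, row in df_main1.iterrows():
    if (row["applied_sql_function1"] != ""
            and row["and_or_not_oprtor_pre"] == ""
            and row["comb_fld_order_1"] == ""):
        # Assign through the DataFrame itself; `row` may only be a copy.
        df_main1.at[index, "new"] = row["applied_sql_function1"]

print(df_main1["new"].tolist())
```

The loop works for small frames, but it visits every row in Python, so the vectorized answers are usually preferable on large data.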

Go with np.where. It is easy to understand and vectorized, so it performs well on large datasets. Note that this version assumes the blank cells are empty strings.

import numpy as np
import pandas as pd

# Copy applied_sql_function1 where both other columns are empty strings; use '' elsewhere.
df['new'] = np.where((df['and_or_not_oprtor_pre'] == '') & (df['comb_fld_order_1'] == ''), df['applied_sql_function1'], '')
df
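
To check the `np.where` approach against the question's expected output, here is a self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, under the assumption that every blank cell is an empty string.

```python
import numpy as np
import pandas as pd

# Data transcribed from the question's table; blanks assumed to be ''.
df = pd.DataFrame({
    "applied_sql_function1": ["CASE WHEN"] + ["WHEN"] * 9,
    "and_or_not_oprtor_pre": ["", "AND", "AND", "", "AND", "OR", "", "", "", ""],
    "comb_fld_order_1": ["", "", "", "", "", "", "", "dummy", "dummy", ""],
})

# True only where both helper columns are empty.
both_blank = (df["and_or_not_oprtor_pre"] == "") & (df["comb_fld_order_1"] == "")

# Take applied_sql_function1 where the condition holds, '' everywhere else.
df["new"] = np.where(both_blank, df["applied_sql_function1"], "")

print(df)
```

Rows 0, 3, 6 and 9 (counting from zero) receive the value of `applied_sql_function1`; the rest stay empty, which matches the expected output in the question.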

Using loc:

mask = df.and_or_not_oprtor_pre.fillna("").eq("") \
       & df.comb_fld_order_1.fillna("").eq("")

df.loc[mask, 'new'] = df.loc[mask, 'applied_sql_function1']

Try this one; it should be quick. It assumes the blank cells are NaN rather than empty strings:

indexes = df.index[(df['and_or_not_oprtor_pre'].isna()) & (df['comb_fld_order_1'].isna())]
df.loc[indexes, 'new'] = df.loc[indexes, 'applied_sql_function1']
