
How to fill column based on the condition in dataframe?

I am trying to fill a column based on some conditions, but I am not getting the expected result. Can you help me with how to do this?

Example:

df:

applied_sql_function1     and_or_not_oprtor_pre    comb_fld_order_1
 CASE WHEN                                    
 WHEN                      AND                     
 WHEN                      AND                          
 WHEN                      
 WHEN                      AND
 WHEN                      OR                      
 WHEN  
 WHEN                                                 dummy
 WHEN                                                 dummy
 WHEN

Expected output:

applied_sql_function1     and_or_not_oprtor_pre    comb_fld_order_1     new
 CASE WHEN                                                              CASE WHEN
 WHEN                      AND                                      
 WHEN                      AND                          
 WHEN                                                                   WHEN
 WHEN                      AND
 WHEN                      OR                      
 WHEN                                                                   WHEN
 WHEN                                                 dummy
 WHEN                                                 dummy
 WHEN                                                                   WHEN

I wrote some logic for this, but it is not working:

            df_main1['new'] =''
            for index,row in df_main1.iterrows():
                new = ''
                if((str(row['applied_sql_function1']) != '') and (str(row['and_or_not_oprtor_pre']) == '') and (str(row['comb_fld_order_1']) == '')):
                    new += str(row['applied_sql_function1'])
                    print(new)

                if(str(row['applied_sql_function1']) != '') and (str(row['and_or_not_oprtor_pre']) != ''):
                    new += ''
                    print(new)

                else:
                    new += ''

                row['new'] = new

            print(df_main1['new'])

Go with np.where. It is easy to understand and vectorized, so it performs well on very large datasets.

import pandas as pd, numpy as np
df['new'] = ''
df['new'] = np.where((df['and_or_not_oprtor_pre'] == '') & (df['comb_fld_order_1'] == ''), df['applied_sql_function1'], df['new'])
df
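To make this concrete, here is a minimal sketch that rebuilds the question's sample frame and applies the same np.where call. The sample values are transcribed from the question, and the blank cells are assumed to be empty strings rather than NaN; the name both_blank is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data transcribed from the question; blanks are assumed to be ''.
df = pd.DataFrame({
    'applied_sql_function1': ['CASE WHEN'] + ['WHEN'] * 9,
    'and_or_not_oprtor_pre': ['', 'AND', 'AND', '', 'AND', 'OR', '', '', '', ''],
    'comb_fld_order_1': ['', '', '', '', '', '', '', 'dummy', 'dummy', ''],
})

# True only where both helper columns are blank.
both_blank = (df['and_or_not_oprtor_pre'] == '') & (df['comb_fld_order_1'] == '')

# Copy applied_sql_function1 on those rows, leave '' everywhere else.
df['new'] = np.where(both_blank, df['applied_sql_function1'], '')

print(df['new'].tolist())
```

If the blanks in the real data are NaN instead of '', the == '' comparisons will not match and the mask has to be built differently.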

Use loc:

mask = df.and_or_not_oprtor_pre.fillna("").eq("") \
       & df.comb_fld_order_1.fillna("").eq("")

df.loc[mask, 'new'] = df.loc[mask, 'applied_sql_function1']
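As a rough illustration of why the fillna("") step helps, the sketch below uses a small hypothetical frame (not the asker's data) in which blanks appear both as NaN/None and as empty strings. It also initialises new to '' first, which is an addition here so that unmatched rows do not end up as NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing both kinds of "blank": NaN/None and ''.
df = pd.DataFrame({
    'applied_sql_function1': ['CASE WHEN', 'WHEN', 'WHEN', 'WHEN'],
    'and_or_not_oprtor_pre': [np.nan, 'AND', '', None],
    'comb_fld_order_1': ['', '', np.nan, 'dummy'],
})

# fillna('') turns NaN/None into '', so a single eq('') test covers both cases.
mask = (df['and_or_not_oprtor_pre'].fillna('').eq('')
        & df['comb_fld_order_1'].fillna('').eq(''))

df['new'] = ''  # start from '' so rows outside the mask stay empty, not NaN
df.loc[mask, 'new'] = df.loc[mask, 'applied_sql_function1']

print(df['new'].tolist())
```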

Try this, it should run fast:

# Note: isna() matches NaN/None only, so this assumes blank cells are NaN, not empty strings
indexes = df.index[(df['and_or_not_oprtor_pre'].isna()) & (df['comb_fld_order_1'].isna())]
df.loc[indexes, 'new'] = df.loc[indexes, 'applied_sql_function1']
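One caveat worth sketching: isna() does not treat empty strings as missing. If the blanks are '' (as the question's own loop seems to assume), one option is to convert them to NaN first. The frame below is hypothetical, and cols and blank are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical frame where blanks are empty strings, not NaN.
df = pd.DataFrame({
    'applied_sql_function1': ['CASE WHEN', 'WHEN', 'WHEN'],
    'and_or_not_oprtor_pre': ['', 'AND', ''],
    'comb_fld_order_1': ['', '', 'dummy'],
})

cols = ['and_or_not_oprtor_pre', 'comb_fld_order_1']

# On the raw frame isna() finds nothing, because '' is a valid string.
found_missing = bool(df[cols].isna().any().any())

# Replace '' with NaN, then require both columns to be missing on a row.
blank = df[cols].replace('', np.nan).isna().all(axis=1)

# 'new' does not exist yet, so rows outside the mask are left as NaN.
df.loc[blank, 'new'] = df.loc[blank, 'applied_sql_function1']

print(found_missing)
print(df)
```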

