
How to populate values in a column of dataframe based on a particular condition

I have the following dataframe:

match   final_val   index  
False    0.002481   106278  
True     0.003135   106279  
True     0.000760   106280  
False    0.001313   106281  
True     0.000242   106282  

I wish to create a new column named finally_final_KL that takes the value from the final_val column only when match in the next row is False. For example, I want something like this:

match   final_val   index  finally_final_KL
False    0.002481   106278  
True     0.003135   106279  
True     0.000760   106280  0.000760
False    0.001313   106281  
True     0.000242   106282  

I wrote two snippets of code, neither of which worked. Code 1:

for i in range(len(df['index'])):
    if i-1<0:
        continue
    elif df.match.iloc[i] == False:
        df.finally_final_KL.iloc[i-1] == df.final_val[i-1]
    else:
        df.finally_final_KL.iloc[i] == ""

I do not get ANYTHING in the last column. The output of this code is something like this:

match   final_val   index  finally_final_KL
False    0.002481   106278  
True     0.003135   106279  
True     0.000760   106280  
False    0.001313   106281  
True     0.000242   106282  

Code 2:

df['finally_final_KL'] = df['index'].apply(lambda x: df['final_val'].iloc[x] if df['match'].iloc[x+1]==False else "")

This throws "IndexError: single positional indexer is out-of-bounds". I know I am getting this because the code does not work for the last row, which has no next row. If I use [x-1] instead of [x+1], it runs without error (I did not expect it to work for x==0), but of course I don't get the desired output.

I would be grateful if someone can either point out faults in my code or give me a new solution.

Use:

import numpy as np
df['finally_final_KL'] = np.where(df['match'].shift(-1) == False, df['final_val'], '')
print(df)

   match  final_val   index       finally_final_KL
0  False   0.002481  106278                       
1   True   0.003135  106279                       
2   True   0.000760  106280  0.0007599999999999999
3  False   0.001313  106281                       
4   True   0.000242  106282  

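Depending on the dtypes in the original DataFrame, the result may differ. As seen above, mixing the float column with the empty string '' makes np.where return strings, which is why 0.000760 is displayed as 0.0007599999999999999.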
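
If the new column should stay numeric, one option is to leave non-matching rows as NaN rather than ''. The following is only a minimal sketch under that assumption (the question asks for blanks, so NaN is a substitution), using a sample frame rebuilt from the question:

```python
import pandas as pd

# Sample frame rebuilt from the data shown in the question
df = pd.DataFrame({
    "match": [False, True, True, False, True],
    "final_val": [0.002481, 0.003135, 0.000760, 0.001313, 0.000242],
    "index": [106278, 106279, 106280, 106281, 106282],
})

# True where the NEXT row's match is False. The last row has no next row,
# so fill_value=True keeps it out of the selection.
next_is_false = ~df["match"].shift(-1, fill_value=True)

# Series.where keeps values where the condition holds and puts NaN elsewhere,
# so the column stays float64 instead of turning into strings.
df["finally_final_KL"] = df["final_val"].where(next_is_false)
print(df)
```

With NaN as the placeholder, numeric operations on the column (sums, comparisons, formatting) keep working, and the blanks can still be produced at display time if needed.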

You are trying to check the next row of match. For that purpose you can use shift(-1). Then you can use mask or np.where:

df['finally_final_XL'] = df['final_val'].mask(df['match'].shift(-1, fill_value=True), '')

or with np.where :

df['finally_final_XL'] = np.where(df['match'].shift(-1, fill_value=True),
                                  '', df['final_val'])

Output:

   match  final_val   index finally_final_XL
0  False   0.002481  106278                 
1   True   0.003135  106279                 
2   True   0.000760  106280          0.00076
3  False   0.001313  106281                 
4   True   0.000242  106282                 
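
As for the faults in Code 1 from the question: the loop uses ==, which only compares and discards the result, so nothing is ever assigned. It also relies on chained indexing (df.finally_final_KL.iloc[...]), and the column has to exist before it can be written to. A hedged sketch of a corrected loop, assuming the same sample data as in the question, could look like this (the vectorized shift(-1) approaches above are still preferable):

```python
import pandas as pd

# Sample frame rebuilt from the data shown in the question
df = pd.DataFrame({
    "match": [False, True, True, False, True],
    "final_val": [0.002481, 0.003135, 0.000760, 0.001313, 0.000242],
    "index": [106278, 106279, 106280, 106281, 106282],
})

# Create the column first; object dtype lets it hold both '' and floats.
df["finally_final_KL"] = pd.Series("", index=df.index, dtype=object)

# Positional column numbers, so a single .iloc call can do the assignment
target_col = df.columns.get_loc("finally_final_KL")
source_col = df.columns.get_loc("final_val")

# Start at 1 because row 0 has no previous row to write into
for i in range(1, len(df)):
    if not df["match"].iloc[i]:
        # '=' assigns; the original '==' only compared the two values
        df.iloc[i - 1, target_col] = df.iloc[i - 1, source_col]

print(df)
```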
