
Conditional transformation of column in pandas Dataframe

I would like to create a new column in my dataframe that is the difference between two variables IF a 3rd column in that row satisfies a certain condition.

A minimal example looks like this:

import pandas as pd

dict1 = [{'var0': 0, 'var1': 1, 'var2': 2},
         {'var0': 0, 'var1': 2, 'var2': 4},
         {'var0': 1, 'var1': 5, 'var2': 8},
         {'var0': 1, 'var1': 15, 'var2': 12}]
df = pd.DataFrame(dict1, index=['s1', 's2', 's3', 's4'])

In particular, I want the difference between var0 and var1 (var0 - var1) for all rows where var2 is larger than 3; otherwise I want the difference between var0 and var2 (var0 - var2).

My target output would be:

    var0  var1  var2  var3
s1     0     1     2    -2
s2     0     2     4    -2
s3     1     5     8    -4
s4     1    15    12   -14

You can do it in one line with np.where:

import numpy as np

df['var3'] = np.where(df['var2'] > 3, df['var0'] - df['var1'], df['var0'] - df['var2'])
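For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample frame from the question and applies the same np.where call. The frame is built from a dict of columns rather than the list of dicts in the question, which is only a convenience and should give the same data.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, built column-wise for brevity.
df = pd.DataFrame(
    {'var0': [0, 0, 1, 1], 'var1': [1, 2, 5, 15], 'var2': [2, 4, 8, 12]},
    index=['s1', 's2', 's3', 's4'],
)

# np.where(condition, value_if_true, value_if_false) is evaluated element-wise:
# rows with var2 > 3 get var0 - var1, all other rows get var0 - var2.
df['var3'] = np.where(df['var2'] > 3,
                      df['var0'] - df['var1'],
                      df['var0'] - df['var2'])

print(df)
```

Both differences are computed for every row and np.where only selects between them, which is fine for cheap arithmetic like this.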

This might do the trick:

constraint = df['var2'] > 3
# Rows where var2 > 3 get var0 - var1; the remaining rows get var0 - var2.
df.loc[constraint, 'var3'] = df['var0'] - df['var1']
df.loc[~constraint, 'var3'] = df['var0'] - df['var2']
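If you would rather stay within pandas and build the column in a single assignment, Series.where is another option. This is only a sketch against the question's sample data, using the condition stated in the question (var2 larger than 3); the names use_var1, diff_var1 and diff_var2 are made up for illustration.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame(
    {'var0': [0, 0, 1, 1], 'var1': [1, 2, 5, 15], 'var2': [2, 4, 8, 12]},
    index=['s1', 's2', 's3', 's4'],
)

use_var1 = df['var2'] > 3            # boolean mask (hypothetical name)
diff_var1 = df['var0'] - df['var1']  # value wanted where the mask is True
diff_var2 = df['var0'] - df['var2']  # value wanted where the mask is False

# Series.where keeps the caller's values where the condition holds
# and takes the values from the second argument everywhere else.
df['var3'] = diff_var1.where(use_var1, diff_var2)

print(df)
```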

This might be slow, but it should solve the issue.

df['var3'] = 0
for row in df.itertuples():
    # Choose the difference based on var2, then write it back by index label.
    if row.var2 > 3:
        amt = row.var0 - row.var1
    else:
        amt = row.var0 - row.var2
    df.loc[row.Index, 'var3'] = amt
