I would like to create a new column in my dataframe that holds the difference between two columns, where the columns used depend on whether a third column in that row satisfies a certain condition.
A minimal example looks like this:
import pandas as pd

dict1 = [{'var0': 0, 'var1': 1, 'var2': 2},
         {'var0': 0, 'var1': 2, 'var2': 4},
         {'var0': 1, 'var1': 5, 'var2': 8},
         {'var0': 1, 'var1': 15, 'var2': 12}]
df = pd.DataFrame(dict1, index=['s1', 's2', 's3', 's4'])
In particular, I want the difference between var0 and var1 (var0 - var1) for all rows where var2 is larger than 3; otherwise I want the difference between var0 and var2 (var0 - var2).
My target output would be:
    var0  var1  var2  var3
s1     0     1     2    -2
s2     0     2     4    -2
s3     1     5     8    -4
s4     1    15    12   -14
You can do it in one line:
import numpy as np
df['var3'] = np.where(df['var2'] > 3, df['var0'] - df['var1'], df['var0'] - df['var2'])
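For reference, here is a self-contained sketch of the np.where approach run against the sample data from the question. The condition is evaluated element-wise, and each row takes its value from the second argument where the condition holds and from the third argument otherwise.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    [
        {'var0': 0, 'var1': 1, 'var2': 2},
        {'var0': 0, 'var1': 2, 'var2': 4},
        {'var0': 1, 'var1': 5, 'var2': 8},
        {'var0': 1, 'var1': 15, 'var2': 12},
    ],
    index=['s1', 's2', 's3', 's4'],
)

# Where var2 > 3 use var0 - var1, otherwise use var0 - var2.
df['var3'] = np.where(
    df['var2'] > 3,
    df['var0'] - df['var1'],
    df['var0'] - df['var2'],
)

print(df)
```

Both candidate differences are computed for every row before the selection happens, which is fine here because neither expression can fail on the rows where it is not used.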
This might do the trick:
# True for the rows where var2 is larger than 3
constraint = (df['var2'] > 3)
df.loc[constraint, 'var3'] = df['var0'] - df['var1']
df.loc[~constraint, 'var3'] = df['var0'] - df['var2']
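A pandas-only variant of the same masking idea, offered as a sketch rather than as the answerer's method: Series.where keeps the values where the mask is True and takes them from its second argument elsewhere, so the column is built in a single assignment. The name mask is just an illustrative choice.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    [
        {'var0': 0, 'var1': 1, 'var2': 2},
        {'var0': 0, 'var1': 2, 'var2': 4},
        {'var0': 1, 'var1': 5, 'var2': 8},
        {'var0': 1, 'var1': 15, 'var2': 12},
    ],
    index=['s1', 's2', 's3', 's4'],
)

# Boolean mask: True for the rows where var2 is larger than 3.
mask = df['var2'] > 3

# Keep var0 - var1 where the mask is True, fall back to var0 - var2 elsewhere.
df['var3'] = (df['var0'] - df['var1']).where(mask, df['var0'] - df['var2'])

print(df)
```

Because every row is filled in one step, the new column never passes through an intermediate state with missing values.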
This might be slow, but it should solve the issue.
df['var3'] = 0
# Walk the rows one by one and pick the difference based on var2
for i in df.itertuples():
    if i.var2 > 3:
        amt = i.var0 - i.var1
        df.loc[i.Index, 'var3'] = amt
    else:
        amt = i.var0 - i.var2
        df.loc[i.Index, 'var3'] = amt
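If you want to convince yourself that the row-by-row loop and a vectorized version agree before choosing between them, a rough check could look like the following. The frame big and the helpers var3_loop and var3_vectorized are hypothetical names made up for this sketch, and the random data is only a stand-in for a larger dataframe.

```python
import numpy as np
import pandas as pd

# Hypothetical larger frame with the same column names as the question.
rng = np.random.default_rng(0)
big = pd.DataFrame(
    rng.integers(0, 20, size=(1000, 3)),
    columns=['var0', 'var1', 'var2'],
)

def var3_loop(frame):
    """Row-by-row version, mirroring the itertuples loop."""
    out = []
    for row in frame.itertuples():
        if row.var2 > 3:
            out.append(row.var0 - row.var1)
        else:
            out.append(row.var0 - row.var2)
    return pd.Series(out, index=frame.index)

def var3_vectorized(frame):
    """Vectorized version using np.where."""
    values = np.where(
        frame['var2'] > 3,
        frame['var0'] - frame['var1'],
        frame['var0'] - frame['var2'],
    )
    return pd.Series(values, index=frame.index)

# Both versions should produce the same values on every row.
same = bool((var3_loop(big) == var3_vectorized(big)).all())
print(same)
```

Wrapping each call with time.perf_counter is one way to measure how the two compare on your own data.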