I need to create a new column in a csv called BTTS, which is based on two other columns, FTHG and FTAG. If FTHG & FTAG are both greater than zero, BTTS should be 1. Otherwise it should be zero.
What's the best way to do this in pandas / NumPy?
I'm not sure what the best way is, but here is one solution using pandas' loc method:
df.loc[((df['FTHG'] > 0) & (df['FTAG'] > 0)),'BTTS'] = 1
# Assign the result back instead of using inplace=True on a column selection
df['BTTS'] = df['BTTS'].fillna(0).astype(int)
Another solution using pandas apply method:
def check_greater_zero(row):
    # Use `and` for scalar values; `&` binds tighter than `>` and would break the comparison
    return 1 if row['FTHG'] > 0 and row['FTAG'] > 0 else 0
df['BTTS'] = df.apply(check_greater_zero, axis=1)
EDIT:
As stated in the comments, the first, vectorized, implementation is more efficient.
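For reference, the vectorized version can also be written without the separate fillna step by casting the boolean mask directly. This is only a minimal sketch: the sample DataFrame and the names both_scored and BTTS_np are made up for illustration, and the real df would come from your CSV.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the CSV contents
df = pd.DataFrame({"FTHG": [2, 0, 1, 0], "FTAG": [1, 3, 0, 0]})

# Boolean mask: True only where both teams scored at least once
both_scored = (df["FTHG"] > 0) & (df["FTAG"] > 0)

# Cast the mask to 0/1 integers in one step
df["BTTS"] = both_scored.astype(int)

# Equivalent NumPy form, written to a second column for comparison
df["BTTS_np"] = np.where(both_scored, 1, 0)
```

Both forms operate on whole columns at once, which is why they tend to be faster than a row-wise apply on larger files.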
I don't know if this is the best way to do it, but this works:
df['BTTS'] = [1 if x > 0 and y > 0 else 0 for x, y in zip(df['FTAG'], df['FTHG'])]