
Pandas DataFrame create new csv column based on two other columns

I need to create a new column in a csv called BTTS, which is based on two other columns, FTHG and FTAG. If FTHG & FTAG are both greater than zero, BTTS should be 1. Otherwise it should be zero.

What's the best way to do this in pandas / NumPy?

I'm not sure what the best way is, but here is one solution using the pandas loc method:

df.loc[((df['FTHG'] > 0) & (df['FTAG'] > 0)),'BTTS'] = 1
# Rows that did not match are NaN; fill them with 0 and restore an integer dtype
df['BTTS'] = df['BTTS'].fillna(0).astype(int)

Another solution, using the pandas apply method:

def check_greater_zero(row):
    # Use `and` here: `&` binds tighter than `>`, so `a > 0 & b > 0` is always False
    return 1 if row['FTHG'] > 0 and row['FTAG'] > 0 else 0

df['BTTS'] = df.apply(check_greater_zero, axis=1)

EDIT:

As stated in the comments, the first, vectorized, implementation is more efficient.
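To illustrate the vectorized route a bit further, here is a minimal sketch of two common variants: casting the boolean mask to integers, and NumPy's np.where. The sample scores are made up for illustration; only the column names FTHG, FTAG and BTTS come from the question.

```python
import numpy as np
import pandas as pd

# Made-up sample scores; only the column names come from the question.
df = pd.DataFrame({"FTHG": [2, 0, 1, 3], "FTAG": [1, 0, 0, 2]})

# Boolean mask: True where both teams scored at least once.
both_scored = (df["FTHG"] > 0) & (df["FTAG"] > 0)

# Variant 1: cast the boolean mask straight to integers (True -> 1, False -> 0).
df["BTTS"] = both_scored.astype(int)

# Variant 2: np.where picks 1 where the mask is True and 0 elsewhere.
btts_where = np.where(both_scored, 1, 0)

print(df)
```

Both variants avoid the intermediate NaN values that the loc approach has to fill afterwards, so the column is an integer column from the start.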

I don't know if this is the best way to do it, but this works :)

df['BTTS'] = [1 if x > 0 and y > 0 else 0 for x, y in zip(df['FTAG'], df['FTHG'])]
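Since the question is about a CSV file, the whole round trip could look roughly like the sketch below: read the file, add the column, write it back. An in-memory buffer stands in for the real file, and the rows as well as the HomeTeam and AwayTeam columns are invented for illustration.

```python
import io

import pandas as pd

# Made-up CSV content standing in for the real file on disk.
# HomeTeam and AwayTeam are hypothetical columns, not from the question.
csv_text = "HomeTeam,AwayTeam,FTHG,FTAG\nA,B,2,1\nC,D,0,3\nE,F,1,1\n"

# With a real file this would be pd.read_csv("some_file.csv").
df = pd.read_csv(io.StringIO(csv_text))

# 1 if both teams scored, 0 otherwise.
df["BTTS"] = ((df["FTHG"] > 0) & (df["FTAG"] > 0)).astype(int)

# With a real file this would be df.to_csv("some_file.csv", index=False).
out = io.StringIO()
df.to_csv(out, index=False)
result = out.getvalue()
print(result)
```

Passing index=False keeps pandas from writing its row index as an extra, unnamed column in the output file.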
