
How to create a new column in pandas dataframe based on a condition?

I have a data frame with the following columns:

import pandas as pd

d = {'find_no': [1, 2, 3], 'zip_code': [32351, 19207, 8723]}
df = pd.DataFrame(data=d)

When a value in the zip_code column has 5 digits, I want to return True. When it does not have 5 digits, I want to return that row's find_no. The results should go in a new column added to the dataframe, one value per row.

You could try np.where:

import numpy as np

df['result'] = np.where(df['zip_code'].astype(str).str.len() == 5, True, df['find_no'])

The only downside of this approach is that NumPy coerces both branches to a common dtype, so your True values become 1s in an integer column, which could be confusing. One way to keep the values looking the way you want is to cast both branches to strings:

import numpy as np

df['result'] = np.where(df['zip_code'].astype(str).str.len() == 5, 'True', df['find_no'].astype(str))

The downside here is that you lose the meaning of those values by casting them to strings. It all depends on what you're hoping to accomplish.
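If you want to keep a real boolean True next to the integer find_no values, one possible middle ground (not part of the original answer, just a sketch) is to cast find_no to object dtype before calling np.where, so NumPy has no numeric dtype to coerce to. The name has_five_digits is only an illustrative variable:

```python
import numpy as np
import pandas as pd

d = {'find_no': [1, 2, 3], 'zip_code': [32351, 19207, 8723]}
df = pd.DataFrame(data=d)

# Boolean mask: True where the zip code has exactly 5 digits.
has_five_digits = df['zip_code'].astype(str).str.len() == 5

# Casting find_no to object makes the result an object array,
# so True stays a boolean instead of being converted to 1.
df['result'] = np.where(has_five_digits, True, df['find_no'].astype(object))
```

With the sample data, the first two rows should hold True and the third row should hold 3, since 8723 has only four digits. The trade-off is that an object column is slower and less convenient for numeric operations than an integer or boolean column.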

