
Create a pandas dataframe column depending if a value is null or not

I have a data science project about a course students took in 2016. One column shows the date on which each student upgraded their course; if the course was not upgraded, the value is null. I want to create a new DataFrame containing only this upgraded column, with the values "Yes" or "No". The following code works, except that I get this warning: "SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame." Below are a sample dataset, the code, and the output I got. If someone can show me a more efficient way, with an explanation, that would be great.

import pandas as pd

registration = pd.DataFrame({'upgraded':['2016-08-12 19:42:07+00:00', '2016-08-14 11:51:21+00:00',
    '2016-07-22 17:24:59+00:00', None, None, '2016-07-12 10:33:02+00:00']})

upgraded_1 = registration[['upgraded']]
for i in range(len(upgraded_1['upgraded'])):
    if pd.isnull(upgraded_1['upgraded'][i]):
        upgraded_1['upgraded'][i] = "No"
    else:
        upgraded_1['upgraded'][i] = "Yes"

Output:

 upgraded_1
    0   Yes
    1   Yes
    2   Yes
    3   No
    4   No
    5   Yes

You can achieve this with the isna method and numpy.where (think of it as a vectorized if-then-else). This assumes numpy has been imported as np.

>>> pd.DataFrame(np.where(registration.isna(), 'No', 'Yes'))
     0
0  Yes
1  Yes
2  Yes
3   No
4   No
5  Yes
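As for the warning: it most likely comes from the chained assignment upgraded_1['upgraded'][i] = ..., which writes through an intermediate object taken from a frame that was itself sliced out of registration. A minimal sketch of one way around it, assuming you want to keep the column name upgraded and leave registration unchanged, is to build a fresh DataFrame from the numpy.where result:

```python
import numpy as np
import pandas as pd

registration = pd.DataFrame({'upgraded': ['2016-08-12 19:42:07+00:00', '2016-08-14 11:51:21+00:00',
    '2016-07-22 17:24:59+00:00', None, None, '2016-07-12 10:33:02+00:00']})

# Build a brand-new frame instead of assigning into a slice of `registration`.
# np.where picks 'No' where the value is missing and 'Yes' everywhere else.
upgraded_1 = pd.DataFrame(
    {'upgraded': np.where(registration['upgraded'].isna(), 'No', 'Yes')},
    index=registration.index,
)

print(upgraded_1)
```

Because nothing is assigned element by element, there is no Python-level loop and no write into a copy, so the warning should not appear.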
