
How do I create a function that replaces 0s with NaN?

How do I create a function that can replace 0.0 with NaN, remove underscores, and convert clean strings to a float datatype, or otherwise return the converted data?

So far I have tried the following:

def score_cleaner(underscored): 
    if underscored == '_000':
         return np.NaN
           
long_data['Numeric Score']= long_data['Score'].apply(lambda x:(float(x.replace('_',''))))

long_data ['Numeric Score']= long_data ['Score'].apply(score_cleaner) 

However, this produces either a column of all NaNs or a column of all numeric values, rather than a combination of the two in which the 0.0s are converted to NaN and the rest of the data is left alone:

     PID_Sex  PID_Age  Manipulation  Score  Face ID  Condition    Numeric Score
103  Female   18       Symmetry      _005   101      Manipulated  NaN
106  Female   19       Symmetry      _000   101      Manipulated  NaN
106  Male     22       Symmetry      _000   101      Manipulated  NaN
109  Male     20       Symmetry      _000   101      Manipulated  NaN
112  Female   18       Symmetry      _000   101      Manipulated  NaN
115  Female   18       Symmetry      _000   101      Manipulated  NaN
118  Female   19       Symmetry      _003   101      Manipulated  NaN
121  Female   18       Symmetry      _000   101      Manipulated  NaN
124  Female   19       Symmetry      _004   101      Manipulated  NaN
127  Female   19       Symmetry      _005   101      Manipulated  NaN

     PID_Sex  PID_Age  Manipulation  Score  Face ID  Condition    Numeric Score
103  Female   18       Symmetry      _005   101      Manipulated  5.0
106  Female   19       Symmetry      _000   101      Manipulated  0.0
106  Male     22       Symmetry      _000   101      Manipulated  0.0
109  Male     20       Symmetry      _000   101      Manipulated  0.0
112  Female   18       Symmetry      _000   101      Manipulated  0.0
115  Female   18       Symmetry      _000   101      Manipulated  0.0
118  Female   19       Symmetry      _003   101      Manipulated  3.0
121  Female   18       Symmetry      _000   101      Manipulated  0.0
124  Female   19       Symmetry      _004   101      Manipulated  4.0
127  Female   19       Symmetry      _005   101      Manipulated  5.0

I don't exactly get what you want, so here are the two most likely options:


Option 1

Convert the column to float type with '_000' being converted to np.nan and the rest to numeric values:

long_data['Numeric Score'] = long_data['Score'].str.replace('_', '').astype(float).replace(0., np.nan)

or as a function definition:

def score_cleaner(underscore):
    return underscore.str.replace(
        '_', '').astype(float).replace(0., np.nan)

long_data['Numeric Score'] = score_cleaner(long_data['Score'])
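To see what Option 1 produces, here is a minimal sketch on a small made-up frame. The `sample` DataFrame is hypothetical and only mirrors the 'Score' values from the question; `regex=False` is added so the underscore is treated as a literal string regardless of the pandas version.

```python
import numpy as np
import pandas as pd


def score_cleaner(scores: pd.Series) -> pd.Series:
    """Strip underscores, convert to float, then turn 0.0 into NaN."""
    cleaned = scores.str.replace('_', '', regex=False).astype(float)
    return cleaned.replace(0.0, np.nan)


# Hypothetical sample data, not the asker's real frame.
sample = pd.DataFrame({'Score': ['_005', '_000', '_003', '_004']})
sample['Numeric Score'] = score_cleaner(sample['Score'])
print(sample)
```

The new column has a float dtype, so the missing values are real NaNs that `isna()`, `dropna()` and `mean()` treat as missing.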

Option 2

Convert the column to object type with '_000' being converted to the string 'NaN' and leave the rest as it is:

long_data['Numeric Score'] = long_data['Score'].str.replace('_000', 'NaN')

and again defined as a function:

def score_cleaner(underscore):
    return underscore.str.replace('_000', 'NaN')

long_data['Numeric Score'] = score_cleaner(long_data['Score'])
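One caveat with Option 2, sketched below on a hypothetical `scores` Series: the replacement is the text 'NaN', not a missing value, so pandas does not count it as missing and the other entries keep their underscores.

```python
import pandas as pd

# Hypothetical values mirroring the 'Score' column in the question.
scores = pd.Series(['_005', '_000', '_003'])
as_text = scores.str.replace('_000', 'NaN', regex=False)

# The column still holds strings; 'NaN' here is ordinary text.
print(as_text.tolist())
print(as_text.isna().sum())
```

If the column is meant for numeric work later on, Option 1 is probably the better fit.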

You can use this:

df['Numeric Score'] = df['Score'].apply(lambda x:(float(x.replace('_',''))))
df.loc[df['Score'] == '_000', 'Numeric Score'] = np.nan

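To create a function, you could use this:

def score_cleaner(underscored):
    # '_000' is the zero score and becomes a missing value
    if underscored == '_000':
        return np.nan
    else:
        # every other score: drop the underscore and convert to float
        return float(underscored.replace('_', ''))

long_data['Numeric Score'] = long_data['Score'].map(score_cleaner)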
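For context, the function in the question has no return statement for values other than '_000', so Python returns None for them, and its result was assigned to the same column as the earlier lambda line, overwriting it. The sketch below is one possible, more defensive variant. It rests on two assumptions that are not in the question: values that are not strings are passed through unchanged, and strings that cannot be parsed become NaN.

```python
import numpy as np
import pandas as pd


def score_cleaner(value):
    """Return a float for a clean score string, NaN for zero or unparseable text."""
    if not isinstance(value, str):
        # Assumption: non-string values are already converted, so return them as-is.
        return value
    try:
        number = float(value.replace('_', ''))
    except ValueError:
        # Assumption: text that is not a number is treated as missing.
        return np.nan
    return np.nan if number == 0 else number


# Hypothetical sample data, not the asker's real frame.
sample = pd.DataFrame({'Score': ['_005', '_000', 'n/a', '_003']})
sample['Numeric Score'] = sample['Score'].map(score_cleaner)
print(sample)
```

Comparing the parsed number with 0 rather than matching the literal '_000' also catches other spellings of zero, such as '_0' or '0.0'.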

