How do I create a function that can replace 0's with NaN?

How do I create a function that removes underscores, converts the cleaned strings to float, replaces 0.0 with NaN, and otherwise returns the converted data?

So far I have tried the following:

def score_cleaner(underscored):
    if underscored == '_000':
        return np.NaN

long_data['Numeric Score'] = long_data['Score'].apply(lambda x: float(x.replace('_', '')))

long_data['Numeric Score'] = long_data['Score'].apply(score_cleaner)

However, this results in either a column of nothing but NaNs or a column of all the numeric values, rather than a combination of the two in which 0.0 is converted to NaN and the rest of the data is left alone:

     PID_Sex  PID_Age  Manipulation  Score  Face ID  Condition    Numeric Score
103  Female   18       Symmetry      _005   101      Manipulated  NaN
106  Female   19       Symmetry      _000   101      Manipulated  NaN
106  Male     22       Symmetry      _000   101      Manipulated  NaN
109  Male     20       Symmetry      _000   101      Manipulated  NaN
112  Female   18       Symmetry      _000   101      Manipulated  NaN
115  Female   18       Symmetry      _000   101      Manipulated  NaN
118  Female   19       Symmetry      _003   101      Manipulated  NaN
121  Female   18       Symmetry      _000   101      Manipulated  NaN
124  Female   19       Symmetry      _004   101      Manipulated  NaN
127  Female   19       Symmetry      _005   101      Manipulated  NaN

     PID_Sex  PID_Age  Manipulation  Score  Face ID  Condition    Numeric Score
103  Female   18       Symmetry      _005   101      Manipulated  5.0
106  Female   19       Symmetry      _000   101      Manipulated  0.0
106  Male     22       Symmetry      _000   101      Manipulated  0.0
109  Male     20       Symmetry      _000   101      Manipulated  0.0
112  Female   18       Symmetry      _000   101      Manipulated  0.0
115  Female   18       Symmetry      _000   101      Manipulated  0.0
118  Female   19       Symmetry      _003   101      Manipulated  3.0
121  Female   18       Symmetry      _000   101      Manipulated  0.0
124  Female   19       Symmetry      _004   101      Manipulated  4.0
127  Female   19       Symmetry      _005   101      Manipulated  5.0

I'm not entirely sure what you want, so here are the two most likely options:


Option 1

Convert the column to float type, with '_000' converted to np.nan and the rest converted to numeric values:

long_data['Numeric Score'] = long_data['Score'].str.replace('_', '').astype(float).replace(0., np.nan)

Or as a function definition:

def score_cleaner(underscore):
    return underscore.str.replace(
        '_', '').astype(float).replace(0., np.nan)

long_data['Numeric Score'] = score_cleaner(long_data['Score'])
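As a rough, self-contained sketch of Option 1, the function can be tried on a small made-up frame. The sample values below are an assumption that only mirrors the 'Score' column shown in the question, and regex=False is added so the underscore is treated as a literal character:

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the 'Score' column from the question.
long_data = pd.DataFrame({'Score': ['_005', '_000', '_003', '_000', '_004']})

def score_cleaner(scores: pd.Series) -> pd.Series:
    """Strip underscores, cast to float, and turn 0.0 into NaN."""
    cleaned = scores.str.replace('_', '', regex=False).astype(float)
    return cleaned.replace(0.0, np.nan)

long_data['Numeric Score'] = score_cleaner(long_data['Score'])
print(long_data)
```

Because the function works on the whole Series at once, it is called directly on the column rather than passed to apply.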

Option 2

Keep the column as object type, converting '_000' to the string 'NaN' and leaving the rest as it is:

long_data['Numeric Score'] = long_data['Score'].str.replace('_000', 'NaN')

And again defined as a function:

def score_cleaner(underscore):
    return underscore.str.replace('_000', 'NaN')

long_data['Numeric Score'] = score_cleaner(long_data['Score'])
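One thing worth checking with Option 2 is that the result holds the text 'NaN', not a missing value, so pandas will not treat it as missing. A small sketch on made-up sample values (an assumption, not the real data) shows the difference:

```python
import pandas as pd

scores = pd.Series(['_005', '_000', '_003'])  # hypothetical sample
as_text = scores.str.replace('_000', 'NaN', regex=False)

print(as_text.tolist())        # ['_005', 'NaN', '_003']
print(as_text.isna().any())    # False: 'NaN' here is ordinary text
```

If the column is later needed for arithmetic or for dropna, Option 1 is probably the better fit.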

You can use this:

import numpy as np
df['Numeric Score'] = df['Score'].apply(lambda x: float(x.replace('_', '')))
# Use .loc rather than chained indexing so the assignment reliably hits df
df.loc[df['Score'] == '_000', 'Numeric Score'] = np.nan

To create a function, you could use this:

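def score_cleaner(underscored):
    # '_000' marks a zero score, which should become a missing value
    if underscored == '_000':
        return np.nan  # np.NaN was removed in NumPy 2.0
    else:
        return float(underscored.replace('_', ''))

long_data['Numeric Score'] = long_data['Score'].map(score_cleaner)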
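For context on why the attempt in the question produced only NaN: its score_cleaner had no else branch, so every value other than '_000' fell through and returned None, which pandas also counts as missing. The two assignments in the question also wrote to the same column one after the other, so whichever ran last replaced the other. The sketch below contrasts the two versions on made-up values; the names incomplete_cleaner and complete_cleaner are hypothetical and used only for illustration:

```python
import numpy as np
import pandas as pd

scores = pd.Series(['_005', '_000', '_003'])  # hypothetical sample

def incomplete_cleaner(value):
    # No else branch: every value other than '_000' falls through
    # and the function implicitly returns None.
    if value == '_000':
        return np.nan

def complete_cleaner(value):
    # Handles both cases in a single pass over the column.
    if value == '_000':
        return np.nan
    return float(value.replace('_', ''))

broken = scores.map(incomplete_cleaner)
fixed = scores.map(complete_cleaner)

print(broken.isna().all())   # True: every entry is missing
print(fixed.tolist())
```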
