How do I create a function that replaces 0s with NaN?
How do I create a function that replaces 0.0 with NaN, removes underscores, and converts clean strings to a float datatype, or otherwise returns the converted data?
So far I have tried the following:
def score_cleaner(underscored):
    if underscored == '_000':
        return np.NaN

long_data['Numeric Score'] = long_data['Score'].apply(lambda x: float(x.replace('_', '')))
long_data['Numeric Score'] = long_data['Score'].apply(score_cleaner)
However, this results in either a column of nothing but NaNs or a column of all the numeric values, rather than a combination of the two in which the 0.0s are converted to NaN and the rest of the data is left alone:
     PID_Sex  PID_Age  Manipulation  Score  Face ID  Condition    Numeric Score
103  Female   18       Symmetry      _005   101      Manipulated  NaN
106  Female   19       Symmetry      _000   101      Manipulated  NaN
106  Male     22       Symmetry      _000   101      Manipulated  NaN
109  Male     20       Symmetry      _000   101      Manipulated  NaN
112  Female   18       Symmetry      _000   101      Manipulated  NaN
115  Female   18       Symmetry      _000   101      Manipulated  NaN
118  Female   19       Symmetry      _003   101      Manipulated  NaN
121  Female   18       Symmetry      _000   101      Manipulated  NaN
124  Female   19       Symmetry      _004   101      Manipulated  NaN
127  Female   19       Symmetry      _005   101      Manipulated  NaN
     PID_Sex  PID_Age  Manipulation  Score  Face ID  Condition    Numeric Score
103  Female   18       Symmetry      _005   101      Manipulated  5.0
106  Female   19       Symmetry      _000   101      Manipulated  0.0
106  Male     22       Symmetry      _000   101      Manipulated  0.0
109  Male     20       Symmetry      _000   101      Manipulated  0.0
112  Female   18       Symmetry      _000   101      Manipulated  0.0
115  Female   18       Symmetry      _000   101      Manipulated  0.0
118  Female   19       Symmetry      _003   101      Manipulated  3.0
121  Female   18       Symmetry      _000   101      Manipulated  0.0
124  Female   19       Symmetry      _004   101      Manipulated  4.0
127  Female   19       Symmetry      _005   101      Manipulated  5.0
I'm not entirely sure what you want, so here are the two most likely options:
Convert the column to float type, with '_000' converted to np.nan and the rest converted to numeric values:
long_data['Numeric Score'] = long_data['Score'].str.replace('_', '').astype(float).replace(0., np.nan)
or as a function definition:
def score_cleaner(underscore):
    return underscore.str.replace('_', '').astype(float).replace(0., np.nan)

long_data['Numeric Score'] = score_cleaner(long_data['Score'])
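If the column might also hold entries that are not numeric text, a slightly more defensive variant of the same idea could look like the sketch below. The sample frame and the 'n/a' entry are made up for illustration, and using pd.to_numeric with errors="coerce" is my own substitution for astype(float), not something the question asked for:

```python
import numpy as np
import pandas as pd

def score_cleaner(scores: pd.Series) -> pd.Series:
    """Strip underscores, convert to float, and turn zeros into NaN."""
    stripped = scores.astype(str).str.replace("_", "", regex=False)
    # errors="coerce" turns anything that cannot be parsed as a number into NaN
    numeric = pd.to_numeric(stripped, errors="coerce").astype(float)
    return numeric.replace(0.0, np.nan)

# Hypothetical sample data, not the asker's real frame
long_data = pd.DataFrame({"Score": ["_005", "_000", "_003", "n/a"]})
long_data["Numeric Score"] = score_cleaner(long_data["Score"])
print(long_data)
```

The trade-off is that unparseable entries silently become NaN as well, so they can no longer be told apart from genuine zero scores.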
Convert the column to object type, with '_000' converted to the string 'NaN', and leave the rest as it is:
long_data['Numeric Score'] = long_data['Score'].str.replace('_000', 'NaN')
and again defined as a function:
def score_cleaner(underscore):
    return underscore.str.replace('_000', 'NaN')

long_data['Numeric Score'] = score_cleaner(long_data['Score'])
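One thing to keep in mind with this second option: the text 'NaN' is an ordinary string, so pandas does not treat it as a missing value. A small sketch on made-up sample data (the Series below is hypothetical) shows the difference:

```python
import pandas as pd

scores = pd.Series(["_005", "_000", "_003"])  # hypothetical sample

# Option 2: the result is still text, and 'NaN' is just another string
as_text = scores.str.replace("_000", "NaN", regex=False)
print(as_text.isna().sum())   # counts real missing values only

# Converting afterwards still works, because float('NaN') is a valid float
as_float = as_text.str.replace("_", "", regex=False).astype(float)
print(as_float.isna().sum())
```

So if you plan to call isna(), dropna() or mean() on the column later, the float version from the first option is probably the one you want.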
You can use this:
df['Numeric Score'] = df['Score'].apply(lambda x: float(x.replace('_', '')))
# Use .loc rather than chained indexing, which may assign to a copy
df.loc[df['Score'] == '_000', 'Numeric Score'] = np.nan
To create a function you could use this:
def score_cleaner(underscored):
    if underscored == '_000':
        return np.nan
    else:
        return float(underscored.replace('_', ''))

long_data['Numeric Score'] = long_data['Score'].map(score_cleaner)
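For what it's worth, the version in the question produced only NaNs because the function had no return for anything other than '_000', so every other value came back as None, and the second assignment then overwrote the column the lambda had just filled. If you also want the "otherwise return the data" behaviour from the question, a sketch along these lines might do. The sample Series is made up, and handing non-numeric text back unchanged is my reading of what was asked:

```python
import numpy as np
import pandas as pd

def score_cleaner(value):
    """Return NaN for zero scores, a float for clean strings, else the input."""
    if isinstance(value, str):
        try:
            value = float(value.replace("_", ""))
        except ValueError:
            return value  # not numeric text: hand it back unchanged
    # Zero (from '_000', '0.0' or a numeric 0) becomes NaN
    if isinstance(value, (int, float)) and value == 0:
        return np.nan
    return value

# Hypothetical sample mixing clean strings, zeros, a float and junk text
scores = pd.Series(["_005", "_000", "0.0", 3.0, "abc"])
cleaned = scores.map(score_cleaner)
print(cleaned)
```

Because it works one element at a time, this will be slower than the vectorised .str.replace(...).astype(float) approach on large frames, but it copes with mixed content.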