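How do I create a function that can replace 0's with NaN?
How can I create a function that replaces (0.0) with NaN, removes the underscore, and converts the cleaned string to a float data type, or otherwise returns the converted data?
So far I have tried the following: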
def score_cleaner(underscored):
    if underscored == '_000':
        return np.NaN
long_data['Numeric Score'] = long_data['Score'].apply(lambda x: float(x.replace('_', '')))
long_data['Numeric Score'] = long_data['Score'].apply(score_cleaner)
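However, this produces either an endless output of "NaN" or all numeric values, rather than a combination of the two in which 0.0 is converted to NaN and the rest of the data is left alone: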
     PID_Sex  PID_Age  Manipulation  Score  Face ID    Condition  Numeric Score
103   Female       18      Symmetry   _005      101  Manipulated            NaN
106   Female       19      Symmetry   _000      101  Manipulated            NaN
106     Male       22      Symmetry   _000      101  Manipulated            NaN
109     Male       20      Symmetry   _000      101  Manipulated            NaN
112   Female       18      Symmetry   _000      101  Manipulated            NaN
115   Female       18      Symmetry   _000      101  Manipulated            NaN
118   Female       19      Symmetry   _003      101  Manipulated            NaN
121   Female       18      Symmetry   _000      101  Manipulated            NaN
124   Female       19      Symmetry   _004      101  Manipulated            NaN
127   Female       19      Symmetry   _005      101  Manipulated            NaN

     PID_Sex  PID_Age  Manipulation  Score  Face ID    Condition  Numeric Score
103   Female       18      Symmetry   _005      101  Manipulated            5.0
106   Female       19      Symmetry   _000      101  Manipulated            0.0
106     Male       22      Symmetry   _000      101  Manipulated            0.0
109     Male       20      Symmetry   _000      101  Manipulated            0.0
112   Female       18      Symmetry   _000      101  Manipulated            0.0
115   Female       18      Symmetry   _000      101  Manipulated            0.0
118   Female       19      Symmetry   _003      101  Manipulated            3.0
121   Female       18      Symmetry   _000      101  Manipulated            0.0
124   Female       19      Symmetry   _004      101  Manipulated            4.0
127   Female       19      Symmetry   _005      101  Manipulated            5.0
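I don't fully understand what you want, so here are the two most likely options:
Convert the column to float type, convert '_000' to np.nan, and convert the rest to numeric values: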
long_data['Numeric Score'] = long_data['Score'].str.replace('_', '').astype(float).replace(0., np.nan)
Or defined as a function:
def score_cleaner(underscore):
    return underscore.str.replace('_', '').astype(float).replace(0., np.nan)
long_data['Numeric Score'] = score_cleaner(long_data['Score'])
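To check the first option without the full dataset, here is a minimal sketch. The `scores` Series is a hypothetical stand-in for long_data['Score'], and `regex=False` is added only so the underscore is treated as a literal string in every pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for long_data['Score']
scores = pd.Series(['_005', '_000', '_003', '_000', '_004'], name='Score')

def score_cleaner(underscore):
    # Strip the underscore, cast to float, then turn exact zeros into NaN
    cleaned = underscore.str.replace('_', '', regex=False).astype(float)
    return cleaned.replace(0.0, np.nan)

numeric = score_cleaner(scores)
print(numeric.tolist())  # [5.0, nan, 3.0, nan, 4.0]
```

Because the whole column is processed at once, no apply or lambda is needed.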
Convert the column to object type, convert '_000' to the string 'NaN', and leave the rest as is:
long_data['Numeric Score'] = long_data['Score'].str.replace('_000', 'NaN')
And again defined as a function:
def score_cleaner(underscore):
    return underscore.str.replace('_000', 'NaN')
long_data['Numeric Score'] = score_cleaner(long_data['Score'])
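One practical difference between the two options is worth seeing: the string 'NaN' is ordinary text, so pandas does not count it as a missing value. A small sketch, again using a hypothetical `scores` sample rather than the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for long_data['Score']
scores = pd.Series(['_005', '_000', '_003'])

# Option 2: values stay strings, '_000' becomes the text 'NaN'
as_text = scores.str.replace('_000', 'NaN', regex=False)

# Option 1: values become floats, zeros become a real NaN
as_float = scores.str.replace('_', '', regex=False).astype(float).replace(0.0, np.nan)

text_missing = as_text.isna().sum()    # the string 'NaN' is not counted
float_missing = as_float.isna().sum()  # the real NaN is counted
print(text_missing, float_missing)     # 0 1
```

If the column will later be averaged or passed through dropna(), the float version is probably the one you want.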
You can use this:
df['Numeric Score'] = df['Score'].apply(lambda x: float(x.replace('_', '')))
df.loc[df['Score'] == '_000', 'Numeric Score'] = np.nan
To create a function, you can do this:
def score_cleaner(underscored):
    if underscored == '_000':
        return np.nan
    else:
        return float(underscored.replace('_', ''))
long_data['Numeric Score'] = long_data['Score'].map(score_cleaner)
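The else branch is what the attempt in the question was missing: a Python function with no matching return gives back None, which pandas treats as missing, so every row ends up as NaN. A minimal sketch of the difference, using a hypothetical `scores` sample and the illustrative names `no_else` and `with_else`:

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for long_data['Score']
scores = pd.Series(['_005', '_000', '_003'])

def no_else(underscored):
    # Mirrors the attempt in the question: other values fall through,
    # so Python implicitly returns None for them
    if underscored == '_000':
        return np.nan

def with_else(underscored):
    if underscored == '_000':
        return np.nan
    return float(underscored.replace('_', ''))

all_missing = scores.map(no_else).isna().all()  # every row is missing
cleaned = scores.map(with_else)                 # only '_000' is missing
print(all_missing, cleaned.tolist())
```

In the question, the second assignment to 'Numeric Score' also overwrote the first, which is why the output was either all NaN or all numbers depending on which line ran last.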