
How do I create a function that replaces 0s with NaN?

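How do I create a function that replaces zero scores (0.0) with NaN, removes the underscores, and converts the cleaned strings to a float data type, or otherwise returns the converted data?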

So far I have tried the following:

def score_cleaner(underscored): 
    if underscored == '_000':
         return np.NaN
           
long_data['Numeric Score']= long_data['Score'].apply(lambda x:(float(x.replace('_',''))))

long_data ['Numeric Score']= long_data ['Score'].apply(score_cleaner) 

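However, this produces either an endless run of NaN or all numeric values, rather than the combination I want, where 0.0 is converted to NaN and the rest of the data is left alone: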

     PID_Sex  PID_Age  Manipulation  Score  Face ID  Condition    Numeric Score
103  Female   18       Symmetry      _005   101      Manipulated  NaN
106  Female   19       Symmetry      _000   101      Manipulated  NaN
106  Male     22       Symmetry      _000   101      Manipulated  NaN
109  Male     20       Symmetry      _000   101      Manipulated  NaN
112  Female   18       Symmetry      _000   101      Manipulated  NaN
115  Female   18       Symmetry      _000   101      Manipulated  NaN
118  Female   19       Symmetry      _003   101      Manipulated  NaN
121  Female   18       Symmetry      _000   101      Manipulated  NaN
124  Female   19       Symmetry      _004   101      Manipulated  NaN
127  Female   19       Symmetry      _005   101      Manipulated  NaN

     PID_Sex  PID_Age  Manipulation  Score  Face ID  Condition    Numeric Score
103  Female   18       Symmetry      _005   101      Manipulated  5.0
106  Female   19       Symmetry      _000   101      Manipulated  0.0
106  Male     22       Symmetry      _000   101      Manipulated  0.0
109  Male     20       Symmetry      _000   101      Manipulated  0.0
112  Female   18       Symmetry      _000   101      Manipulated  0.0
115  Female   18       Symmetry      _000   101      Manipulated  0.0
118  Female   19       Symmetry      _003   101      Manipulated  3.0
121  Female   18       Symmetry      _000   101      Manipulated  0.0
124  Female   19       Symmetry      _004   101      Manipulated  4.0
127  Female   19       Symmetry      _005   101      Manipulated  5.0

I don't fully understand what you want, so here are the two most likely options:


Option 1

Convert the column to float, turning '_000' into np.nan and the rest into numeric values:

long_data['Numeric Score'] = long_data['Score'].str.replace('_', '').astype(float).replace(0., np.nan)

Or defined as a function:

def score_cleaner(underscore):
    return underscore.str.replace(
        '_', '').astype(float).replace(0., np.nan)

long_data['Numeric Score'] = score_cleaner(long_data['Score'])
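
To see what Option 1 produces, here is a minimal sketch on a small made-up Series. The `scores` data is a hypothetical stand-in for `long_data['Score']`, and `regex=False` is added only to make the literal-string replacement explicit; it is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for long_data['Score'].
scores = pd.Series(['_005', '_000', '_003', '_000'], name='Score')

def score_cleaner(underscore):
    # Strip the underscores and cast the remaining digits to float.
    cleaned = underscore.str.replace('_', '', regex=False).astype(float)
    # Turn exact zeros into missing values; other numbers are untouched.
    return cleaned.replace(0.0, np.nan)

numeric = score_cleaner(scores)
print(numeric.tolist())  # [5.0, nan, 3.0, nan]
```

The result is a float column, so NaN here is a real missing value that `isna()`, `dropna()`, and `mean()` all treat as missing.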

Option 2

Keep the column as object type, turning '_000' into the string 'NaN' and leaving the rest as-is:

long_data['Numeric Score'] = long_data['Score'].str.replace('_000', 'NaN')

And again defined as a function:

def score_cleaner(underscore):
    return underscore.str.replace('_000', 'NaN')

long_data['Numeric Score'] = score_cleaner(long_data['Score'])
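
One caveat with Option 2 is worth illustrating: the text 'NaN' is an ordinary string, not a missing value. The sketch below uses a hypothetical `scores` Series in place of `long_data['Score']` and adds `regex=False` to force a literal match.

```python
import pandas as pd

# Hypothetical sample standing in for long_data['Score'].
scores = pd.Series(['_005', '_000', '_003'], name='Score')

# Replace the literal text '_000' with the text 'NaN'; values stay strings.
as_text = scores.str.replace('_000', 'NaN', regex=False)

print(as_text.tolist())         # ['_005', 'NaN', '_003']
print(as_text.isna().tolist())  # [False, False, False]
```

Because nothing is actually missing, numeric operations and `dropna()` will not behave as they would with Option 1.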

You can use this:

df['Numeric Score'] = df['Score'].apply(lambda x: float(x.replace('_', '')))
df.loc[df['Score'] == '_000', 'Numeric Score'] = np.nan

To create a function, you can do this:

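def score_cleaner(underscored):
    # '_000' is a zero score and should become a missing value
    if underscored == '_000':
        return np.nan
    else:
        # every other score: drop the underscore and convert to float
        return float(underscored.replace('_', ''))

long_data['Numeric Score'] = long_data['Score'].map(score_cleaner)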
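
A likely reason the attempt in the question showed no numbers at all is that a function with no `else` branch returns `None` for every score other than '_000', and pandas treats `None` as missing. The sketch below compares the two versions on a hypothetical `scores` Series; the names `no_else` and `with_else` are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for long_data['Score'].
scores = pd.Series(['_005', '_000', '_003'], name='Score')

def no_else(underscored):
    if underscored == '_000':
        return np.nan
    # No return here, so every other value implicitly yields None.

def with_else(underscored):
    if underscored == '_000':
        return np.nan
    return float(underscored.replace('_', ''))

broken = scores.map(no_else)
fixed = scores.map(with_else)

print(broken.isna().all())  # True: every row is missing
print(fixed.tolist())       # [5.0, nan, 3.0]
```

In the question, the second assignment to `long_data['Numeric Score']` also overwrote the result of the first, so only the output of the incomplete function remained.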


Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish this content, please credit this site's URL or the address of the original post.

 