
How to assign value to multi-value in pandas

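I want to replace every zero value in SkinThickness with the mean for patients in the same Age range.

So I grouped the DataFrame by Age to get the mean SkinThickness for each age range.

The goal is to assign each zero value in the SkinThickness column the corresponding mean computed from the age grouping.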

ageSkinMean = df_clean.groupby("Age_Class")["SkinThickness"].mean()
>>> ageSkinMean

Age_Class
21-22 years     82.163399
23-25 years    103.171429
26-30 years     91.170254
31-38 years     80.133028
39-47 years     73.685851
48-58 years     89.130233
60+ years       40.899160
Name: Insulin, dtype: float64
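
Because the result is a Series indexed by the Age_Class labels, each row's mean can be looked up by label rather than by position. A minimal sketch of that lookup, assuming a small made-up frame (the `demo` data and the names `class_mean`, `row_mean` and `zero_mask` are hypothetical, not from the original post):

```python
import pandas as pd

# Hypothetical toy data standing in for df_clean
demo = pd.DataFrame({
    "Age_Class": ["21-22 years", "21-22 years", "60+ years", "60+ years"],
    "SkinThickness": [20.0, 0.0, 30.0, 0.0],
})

# Per-class means, indexed by the Age_Class labels
class_mean = demo.groupby("Age_Class")["SkinThickness"].mean()

# Series.map looks up each row's label in the index of class_mean
row_mean = demo["Age_Class"].map(class_mean)

# Overwrite only the zero entries with the looked-up mean
zero_mask = demo["SkinThickness"] == 0
demo.loc[zero_mask, "SkinThickness"] = row_mean[zero_mask]
print(demo)
```

Note that the zeros themselves are part of each class mean here, exactly as in the groupby call above.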

Currently, the code I am running is inefficient... using iterrows() takes a long time.

import time

start = time.time()
# Loop over every row whose SkinThickness is 0 (slow on large frames)
for i, val in df_clean[df_clean.SkinThickness == 0].iterrows():
    # val.iloc[7] is the eighth column, presumably the patient's age
    if val.iloc[7] < 22:
        df_clean.loc[i, "SkinThickness"] = ageSkinMean.iloc[0]
    elif val.iloc[7] < 25:
        df_clean.loc[i, "SkinThickness"] = ageSkinMean.iloc[1]
    elif val.iloc[7] < 30:
        df_clean.loc[i, "SkinThickness"] = ageSkinMean.iloc[2]
    elif val.iloc[7] < 38:
        df_clean.loc[i, "SkinThickness"] = ageSkinMean.iloc[3]
    elif val.iloc[7] < 47:
        df_clean.loc[i, "SkinThickness"] = ageSkinMean.iloc[4]
    elif val.iloc[7] < 58:
        df_clean.loc[i, "SkinThickness"] = ageSkinMean.iloc[5]
    else:
        df_clean.loc[i, "SkinThickness"] = ageSkinMean.iloc[6]
print(time.time() - start)

I would like to know whether pandas offers an optimization for this kind of code block that would make it run faster.

You can use the pandas transform function to replace the zero SkinThickness values with the mean:

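    # Mean SkinThickness per age class (zeros are included in this mean)
    age_skin_thickness_mean = df_clean.groupby('Age_Class')['SkinThickness'].mean()

    def replace_with_mean_thickness(row):
        # Look up the mean for this row's age class
        row['SkinThickness'] = age_skin_thickness_mean[row['Age_Class']]
        return row

    # Apply the replacement only to rows where SkinThickness is 0
    zero_mask = df_clean['SkinThickness'] == 0
    df_clean.loc[zero_mask] = df_clean.loc[zero_mask].transform(replace_with_mean_thickness, axis=1)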

Now every row in df_clean that had SkinThickness == 0 has its SkinThickness set to the mean of its age group.
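
The same replacement can also be written without a row-wise function. The sketch below is an alternative, not the method from the answer above: it uses `groupby(...).transform("mean")` and, as an assumption about what is wanted, treats zeros as missing so that they do not pull the group mean down. The `demo` frame and the names `thickness` and `group_mean` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for df_clean
demo = pd.DataFrame({
    "Age_Class": ["21-22 years", "21-22 years", "21-22 years",
                  "60+ years", "60+ years"],
    "SkinThickness": [20.0, 30.0, 0.0, 40.0, 0.0],
})

# Assumption: a zero means "not measured", so mark it as NaN
thickness = demo["SkinThickness"].replace(0, np.nan)

# transform("mean") returns one value per row: the mean of that row's group.
# NaN entries are skipped when the mean is computed.
group_mean = thickness.groupby(demo["Age_Class"]).transform("mean")

# Fill only the missing entries with their group's mean
demo["SkinThickness"] = thickness.fillna(group_mean)
print(demo)
```

If the zeros should count toward the mean, as they do in the code above, skip the `replace` step and use `Series.mask` on the zero rows instead.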
