Imputing missing values in Python

Question

I want to impute a couple of columns in my data frame using Scikit-Learn's SimpleImputer. I tried the code below, but with no luck. How should I modify my code? a, b, and e are the columns in my data frame that I want to impute.

My data frame:

    a   b   c   d      e
    NA  39  cat gray   20
    5   NA  dog brown  NA
    7   53  cat tan    33
    NA  NA  cat black  41
    4   24  dog tan    NA

My code:

from sklearn.impute import SimpleImputer

miss_mean_imputer = SimpleImputer(missing_values='NaN', strategy='mean', axis=0)

miss_mean_imputer = miss_mean_imputer.fit(df["a", "b", "e"])

imputed_df = miss_mean_imputer.transform(df.values)

print(imputed_df)

Answer 1

When instantiating the imputer, replace missing_values='NaN' with missing_values=np.nan. SimpleImputer does not accept an axis argument (it always imputes column-wise), so drop axis=0. Select several columns with a list, df[['a', 'b', 'e']], rather than df["a", "b", "e"]. Finally, make sure the imputer transforms the same columns it was fitted on, not df.values; see the code below.

import pandas as pd
import numpy as np
from sklearn.impute import SimpleImputer

df = pd.DataFrame({
 'a': [np.nan, 5.0, 7.0, np.nan, 4.0],
 'b': [39.0, np.nan, 53.0, np.nan, 24.0],
 'c': ['cat', 'dog', 'cat', 'cat', 'dog'],
 'd': ['gray', 'brown', 'tan', 'black', 'tan'],
 'e': [20.0, np.nan, 33.0, 41.0, np.nan]
})

imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer = imputer.fit(df[['a', 'b', 'e']])

imputed_df = df.copy()
imputed_df[['a', 'b', 'e']] = imputer.transform(df[['a', 'b', 'e']])

print(imputed_df)
#           a          b    c      d          e
# 0  5.333333  39.000000  cat   gray  20.000000
# 1  5.000000  38.666667  dog  brown  31.333333
# 2  7.000000  53.000000  cat    tan  33.000000
# 3  5.333333  38.666667  cat  black  41.000000
# 4  4.000000  24.000000  dog    tan  31.333333
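
As a possible shortcut, the separate fit and transform steps can be collapsed into a single fit_transform call, and the fitted imputer's statistics_ attribute can be inspected to confirm which fill value was learned for each column. The sketch below assumes the same data frame as above; the names cols and learned_means are introduced here for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({
    'a': [np.nan, 5.0, 7.0, np.nan, 4.0],
    'b': [39.0, np.nan, 53.0, np.nan, 24.0],
    'c': ['cat', 'dog', 'cat', 'cat', 'dog'],
    'd': ['gray', 'brown', 'tan', 'black', 'tan'],
    'e': [20.0, np.nan, 33.0, 41.0, np.nan],
})

cols = ['a', 'b', 'e']  # illustrative name: the numeric columns to impute

imputer = SimpleImputer(missing_values=np.nan, strategy='mean')

imputed_df = df.copy()
# fit_transform learns the per-column means and fills the gaps in one call
imputed_df[cols] = imputer.fit_transform(df[cols])

# statistics_ holds the fill value learned for each fitted column, in order
learned_means = dict(zip(cols, imputer.statistics_))
print(learned_means)
```

The learned values should agree with df[cols].mean(), since pandas skips NaN when averaging. When the imputer has to be applied to new data later (for example a test set), keeping fit and transform separate, as in the code above, is the safer choice.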

Answer 1: 1 accepted, 2021-10-15 05:39:04
