
How to impute missing values with KNN

I'm trying to impute missing values in my data frame, and for this I'm using the fancyimpute library.

from fancyimpute import KNN 
X_filled_knn = KNN(k=3).complete(df_OppLine[['family']])

I got this error:

AttributeError                            Traceback (most recent call last)
<ipython-input-28-8475f35fc36a> in <module>()
----> 1 X_filled_knn = KNN(k=3).complete(df_OppLine[['family']])

AttributeError: 'KNN' object has no attribute 'complete'

Any idea how to fix this error?

Newer versions of fancyimpute replaced complete() with fit_transform(). Try changing it to:

from fancyimpute import KNN
X_filled_knn = KNN(k=3).fit_transform(df_OppLine[['family']])

First you have to convert the strings into numerical data.

Try one-hot encoding, which creates one column per category; each row has a 1 in the column for its own category and 0 in the rest. You can also try ordinal encoding, which assigns an integer code to each category.
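
For the one-hot route, a minimal sketch with pandas could look like the following. The toy frame and its values are made up for illustration; only the family column name comes from the question. Rows whose category is missing are reset to NaN so that an imputer still sees them as missing.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_OppLine, with one missing category
df = pd.DataFrame({"family": ["A", "B", np.nan, "A"]})

# One column per category; float dtype so NaN can be stored afterwards
one_hot = pd.get_dummies(df["family"], dtype=float)

# get_dummies encodes a missing value as all zeros, so mark those rows as NaN
one_hot.loc[df["family"].isna(), :] = np.nan

print(one_hot)
```

The ordinal route, in turn, looks like this: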

import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Create Ordinal encoder
initialize_encoder=OrdinalEncoder()

# Select non-null values of family column
family=df_OppLine["family"]
family_not_null=family[family.notnull()]

# Reshape family_not_null to shape (-1, 1)
reshaped_vals=family_not_null.values.reshape(-1,1)

# Ordinally encode reshaped_vals
encoded_vals=initialize_encoder.fit_transform(reshaped_vals)

# Assign back encoded values to non-null values 
df_OppLine.loc[family.notnull(),"family"]=np.squeeze(encoded_vals)
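
Once the column is numeric, the imputation itself can run. Here is a rough sketch that swaps in scikit-learn's KNNImputer in place of fancyimpute's KNN (my substitution, not what the question uses), on a made-up frame. The amount column is hypothetical: KNN needs at least one other feature to compare rows on, so I added one. Rounding the averaged codes back to a category is a crude simplification, not a recommended recipe.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical stand-in for df_OppLine; "amount" is an invented numeric feature
df = pd.DataFrame({
    "family": ["A", "A", np.nan, "B", "B"],
    "amount": [1.0, 1.2, 1.1, 9.0, 9.5],
})

# Ordinally encode only the non-null categories, leaving NaN where data is missing
encoder = OrdinalEncoder()
mask = df["family"].notnull()
family_code = pd.Series(np.nan, index=df.index, dtype=float)
family_code.loc[mask] = encoder.fit_transform(df.loc[mask, ["family"]]).ravel()

# Impute the missing code from the 2 nearest rows, measured on the other feature
features = pd.DataFrame({"family_code": family_code, "amount": df["amount"]})
filled = KNNImputer(n_neighbors=2).fit_transform(features)

# Round the averaged code to the nearest category and map it back to its label
codes = np.rint(filled[:, [0]])
df["family_filled"] = encoder.inverse_transform(codes).ravel()

print(df)
```

With real data you would probably want to scale the numeric features first, since KNN distances are sensitive to units.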
