简体   繁体   中英

How Do I impute missing values using pandas?

I am trying to impute missing values as the mean of other values in the column; however, my code is having no effect. Does anyone know what I may be doing wrong? Thanks!

My code:

  from sklearn.preprocessing import Imputer
    imputer = Imputer(missing_values ='NaN', strategy = 
    'mean', axis = 0)
    imputer = imputer.fit(x[:, 1:3])
    x[:, 1:3] = imputer.transform(x[:, 1:3])
    print(dataset)

Output

Country   Age   Salary Purchased
0   France  44.0  72000.0        No
1    Spain  27.0  48000.0       Yes
2  Germany  30.0  54000.0        No
3    Spain  38.0  61000.0        No
4  Germany  40.0      NaN       Yes
5   France  35.0  58000.0       Yes
6    Spain   NaN  52000.0        No
7   France  48.0  79000.0       Yes
8  Germany  50.0  83000.0        No
9   France  37.0  67000.0       Yes

You can do the following, let's say df is your dataset:

from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values ='NaN', strategy = 'mean', axis = 0)

df[['Age','Salary']]=imputer.fit_transform(df[['Age','Salary']])

print(df)

   Country        Age        Salary Purchased
0   France  44.000000  72000.000000        No
1    Spain  27.000000  48000.000000       Yes
2  Germany  30.000000  54000.000000        No
3    Spain  38.000000  61000.000000        No
4  Germany  40.000000  63777.777778       Yes
5   France  35.000000  58000.000000       Yes
6    Spain  38.777778  52000.000000        No
7   France  48.000000  79000.000000       Yes
8  Germany  50.000000  83000.000000        No
9   France  37.000000  67000.000000       Yes

You're assigning an Imputer object to the variable imputer:

imputer = Imputer(missing_values ='NaN', strategy = 'mean', axis = 0)

You then call the fit() function on your Imputer object, and then the transform() function.

Then you print the dataset variable, which I'm not sure where it comes from. Did you mean to print the Imputer object, or the result of one of those calls instead?

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM