How do I impute missing values using pandas?
I am trying to impute missing values as the mean of the other values in the column; however, my code is having no effect. Does anyone know what I may be doing wrong? Thanks!
My code:
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values='NaN', strategy='mean', axis=0)
imputer = imputer.fit(x[:, 1:3])
x[:, 1:3] = imputer.transform(x[:, 1:3])
print(dataset)
Output:
Country Age Salary Purchased
0 France 44.0 72000.0 No
1 Spain 27.0 48000.0 Yes
2 Germany 30.0 54000.0 No
3 Spain 38.0 61000.0 No
4 Germany 40.0 NaN Yes
5 France 35.0 58000.0 Yes
6 Spain NaN 52000.0 No
7 France 48.0 79000.0 Yes
8 Germany 50.0 83000.0 No
9 France 37.0 67000.0 Yes
You can do the following, assuming df is your dataset:
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values='NaN', strategy='mean', axis=0)
df[['Age', 'Salary']] = imputer.fit_transform(df[['Age', 'Salary']])
print(df)
Country Age Salary Purchased
0 France 44.000000 72000.000000 No
1 Spain 27.000000 48000.000000 Yes
2 Germany 30.000000 54000.000000 No
3 Spain 38.000000 61000.000000 No
4 Germany 40.000000 63777.777778 Yes
5 France 35.000000 58000.000000 Yes
6 Spain 38.777778 52000.000000 No
7 France 48.000000 79000.000000 Yes
8 Germany 50.000000 83000.000000 No
9 France 37.000000 67000.000000 Yes
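Note that sklearn.preprocessing.Imputer was deprecated in scikit-learn 0.20 and removed in 0.22, so the snippet above only runs on older installations. On a current version, a rough equivalent might look like the sketch below. It uses sklearn.impute.SimpleImputer, which has no axis argument and always works column by column; the DataFrame is rebuilt here from the table in the question purely so the example is self-contained.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Data copied from the table in the question, so the example runs on its own.
df = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain", "Germany",
                "France", "Spain", "France", "Germany", "France"],
    "Age": [44, 27, 30, 38, 40, 35, np.nan, 48, 50, 37],
    "Salary": [72000, 48000, 54000, 61000, np.nan,
               58000, 52000, 79000, 83000, 67000],
    "Purchased": ["No", "Yes", "No", "No", "Yes",
                  "Yes", "No", "Yes", "No", "Yes"],
})

# SimpleImputer replaces the removed Imputer class; missing values are
# identified with np.nan rather than the string 'NaN'.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")

# fit_transform returns a NumPy array; assigning it back to the two
# columns is what actually updates the DataFrame.
df[["Age", "Salary"]] = imputer.fit_transform(df[["Age", "Salary"]])
print(df)
```

The key point is the same as in the original answer: the imputed values have to be written back into the DataFrame that gets printed.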
You're assigning an Imputer object to the variable imputer:
imputer = Imputer(missing_values='NaN', strategy='mean', axis=0)
You then call the fit() function on your Imputer object, and then the transform() function.
Then you print the dataset variable, and I'm not sure where that comes from. Did you mean to print the Imputer object, or the result of one of those calls instead?
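To illustrate the point about printing the wrong variable, here is a minimal sketch. It assumes (the question does not show this) that x was created from dataset with something like dataset.iloc[:, :-1].values, which yields a separate NumPy array for a mixed-type frame. The small three-row dataset and the names cols, col_means, missing_before and num_cols are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row dataset in the same shape as the question's data.
dataset = pd.DataFrame({
    "Country": ["Germany", "Spain", "France"],
    "Age": [40.0, np.nan, 48.0],
    "Salary": [np.nan, 52000.0, 79000.0],
    "Purchased": ["Yes", "No", "Yes"],
})

# Assumption: x is a separate array extracted from dataset.
# For mixed column types this is a new object array, not a view.
x = dataset.iloc[:, :-1].values

# Fill the NaNs in x with the column means (what the imputer would do).
cols = x[:, 1:3].astype(float)
col_means = np.nanmean(cols, axis=0)
x[:, 1:3] = np.where(np.isnan(cols), col_means, cols)

# x is now filled in, but dataset still has its missing values.
missing_before = int(dataset.isna().sum().sum())
print(missing_before)

# A pandas-only fix: fill each numeric column with its own mean and
# assign the result back to dataset itself.
num_cols = ["Age", "Salary"]
dataset[num_cols] = dataset[num_cols].fillna(dataset[num_cols].mean())
print(dataset)
```

If this matches how x was built, the original code did impute the values, only in x rather than in dataset, so printing x (or assigning the result back to dataset) would show the change.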