
Imputing in sklearn raises an error

I tried the code below, but I get an error.

imp=SimpleImputer(missing_values='NaN',strategy="mean")
col = veriler.iloc[:,1:4].values
type(col) ##numpy.ndarray
imp=imp.fit(col)

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

You need to convert the infinite values to bounded values before applying imputation. np.nan_to_num replaces nan, inf and -inf with finite values of your choice.

For example:

import numpy as np
from sklearn.impute import SimpleImputer
imp_mean = SimpleImputer(missing_values=np.nan, strategy='mean')
X = [[7, np.inf, 3], [4, np.nan, 6], [10, 5, 9]]
X = np.nan_to_num(X, nan=-9999, posinf=33333333, neginf=-33333333)
imp_mean.fit(X)
>>> SimpleImputer(add_indicator=False, copy=True, fill_value=None,
              missing_values=nan, strategy='mean', verbose=0)

The same conversion can be applied before transform:

X = [[np.nan, 2, 3], [4, np.nan, 6], [10, np.nan, 9], [np.nan, np.inf, -np.inf]]
X = np.nan_to_num(X, nan=-9999, posinf=33333333, neginf=-33333333)
print(imp_mean.transform(X))

>>>
[[-9.9990000e+03  2.0000000e+00  3.0000000e+00]
 [ 4.0000000e+00 -9.9990000e+03  6.0000000e+00]
 [ 1.0000000e+01 -9.9990000e+03  9.0000000e+00]
 [-9.9990000e+03  3.3333333e+07 -3.3333333e+07]]
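Note that once np.nan_to_num has replaced every nan, nothing is left for the imputer to fill, which is why the output above still shows the -9999 placeholders. If the goal is to fill the gaps with column means, one possible alternative is to pass np.nan (not the string 'NaN' used in the question) as missing_values and to turn the infinities into nan first, so they are imputed as well. This is a minimal sketch under that assumption; the array col is made-up data standing in for veriler.iloc[:,1:4].values.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up data standing in for veriler.iloc[:, 1:4].values (assumption)
col = np.array([[7.0, np.inf, 3.0],
                [4.0, np.nan, 6.0],
                [10.0, 5.0, 9.0]])

# Treat +inf and -inf as missing by converting them to NaN
col = np.where(np.isinf(col), np.nan, col)

# np.nan (not the string 'NaN') marks the missing entries
imp = SimpleImputer(missing_values=np.nan, strategy="mean")
filled = imp.fit_transform(col)
print(filled)
```

With this approach each missing or infinite entry is replaced by the mean of the finite values in its column, so no arbitrary placeholder ends up in the data.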

