ValueError: could not convert string to float: sklearn, numpy, pandas
I'm trying to convert car names in a NumPy array to numeric values so I can use them in a linear regressor. The label encoder raises an error: ValueError: could not convert string to float: 'porsche'. Can someone help, please?
Here's the code:
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
enc = LabelEncoder()
enc.fit_transform(Z[:,0:1])
onehotencoder = OneHotEncoder(categorical_features = [0])
Z = onehotencoder.fit_transform(Z).toarray()
And the output: ValueError: could not convert string to float: 'porsche'
And here is the array: name Z, type str416.
For one-hot encoding, I would suggest using pd.get_dummies instead; it is much easier to use:
import pandas as pd
# make sure Z is a DataFrame, e.g. Z = pd.DataFrame(Z)
X = pd.get_dummies(Z).values
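As a fuller sketch of the get_dummies route, assuming Z is a string-typed NumPy array with the car name in the first column: the sample values and the column names "make" and "b" below are made up for illustration and are not taken from the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Z: a str-typed array, so even the numbers are strings.
Z = np.array([["audi", "1"], ["porsche", "2"], ["audi", "3"]])

# Wrap the array in a DataFrame; the column names are illustrative only.
df = pd.DataFrame(Z, columns=["make", "b"])
df["b"] = df["b"].astype(float)  # restore the numeric column

# Encode only the car-name column; "b" is passed through unchanged.
encoded = pd.get_dummies(df, columns=["make"], dtype=float)
X = encoded.to_numpy()

print(encoded.columns.tolist())
print(X)
```

Passing columns=["make"] keeps the numeric column out of the encoding, and dtype=float gives a purely numeric array that a regressor can consume.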
If you want to use sklearn's OHE, you can refer to the following example:
import pandas as pd
from sklearn.preprocessing import StandardScaler, LabelEncoder, OneHotEncoder
df = pd.DataFrame({'a':['audi','porsche','audi'], 'b':[1,2,3]})
ohe = OneHotEncoder()
mat = ohe.fit_transform(df[['a']])
# view the contents of array
mat.todense()
matrix([[1., 0.],
        [0., 1.],
        [1., 0.]])
# get feature names (on scikit-learn >= 1.0, use ohe.get_feature_names_out() instead)
ohe.get_feature_names()
array(['x0_audi', 'x0_porsche'], dtype=object)