ValueError: could not convert string to float: sklearn, numpy, pandas

Question

I'm trying to convert car names in a NumPy array to numeric values so I can use them with a linear regressor. The label encoder gives an error: ValueError: could not convert string to float: 'porsche'. Can someone help, please?

Here's the code:

 from sklearn.preprocessing import StandardScaler
 from sklearn.preprocessing import LabelEncoder, OneHotEncoder
 enc = LabelEncoder()
 enc.fit_transform(Z[:,0:1])
 onehotencoder = OneHotEncoder(categorical_features = [0])
 Z = onehotencoder.fit_transform(Z).toarray()

And the output: ValueError: could not convert string to float: 'porsche'

And here is the array: name = Z, type str416.

Answer 1

For one-hot encoding, I would suggest using pd.get_dummies instead, which is much easier to use:

import pandas as pd

# make sure Z is a DataFrame
X = pd.get_dummies(Z).values
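
As a rough sketch of the "make sure Z is a DataFrame" step, assuming Z is a 2-D NumPy array of strings with the car name in the first column: the sample values and the column names `name` and `hp` below are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Z: a string array with the car name in column 0
Z = np.array([["audi", "110"], ["porsche", "300"], ["audi", "150"]])

# Wrap the array in a DataFrame and restore the numeric column's dtype
df = pd.DataFrame(Z, columns=["name", "hp"])
df["hp"] = df["hp"].astype(float)

# One-hot encode only the categorical column; numeric columns are kept as-is
X = pd.get_dummies(df, columns=["name"], dtype=float).to_numpy()
```

Passing `columns=["name"]` limits the encoding to the car names, so numeric columns are not expanded into one dummy column per distinct value.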

If you want to use sklearn's OneHotEncoder, you can refer to the following example:

import pandas as pd
from sklearn.preprocessing import StandardScaler, LabelEncoder, OneHotEncoder

df = pd.DataFrame({'a':['audi','porsche','audi'], 'b':[1,2,3]})
ohe = OneHotEncoder()

mat = ohe.fit_transform(df[['a']])

# view the contents of array
mat.todense()

matrix([[1., 0.],
        [0., 1.],
        [1., 0.]])

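# get feature names
# (get_feature_names() was removed in scikit-learn 1.2;
# newer versions provide get_feature_names_out() instead)
ohe.get_feature_names()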
array(['x0_audi', 'x0_porsche'], dtype=object)
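
To tie this back to the question, here is one possible sketch of encoding only the first column of a NumPy array and fitting a linear regressor on the result. It uses `ColumnTransformer`, which is not part of the answer above but takes over the role of the `categorical_features` argument in the question's code on recent scikit-learn versions. The array contents and target values are invented, and Z is assumed to be an object-dtype array.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: column 0 is the car name, column 1 a numeric feature
Z = np.array(
    [["audi", 110.0], ["porsche", 300.0], ["audi", 150.0], ["porsche", 350.0]],
    dtype=object,
)
y = np.array([20.0, 60.0, 25.0, 70.0])  # made-up target values

# One-hot encode column 0 only; pass the remaining columns through unchanged
ct = ColumnTransformer(
    [("name", OneHotEncoder(handle_unknown="ignore"), [0])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
X = ct.fit_transform(Z).astype(float)

# X is now fully numeric, so the regressor can be fitted
model = LinearRegression().fit(X, y)
```

The encoded columns come first in `X`, followed by the passed-through numeric column.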

Answer 1: 2020-01-19 10:27:32
