Having trouble encoding dataset array
Dataset: [1]: https://docs.google.com/spreadsheets/d/1jlKp7JR9Ewujv445QgT1kZpH5868fhXFFrA3ovWxS_0/edit?usp=sharing
I've been trying to apply an ensemble method from sklearn to the small dataset linked above. For some reason I keep receiving this error:
ValueError: y should be a 1d array, got an array of shape (9, 56) instead.
This is the code:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from numpy import array
from sklearn import datasets, metrics
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelEncoder
cbdata = pd.read_excel("C:/Users/Andrew/cbupdated2.xlsx")
print(cbdata)
print(cbdata.describe())
df = cbdata.columns
print(df)
x = cbdata
y = cbdata.fundingstatus
xshape = x.shape
yshape = y.shape
shapes = xshape, yshape
print(shapes)
size = x.size, y.size
print(size)
###Problem ENCODING DATA
##Label encoder
label_encoder = LabelEncoder()
integer_encoded = label_encoder.fit_transform(x)
print(integer_encoded)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(x)
print(X_scaled)
###Problem block
ec = OneHotEncoder()
X_encoded = cbdata.apply(lambda col: ec.fit_transform(col.astype(str)), axis=0, result_type='expand')
X_encoded2 = X_encoded.shape
print(X_encoded2)
Any help or suggestions on getting the encoder to work, so that I can use the ensemble method?
Thanks
LabelEncoder is meant for encoding target variables, not features. See also this post.
You should use OrdinalEncoder on the categorical columns you want to transform, because I see some of your columns have floats and strings. For example, to transform company and industry:
from sklearn.preprocessing import OrdinalEncoder
Cols = ["company","industry"]
integer_encoded = OrdinalEncoder().fit_transform(x[Cols])
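To put the pieces together, here is a minimal sketch of how the encoding could feed into an ensemble model. The error message in the question looks consistent with LabelEncoder receiving the whole 2-D DataFrame instead of a single column, so the sketch keeps LabelEncoder for the target only. Several things are assumptions rather than details from the question: the toy DataFrame and its values, the numeric column named amount, and the choice of RandomForestClassifier as the ensemble method (the question does not say which one is intended).

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# Toy stand-in for the spreadsheet. The values and the "amount" column
# are made up for illustration; only the names company, industry and
# fundingstatus come from the question.
cbdata = pd.DataFrame({
    "company": ["A", "B", "C", "D", "E", "F"],
    "industry": ["tech", "bio", "tech", "retail", "bio", "tech"],
    "amount": [1.0, 2.5, 0.7, 3.2, 1.9, 2.2],
    "fundingstatus": ["seed", "series a", "seed", "series a", "seed", "series a"],
})

# Keep the target out of the feature matrix.
X = cbdata.drop(columns=["fundingstatus"])

# LabelEncoder expects a 1-D target, so pass a single column.
y = LabelEncoder().fit_transform(cbdata["fundingstatus"])

# Encode only the categorical feature columns; numeric ones pass through.
categorical_cols = ["company", "industry"]
preprocess = ColumnTransformer(
    transformers=[("cat", OrdinalEncoder(), categorical_cols)],
    remainder="passthrough",
)

# RandomForestClassifier is an assumed example of an ensemble method.
model = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
])
model.fit(X, y)
predictions = model.predict(X)
print(predictions.shape)
```

With the real data, the list in categorical_cols would need to name every string-valued column, and a train/test split would be needed before judging the model.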