
ValueError: Shape mismatch: if categories is an array, it has to be of shape (n_features,)

I have written a simple piece of code that uses OneHotEncoder:

from sklearn.preprocessing import OneHotEncoder
X = [[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']]
onehotencoder = OneHotEncoder(categories=[0])
X = onehotencoder.fit_transform(X).toarray()

I only want fit_transform to encode column 0 of X, that is, the values [0, 0, 1, 2]. But it raises this error:

ValueError: Shape mismatch: if categories is an array, it has to be of shape (n_features,).

Can anyone help me solve this problem? I am stuck on it.

You need to use ColumnTransformer to specify the column index, not the categories parameter.

The categories constructor parameter is for stating the distinct category values explicitly, with one list of values per column. For example, you could provide [0, 1, 2] for a column explicitly, but the default 'auto' determines the values from the data. Further, the column selector passed to ColumnTransformer can be a slice() object instead of a list of indices.
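To make the role of categories concrete, here is a minimal sketch (not part of the original answer). It reproduces the error with categories=[0] and then passes one list of values per column, which is the shape the encoder expects. Building X with dtype=object is my own assumption, chosen so the integers and strings keep their types.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# dtype=object keeps the integers and strings as they are (no cast to str).
X = np.array([[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']], dtype=object)

# categories=[0] has length 1 but X has 2 columns, hence the shape mismatch.
try:
    OneHotEncoder(categories=[0]).fit(X)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# One list of category values per column: len(categories) == n_features == 2.
encoder = OneHotEncoder(categories=[[0, 1, 2], ['a', 'b']])
encoded = encoder.fit_transform(X).toarray()
print(encoded.shape)
```

Note that this encodes both columns. Restricting the encoding to column 0 is a separate concern, handled by ColumnTransformer.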

from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer

X = [[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']]

ct = ColumnTransformer(
    [('one_hot_encoder', OneHotEncoder(categories='auto'), [0])],   # The column numbers to be transformed (here is [0] but can be [0, 1, 3])
    remainder='passthrough'                                         # Leave the rest of the columns untouched
)

X = ct.fit_transform(X)
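The slice() variant mentioned above could look like the following sketch. The sparse_threshold=0 argument is my addition, not something the answer uses; it asks ColumnTransformer to always return a dense array, which makes the result easier to inspect when a string column is passed through.

```python
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer

X = [[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']]

ct = ColumnTransformer(
    # slice(0, 1) selects column 0 only, equivalent to [0] here.
    [('one_hot_encoder', OneHotEncoder(categories='auto'), slice(0, 1))],
    remainder='passthrough',  # keep column 1 unchanged
    sparse_threshold=0,       # assumption: force a dense result for inspection
)

result = ct.fit_transform(X)
# The three one-hot columns come first, followed by the untouched column.
print(result)
```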

pandas.get_dummies() can do the same thing, as shown below:

import numpy as np
import pandas as pd
X = np.array([[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']])
X = np.array(pd.concat([pd.get_dummies(X[:, 0]), pd.DataFrame(X[:, 1])], axis = 1))
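If the data is already in a DataFrame, a shorter alternative is the columns argument of pandas.get_dummies(), which encodes only the named columns and leaves the others in place. The column names num and letter below are hypothetical, chosen only for this sketch.

```python
import pandas as pd

# Hypothetical column names for the two features of X.
df = pd.DataFrame(
    [[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']],
    columns=['num', 'letter'],
)

# Encode only 'num'; 'letter' is kept unchanged.
encoded_df = pd.get_dummies(df, columns=['num'])
print(encoded_df)
```

The dummy columns are named after the source column and the category value (num_0, num_1, num_2), and they are placed after the columns that were not encoded.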


