
ValueError: Classification metrics can't handle a mix of multilabel-indicator and continuous-multioutput targets error

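How can I fix this error, or what can I change or implement?

ValueError: Classification metrics can't handle a mix of multilabel-indicator and continuous-multioutput targets

The full code is attached below. Some information about the CSV file: it has 15 classes, 78 input dimensions, and more than 600,000 samples.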

dataframe = pandas.read_csv("C:/Users/bam/train.csv", header=0, dtype=object)
dataset = dataframe.values
X_train = dataset[:,0:78].astype(float)
y_train = dataset[:,78]
dataframe = pandas.read_csv("C:/Users/bam/test.csv", header=0, dtype=object)
dataset = dataframe.values
X_test = dataset[:,0:78].astype(float)
y_test = dataset[:,78]
##my version for encoding
encoder = LabelEncoder()
encoder.fit(y_train)
encoded_Yone = encoder.transform(y_train)
# convert integers to dummy variables (i.e. one hot encoded)
y_train = np_utils.to_categorical(encoded_Yone)
#encode our testing set
encoder = LabelEncoder()
encoder.fit(y_test)
encoded_Ytwo = encoder.transform(y_test)
# convert integers to dummy variables (i.e. one hot encoded)
y_test = np_utils.to_categorical(encoded_Ytwo)

#Creating an object of StandardScaler trial run to try and improve accuracy from 90% baseline*
sc = StandardScaler()
#Scaling the data using the StandardScaler() object
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

neural_classifier = Sequential()
#units (first argument of Dense) = number of neurons in the layer
#kernel_initializer = how the weights of the layer are initialized
#input_dim = number of neurons in the input layer = number of input features = 78
#activation = activation function that is used in each layer
#Dense is the type of layer
#The neural network needs to start with some weights and then iteratively update them to better values. kernel_initializer names the statistical distribution or function used to initialize the weights. For a statistical distribution, the library draws numbers from that distribution and uses them as the starting weights.
#Input layer
neural_classifier.add(Dense(100, kernel_initializer = 'uniform', activation = 'relu', input_dim = 78))

neural_classifier.add(Dense(150, kernel_initializer = 'uniform', activation = 'relu'))
neural_classifier.add(Dense(200, kernel_initializer = 'uniform', activation = 'relu'))
neural_classifier.add(Dense(250, kernel_initializer = 'uniform', activation = 'relu'))
neural_classifier.add(Dense(300, kernel_initializer = 'uniform', activation = 'relu'))
neural_classifier.add(Dense(350, kernel_initializer = 'uniform', activation = 'relu'))
neural_classifier.add(Dense(400, kernel_initializer = 'uniform', activation = 'relu'))
neural_classifier.add(Dense(250, kernel_initializer = 'uniform', activation = 'relu'))
neural_classifier.add(Dense(300, kernel_initializer = 'uniform', activation = 'relu'))
# output layer has 15 neurons because there are 15 classes in dataset
#Since it is a multiclass classification problem hence we are using the softmax activation function
neural_classifier.add(Dense(15, kernel_initializer = 'uniform', activation = 'softmax'))
neural_classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
neural_classifier.fit(X_train, y_train, batch_size = 32, epochs = 2)
#Predicting the labels for the test data
y_pred = neural_classifier.predict(X_test)
#Calculating the accuracy score
accuracy = metrics.accuracy_score(y_test, y_pred)
#Calculating the precision score
precision = metrics.precision_score(y_test, y_pred)
#Calculating the recall score
recall = metrics.recall_score(y_test, y_pred, average='weighted')
#Calculating the f1-score
f1score = metrics.f1_score(y_test, y_pred, average='weighted')
print("Accuracy score of the model is :", accuracy)
print("precision score of the model is :", precision)
print("Recall score of the model is :", recall)
print("f1-score of the model is :", f1score)

Full error

In your example, you are using

# convert integers to dummy variables (i.e. one hot encoded)
y_train = np_utils.to_categorical(encoded_Yone)

You need to convert y_test from its one-hot encoding back to a plain array of class labels; otherwise you cannot use metrics.accuracy_score.
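To see where the two names in the error message come from, it can help to ask scikit-learn how it classifies each array. The sketch below uses small made-up arrays as stand-ins for y_test and y_pred (3 samples, 3 classes; the values are assumptions, not data from the question) and calls type_of_target from sklearn.utils.multiclass.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-in for the one-hot y_test (to_categorical returns floats).
y_test_onehot = np.array([[1, 0, 0],
                          [0, 1, 0],
                          [0, 0, 1]], dtype="float32")

# Hypothetical stand-in for y_pred: one row of softmax probabilities per sample.
y_pred_proba = np.array([[0.7, 0.2, 0.1],
                         [0.1, 0.8, 0.1],
                         [0.3, 0.3, 0.4]])

onehot_kind = type_of_target(y_test_onehot)   # 2-D array of 0/1 values
proba_kind = type_of_target(y_pred_proba)     # 2-D array of non-integer floats
label_kind = type_of_target(np.argmax(y_test_onehot, axis=1))  # 1-D class ids

print(onehot_kind, proba_kind, label_kind)
```

The first two results are the two target types named in the ValueError; the third is what the classification metrics expect for a multiclass problem.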

One way to do the conversion with numpy is

y_lab = np.argmax(y_test, axis=1)
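As a small illustration of what argmax does to a one-hot array, here is a sketch on a made-up 4x3 matrix (the array is an assumption for demonstration, not data from the question):

```python
import numpy as np

# Hypothetical one-hot targets for 4 samples and 3 classes.
one_hot = np.array([[0, 0, 1],
                    [1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1]])

# argmax along axis=1 returns, for each row, the column index holding the 1,
# which is the integer class id.
labels = np.argmax(one_hot, axis=1)
print(labels)  # [2 0 1 2]

# Round trip: indexing an identity matrix by the labels rebuilds the one-hot rows.
rebuilt = np.eye(3, dtype=int)[labels]
```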

So at the top of the file, just import numpy like this

import numpy as np  

Then create the y_lab variable before calling metrics.accuracy_score

y_lab = np.argmax(y_test, axis=1)

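and use y_lab in metrics.accuracy_score. Note that y_pred from neural_classifier.predict holds softmax probabilities (the continuous-multioutput half of the error message), so it needs the same np.argmax(..., axis=1) conversion before it is passed to the metric.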
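Putting the pieces together, a minimal sketch of the metric calls might look like the following. The two arrays are made-up stand-ins for y_test and y_pred (4 samples, 3 classes), and the names y_test_lab and y_pred_lab are introduced here for illustration. The sketch also passes average='weighted' to precision_score, since with more than two classes the default average='binary' raises its own ValueError; zero_division=0 is an optional choice to silence the warning for a class that is never predicted.

```python
import numpy as np
from sklearn import metrics

# Hypothetical stand-ins for the question's arrays.
y_test = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])            # one-hot targets
y_pred = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.3, 0.4, 0.3],
                   [0.1, 0.6, 0.3]])      # softmax-like probabilities

# Collapse both arrays to 1-D integer class ids.
y_test_lab = np.argmax(y_test, axis=1)
y_pred_lab = np.argmax(y_pred, axis=1)

accuracy = metrics.accuracy_score(y_test_lab, y_pred_lab)
precision = metrics.precision_score(y_test_lab, y_pred_lab,
                                    average="weighted", zero_division=0)
recall = metrics.recall_score(y_test_lab, y_pred_lab,
                              average="weighted", zero_division=0)
f1score = metrics.f1_score(y_test_lab, y_pred_lab,
                           average="weighted", zero_division=0)

print("Accuracy score of the model is :", accuracy)
print("precision score of the model is :", precision)
print("Recall score of the model is :", recall)
print("f1-score of the model is :", f1score)
```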

Here is a link that explains how and why to use one-hot encoding: https://machinelearningmastery.com/why-one-hot-encode-data-in-machine-learning/
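A related point about the encoding step: the code in the question fits a second LabelEncoder on y_test. If the test file happens to lack one of the classes, the two encoders can assign different integers to the same label. A sketch of fitting the encoder once and reusing it is shown below. The label strings are made up for illustration, and np.eye is swapped in for np_utils.to_categorical only so that the example does not depend on Keras.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical raw labels; the test set is missing the "DoS" class.
y_train_raw = np.array(["BENIGN", "DoS", "PortScan", "BENIGN"])
y_test_raw = np.array(["PortScan", "BENIGN"])

# Fit once on the training labels and reuse the same mapping for the test labels.
encoder = LabelEncoder()
encoder.fit(y_train_raw)
n_classes = len(encoder.classes_)

train_ids = encoder.transform(y_train_raw)
test_ids = encoder.transform(y_test_raw)

# One-hot encode with a fixed number of columns (stand-in for to_categorical).
y_train_onehot = np.eye(n_classes)[train_ids]
y_test_onehot = np.eye(n_classes)[test_ids]

# For comparison: an encoder refitted on the test labels numbers them differently.
refit_ids = LabelEncoder().fit_transform(y_test_raw)

# argmax plus inverse_transform recovers the original label strings.
recovered = encoder.inverse_transform(np.argmax(y_test_onehot, axis=1))
print(test_ids, refit_ids, recovered)
```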

