
Keras: multiclass classification

I am following the examples from here and here , and am totally new to Keras. It looks amazing - but I'm running into something I don't understand.

I have an 8-class classification problem. My training set has 5120 rows and 62 columns, the last column being the target variable.

My target variable is currently encoded as floats, so I convert the values to integers and then to a dummy (one-hot) matrix for the model using to_categorical. The result is a numpy.ndarray of shape (num_samples, num_classes+1). Does anyone know why?

Here's the code:

import numpy as np
from keras.utils.np_utils import to_categorical

dataset = np.loadtxt("train_pl.csv", delimiter=",")

# split into input (X) and output (Y) variables
X = dataset[:,0:61] # first 61 columns are the features; I have 5120 rows
Y = (dataset[:,61]).astype(int) # last of the 62 columns (index 61): class labels 1 to 8 inclusive

#print(Y.shape) #(5120,)
#print(np.unique(Y)) #1 2 3 4 5 6 7 8

y_binary = to_categorical(Y)

print(y_binary.shape) #(5120, 9) - why does this have 9 columns?

EDIT

The reason I didn't understand the answer given was that I hadn't realised Keras interprets class labels literally as numbers. For example, since my classes are labelled 1 through 8, Keras looks at the label '1' and says 'that's a 1, so I'll put it in the '1' position in the one-hot vector', like this: 0 1 0 0 0 0 0 0 0. It does the same with '2': 0 0 1 0 0 0 0 0 0, and so on up to 8. That's why there's an extra column: to deal with the '0th' case, which doesn't exist in my labelling. Technically the accepted answer explained that; this just gives more detail.
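The mapping described in the edit can be reproduced with plain NumPy. This is only a minimal sketch, not Keras' actual implementation: one_hot is a hypothetical stand-in which assumes, like to_categorical called without an explicit class count, that the number of columns is the largest label plus one.

```python
import numpy as np

def one_hot(labels):
    """Hypothetical stand-in for to_categorical: label k sets column k."""
    labels = np.asarray(labels, dtype=int)
    num_classes = labels.max() + 1  # columns 0..max, so max + 1 in total
    encoded = np.zeros((labels.size, num_classes), dtype=int)
    encoded[np.arange(labels.size), labels] = 1
    return encoded

labels = np.arange(1, 9)    # class labels 1 to 8 inclusive
encoded = one_hot(labels)
print(encoded.shape)        # 8 rows, 9 columns
print(encoded[0])           # the row for label 1 has its 1 in column 1
print(encoded[:, 0].sum())  # column 0 is never set, because no label is 0
```

The extra column is the one at index 0: it exists only because the labels are used directly as column positions.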

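Because to_categorical converts a class vector (integers from 0 to nb_classes) into a binary class matrix, for use with categorical_crossentropy, as documented here. Your labels run from 1 to 8, so the largest label is 8 and the matrix gets columns 0 through 8, nine in total, with column 0 never set.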
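Since the labels are treated as zero-based column indices, one common way to end up with exactly 8 columns is to shift the labels to the range 0 to 7 before encoding. The sketch below uses made-up labels and a NumPy identity-matrix lookup in place of Keras so that it runs on its own. With Keras, the equivalent would presumably be to_categorical(Y - 1), optionally passing the class count explicitly (the argument is named nb_classes in older releases and num_classes in newer ones).

```python
import numpy as np

# Made-up labels for illustration, in the original 1..8 encoding
Y = np.array([1, 2, 3, 8, 5, 8, 1])

Y_zero_based = Y - 1  # shift to 0..7 so label 1 maps to column 0
num_classes = 8       # assumed known up front rather than inferred

# Row k of the identity matrix is the one-hot vector for class k
y_binary = np.eye(num_classes, dtype=int)[Y_zero_based]
print(y_binary.shape)

# Recover the original 1..8 labels from the one-hot rows
recovered = y_binary.argmax(axis=1) + 1
print(recovered)
```

Fixing the class count up front also keeps the shape stable when a batch happens not to contain the highest class.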
