Numpy ndArray: accessing input features of each class

For my current classification task, I am interested in accessing the input features of each individual class, so that each class is trained on its own input features only (a weak classifier), with the classifiers combined into an ensemble later.

I am having trouble accessing these features. Admittedly, I always get confused by multi-dimensional arrays. The following MWE shows how I try to access the class features.

import keras
import numpy as np
from sklearn.model_selection import train_test_split

Data = np.random.randn(20, 1, 5, 4)
x,y,z = np.repeat(0, 7), np.repeat(1, 7), np.repeat(2, 6)
labels = np.hstack((x,y,z))

LABELS= list(set(np.ndarray.flatten(labels)))
Class_num = len(LABELS)

trainX, testX, trainY, testY = train_test_split(Data, 
                      labels, test_size=0.20, random_state=42)

#...to categorical
trainY = keras.utils.to_categorical(trainY, num_classes=Class_num)
testY = keras.utils.to_categorical(testY, num_classes=Class_num)

ensemble = []
for i in range(len(LABELS)):
    print('Train on class ' ,LABELS[i])
    sub_train = trainX[trainY == i]
    sub_test = testX[testY == i]

    #model fit follows...

Error:

Train on class  0

---------------------------------------------------------------------------

IndexError                                Traceback (most recent call last)

<ipython-input-11-52ceeb9a1011> in <module>()
     20 for i in range(len(LABELS)):
     21     print('Train on class ' ,LABELS[i])
---> 22     sub_train = trainX[trainY == i]
     23     sub_test = testX[testY == i]
     24 

IndexError: boolean index did not match indexed array along dimension 1; dimension is 1 but corresponding boolean dimension is 3

Apparently, I am doing the array indexing wrong. Note the shape of trainX/testX.

Use argmax(axis=1).

In your code, you call to_categorical on trainY. That gives you an array of shape (16, 3), where 3 is the number of classes:

[[0. 1. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]

Using argmax(axis=1) gives you the class id after this transformation: [1 0 1 0 2 2 1 0 1 2 0 1 1 1 2 0].
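To see the round trip in isolation, here is a minimal sketch that needs only NumPy. It assumes a small hypothetical label array, and uses np.eye as a stand-in for keras.utils.to_categorical (both produce one row per sample with a single 1 in the column of the class id):

```python
import numpy as np

# Hypothetical integer class ids, standing in for trainY before encoding.
class_ids = np.array([1, 0, 1, 0, 2])

# np.eye(3)[class_ids] builds the same one-hot layout that
# to_categorical(class_ids, num_classes=3) would give.
one_hot = np.eye(3)[class_ids]
print(one_hot.shape)  # (5, 3)

# argmax along axis 1 returns the column holding the 1 in each row,
# which recovers the original class id.
recovered = one_hot.argmax(axis=1)
print(recovered)  # [1 0 1 0 2]
```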

All you need to do is replace lines 22 and 23 with:

    sub_train = trainX[trainY.argmax(axis=1) == i]
    sub_test = testX[testY.argmax(axis=1) == i]
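The reason this works comes down to shapes: trainY == i is a (16, 3) boolean array, which NumPy tries to match against the first two dimensions of trainX, (16, 1), hence the IndexError. After argmax(axis=1) the mask is 1-D with one entry per sample. A rough, self-contained sketch of that, assuming random data of the same shape as in the question and np.eye in place of to_categorical (the labels and the subsets dict are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
trainX = rng.standard_normal((16, 1, 5, 4))  # same shape as trainX in the question
class_ids = rng.integers(0, 3, size=16)      # hypothetical integer labels
trainY = np.eye(3)[class_ids]                # one-hot stand-in for to_categorical

subsets = {}
for i in range(3):
    # 1-D boolean mask of shape (16,): one entry per sample.
    mask = trainY.argmax(axis=1) == i
    # Indexing with it selects whole samples and keeps the trailing dimensions.
    subsets[i] = trainX[mask]
    print(i, mask.shape, subsets[i].shape)
```

Each subset keeps the (1, 5, 4) trailing shape, so it can be passed to a model expecting the same input shape as the full training set.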
