
NumPy ndarray: accessing input features of each class

For my current classification task, I am interested in accessing the input features of each individual class, so that a separate model (a weak classifier) is trained on the input features of one class only; the models are later combined into an ensemble.

I am having trouble accessing these features. Admittedly, I always get confused by multi-dimensional arrays. The following MWE shows how I try to access the class features.

import keras
import numpy as np
from sklearn.model_selection import train_test_split

Data = np.random.randn(20, 1, 5, 4)
x,y,z = np.repeat(0, 7), np.repeat(1, 7), np.repeat(2, 6)
labels = np.hstack((x,y,z))

LABELS= list(set(np.ndarray.flatten(labels)))
Class_num = len(LABELS)

trainX, testX, trainY, testY = train_test_split(Data, 
                      labels, test_size=0.20, random_state=42)

#...to categorical
trainY = keras.utils.to_categorical(trainY, num_classes=Class_num)
testY = keras.utils.to_categorical(testY, num_classes=Class_num)

ensemble = []
for i in range(len(LABELS)):
    print('Train on class ' ,LABELS[i])
    sub_train = trainX[trainY == i]
    sub_test = testX[testY == i]

    #model fit follows...

Error:

Train on class  0

---------------------------------------------------------------------------

IndexError                                Traceback (most recent call last)

<ipython-input-11-52ceeb9a1011> in <module>()
     20 for i in range(len(LABELS)):
     21     print('Train on class ' ,LABELS[i])
---> 22     sub_train = trainX[trainY == i]
     23     sub_test = testX[testY == i]
     24 

IndexError: boolean index did not match indexed array along dimension 1; dimension is 1 but corresponding boolean dimension is 3

Apparently, I am doing the array indexing wrong. Note the shape of trainX/testX.

Use argmax(axis=1).

In your code, you call the function to_categorical on trainY. That gives you an array of shape (16, 3), where 3 is the number of classes:

[[0. 1. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]

Using argmax(axis=1) gives you the class id after this transformation: [1 0 1 0 2 2 1 0 1 2 0 1 1 1 2 0].
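To see why the original mask fails and why argmax fixes it, here is a minimal NumPy-only sketch. It assumes that np.eye(3)[class_ids] produces the same one-hot layout as keras.utils.to_categorical; the random data and the names class_ids, mask_2d and mask_1d are illustrative, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
trainX = rng.standard_normal((16, 1, 5, 4))  # same shape as trainX in the question
class_ids = rng.integers(0, 3, size=16)      # illustrative integer labels, 3 classes

# Assumption: stand-in for keras.utils.to_categorical (one row per sample)
trainY = np.eye(3)[class_ids]                # shape (16, 3)

# The 2-D mask has shape (16, 3); its second axis (3) cannot be matched
# against the second axis of trainX (1), so NumPy raises IndexError.
mask_2d = trainY == 0
try:
    trainX[mask_2d]
    failed = False
except IndexError:
    failed = True

# argmax over axis 1 collapses each one-hot row back to a single class id,
# so the mask becomes 1-D and lines up with axis 0 (the sample axis).
recovered = trainY.argmax(axis=1)            # shape (16,)
mask_1d = recovered == 0                     # shape (16,)
sub_train = trainX[mask_1d]                  # shape (n_class_0, 1, 5, 4)
```

Note that trainY == 0 is also wrong in meaning, not only in shape: on one-hot rows it marks every zero entry rather than the samples that belong to class 0.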

All you need to do here is replace lines 22 and 23 with:

    sub_train = trainX[trainY.argmax(axis=1) == i]
    sub_test = testX[testY.argmax(axis=1) == i]
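Put together, the per-class selection could look like the sketch below. It runs without Keras or scikit-learn: np.eye is assumed to stand in for to_categorical, the train/test split is skipped, and split_by_class is a hypothetical helper name introduced here for illustration.

```python
import numpy as np

def split_by_class(features, one_hot_labels):
    """Group samples by the class id recovered from one-hot labels.

    Hypothetical helper: returns a dict mapping class id -> feature subset.
    """
    class_ids = one_hot_labels.argmax(axis=1)          # shape (n_samples,)
    return {int(c): features[class_ids == c] for c in np.unique(class_ids)}

rng = np.random.default_rng(42)
Data = rng.standard_normal((20, 1, 5, 4))              # same shape as in the question
labels = np.repeat([0, 1, 2], [7, 7, 6])               # 7, 7 and 6 samples per class
one_hot = np.eye(3)[labels]                            # assumed stand-in for to_categorical

subsets = split_by_class(Data, one_hot)
for class_id, sub in subsets.items():
    print('Train on class', class_id, sub.shape)
    # model fit on `sub` would follow here
```

Because the helper uses np.unique on the recovered ids, a class that has no samples in a given split is simply absent from the dict, so an ensemble loop may need to handle that case.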
