
Keras (fit_generator) - Error when checking target: expected activation_1 to have 3 dimensions, but got array with shape (32, 416, 608, 3)

I've been working on a segmentation problem for many days, and after finally finding out how to properly read the dataset, I ran into this problem:

 ValueError: Error when checking target: expected activation_1 (Softmax) to have 3 dimensions, but got array with shape (32, 416, 608, 3)

I used the functional API, since I took the FCNN architecture from [here](https://github.com/divamgupta/image-segmentation-keras/blob/master/Models/FCN32.py).

It is slightly modified and adapted to my task (IMAGE_ORDERING = "channels_last", i.e. the TensorFlow backend). Can anyone please help me? Massive thanks in advance. The architecture below is the FCNN that I am trying to implement for segmentation. Here is the architecture (after calling model.summary()):

1. [model.summary() screenshot, part 1]

2. [model.summary() screenshot, part 2]

   1. The specific error is: [screenshot]
   2. The "Importing the dataset" function: [screenshot]
   3. The "fit_generator" method call: [screenshot]

img_input = Input(shape=(input_height, input_width, 3))

# Block 1
x = Convolution2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1', data_format=IMAGE_ORDERING)(img_input)
x = BatchNormalization()(x)
x = Convolution2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool', data_format=IMAGE_ORDERING)(x)
f1 = x

# Block 2
x = Convolution2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = Convolution2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool', data_format=IMAGE_ORDERING)(x)
f2 = x

# Block 3
x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool', data_format=IMAGE_ORDERING)(x)
f3 = x

# Block 4
x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool', data_format=IMAGE_ORDERING)(x)
f4 = x

# Block 5
x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3', data_format=IMAGE_ORDERING)(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', data_format=IMAGE_ORDERING)(x)
f5 = x

x = Convolution2D(4096, (7, 7), activation='relu', padding='same', data_format=IMAGE_ORDERING)(x)
x = Dropout(0.5)(x)
x = Convolution2D(4096, (1, 1), activation='relu', padding='same', data_format=IMAGE_ORDERING)(x)
x = Dropout(0.5)(x)

# First parameter = number of classes + 1 (for the background)
x = Convolution2D(20, (1, 1), kernel_initializer='he_normal', data_format=IMAGE_ORDERING)(x)
x = Convolution2DTranspose(20, kernel_size=(64, 64), strides=(32, 32), use_bias=False, data_format=IMAGE_ORDERING)(x)

o_shape = Model(img_input, x).output_shape
outputHeight = o_shape[1]
print('Output Height is:', outputHeight)
outputWidth = o_shape[2]
print('Output Width is:', outputWidth)

# https://keras.io/layers/core/#reshape
x = Reshape((20, outputHeight * outputWidth))(x)
# https://keras.io/layers/core/#permute
x = Permute((2, 1))(x)

print("Output shape before softmax is", o_shape)
x = Activation('softmax')(x)
print("Output shape after softmax is", o_shape)

model = Model(inputs=img_input, outputs=x)
model.outputWidth = outputWidth
model.outputHeight = outputHeight
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

The original code in the FCNN architecture example works with an input dimension of (416, 608). Whereas in your code, the input dimension is (192, 192) (ignoring the channel dimension). Now if you notice carefully, this particular layer

x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', data_format=IMAGE_ORDERING)(x)

generates an output of dimension (6, 6) (you can verify this in your model.summary()).

The next convolution layer

o = (Convolution2D(4096,(7,7) , activation='relu' , padding='same', data_format=IMAGE_ORDERING))(o)

uses convolution filters of size (7, 7), but your input has already been reduced to a size smaller than that (i.e. (6, 6)). Try fixing that first.
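
A quick way to see where that (6, 6) comes from (a sketch I am adding for illustration, assuming 'same' padding everywhere so that only the five 2x2 pooling layers shrink the feature maps):

# Each of the five MaxPooling2D((2, 2), strides=(2, 2)) layers halves the spatial size,
# so what reaches the 7x7 convolution is the input size divided by 2**5 = 32.
for side in (192, 416, 608):
    print(side, '->', side // 2**5)
# 192 -> 6  (smaller than the 7x7 kernel, hence the problem described above)
# 416 -> 13 and 608 -> 19 (the sizes the original FCN example was written for)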

Also, if you look at the model.summary() output, you'll notice that it does not contain the layers defined after the block5_pool layer. Among them is a transposed convolution layer (which basically upsamples your input). You may want to take a look and try to resolve that as well.
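
For completeness, the upsampling done by the Conv2DTranspose layer can be worked out by hand (again a sketch I am adding, assuming the 416 x 608 input from the question and the Keras default padding='valid'): the output side is (input_side - 1) * stride + kernel_size.

# After the five poolings, a 416 x 608 input is reduced to 13 x 19.
# Conv2DTranspose(kernel_size=(64, 64), strides=(32, 32)) then produces
# (side - 1) * 32 + 64 along each spatial dimension.
for side in (13, 19):
    print(side, '->', (side - 1) * 32 + 64)
# 13 -> 448 and 19 -> 640, i.e. the (448, 640, 20) output mentioned further down in the detailed answer.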

NOTE: In all my dimensions, I have ignored the channel dimension.


EDIT: Detailed answer below

First of all, this is my keras.json file. It uses the TensorFlow backend, with image_data_format set to channels_last.

{
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_data_format": "channels_last"
}
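
To double-check that Keras actually picked up these settings at runtime, you can query the backend directly (a small addition of mine, not part of the original answer):

from keras import backend as K

print(K.backend())            # expected: 'tensorflow'
print(K.image_data_format())  # expected: 'channels_last'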

Next, I copy-paste my exact model code. Please take special note of the inline comments in the code below.

from keras.models import *
from keras.layers import *

IMAGE_ORDERING = 'channels_last' # In consistency with the json file

def getFCN32(nb_classes = 20, input_height = 416, input_width = 608):

    img_input = Input(shape=(input_height,input_width, 3)) # Expected input will have channel in the last dimension

    #Block 1
    x = Convolution2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1', data_format=IMAGE_ORDERING)(img_input) 
    x = BatchNormalization()(x)
    x = Convolution2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool', data_format=IMAGE_ORDERING)(x)
    f1 = x
    # Block 2
    x = Convolution2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool', data_format=IMAGE_ORDERING )(x)
    f2 = x

    # Block 3
    x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool', data_format=IMAGE_ORDERING )(x)
    f3 = x

    # Block 4
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2',data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3',data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool', data_format=IMAGE_ORDERING)(x)
    f4 = x

    # Block 5
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2',data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', data_format=IMAGE_ORDERING)(x)
    f5 = x

    x = (Convolution2D(4096,(7,7) , activation='relu' , padding='same', data_format=IMAGE_ORDERING))(x)
    x = Dropout(0.5)(x)
    x = (Convolution2D(4096,(1,1) , activation='relu' , padding='same',data_format=IMAGE_ORDERING))(x)
    x = Dropout(0.5)(x)

    x = (Convolution2D(20,(1,1) ,kernel_initializer='he_normal' ,data_format=IMAGE_ORDERING))(x)
    x = Convolution2DTranspose(20,kernel_size=(64,64), strides=(32,32),use_bias=False,data_format=IMAGE_ORDERING)(x)
    o_shape = Model(img_input, x).output_shape

    # NOTE: Since this is channel last dimension ordering, the height and width dimensions are along [1] and [2], not [2] and [3]
    outputHeight = o_shape[1]
    outputWidth = o_shape[2]

    x = (Reshape((outputHeight*outputWidth, 20)))(x) # Channel should be along the last dimenion of reshape
    # No need of permute layer anymore

    print("Output shape before softmax is", o_shape)
    x = (Activation('softmax'))(x)
    print("Output shape after softmax is", o_shape)
    model = Model(inputs = img_input,outputs = x)
    model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics =['accuracy'])

    return model

model = getFCN32(20)
print(model.summary())

Next I will provide snippets of how my model.summary() looks. If you take a look at the last few layers, it is something like this: [model summary screenshot]

So this means the conv2d_transpose layer produces an output of dimension (448, 640, 20), which is flattened out before softmax is applied. So the dimension of the output is (286720, 20). Similarly, your target_generator (mask_generator in your case) should also generate targets of this dimension. Likewise, your input_generator should be producing input batches of size [batch_size, input_height, input_width, 3], as specified by the img_input of your function.
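
A quick programmatic way to confirm those numbers (an illustrative addition of mine) is to inspect the model's output shape; every target batch coming out of the mask generator must then match this shape per sample:

print(model.output_shape)   # (None, 286720, 20): 448 * 640 flattened positions, 20 classes per position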

Hopefully this will help you get to the bottom of your problem and figure out a suitable solution. Please take a look at the minor variations in the code (along with the inline comments) and at how to create your input and target batches.

I tried using the SegNet architecture and yet again I get the exact same error. It appears it is not an architectural problem, but one coming from fit_generator and from how the masks are used.

UPDATE: The problem was solved by feeding the correct form of the input masks to the neural network.

You're probably missing color_mode='grayscale' in the flow_from_directory() call for the mask. RGB is the default value for color_mode.

# image_datagen and mask_datagen are ImageDataGenerator instances created elsewhere
# (subset='training' also requires a validation_split on them); batch_size, target_size,
# seed, image_dir and mask_dir are likewise assumed to be defined earlier.
flow_args = dict(
    batch_size=batch_size,
    target_size=target_size,
    class_mode=None,   # yield raw image batches without class labels
    seed=seed)         # the same seed keeps image and mask batches aligned

image_generator = image_datagen.flow_from_directory(
    image_dir, subset='training', **flow_args)

mask_generator = mask_datagen.flow_from_directory(
    mask_dir, subset='training', color_mode='grayscale', **flow_args)
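
To feed these two generators into fit_generator, one common pattern (a sketch under my own assumptions, not code from the answer above: the helper name segmentation_generator is made up, and it presumes each grayscale mask pixel already stores the integer class index) is to pair them up and convert every mask batch into the flattened one-hot targets the model expects:

from keras.utils import to_categorical

def segmentation_generator(image_gen, mask_gen, nb_classes=20):
    # Yields (inputs, targets): inputs of shape (batch, H, W, 3) and targets of shape
    # (batch, out_h * out_w, nb_classes), matching the model's flattened softmax output.
    while True:
        images = next(image_gen)
        masks = next(mask_gen)[..., 0].astype('int32')   # grayscale masks -> integer class labels
        batch, h, w = masks.shape
        targets = to_categorical(masks.reshape(batch, h * w), num_classes=nb_classes)
        yield images, targets.reshape(batch, h * w, nb_classes)

model.fit_generator(
    segmentation_generator(image_generator, mask_generator),
    steps_per_epoch=image_generator.samples // batch_size,
    epochs=10)   # epoch count chosen arbitrarily for this sketch

Note that for the shapes to line up, the target length per sample must equal outputHeight * outputWidth of the model (448 * 640 = 286720 here), so the masks need to be at the model's output resolution before this conversion.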
