Figuring out how to define input_shape in a Keras Conv2D layer for my own dataset

Question

TL;DR

I get these errors when defining the input shape:

ValueError: Error when checking input: expected conv2d_1_input to have 4 dimensions, but got array with shape (4000, 20, 20)

or

ValueError: Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=5

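Long, explicit version:

I am trying to classify my own dataset using different Keras neural networks.

So far I have had success with my ANN, but I am having trouble with the CNN.

The dataset

The complete code

The dataset consists of matrices of a specified size filled with 0s, which may contain a submatrix of a specified size filled with 1s. The submatrix is optional, and the goal is to train the NN to predict whether a matrix contains the submatrix or not. To make detection harder, I add various kinds of noise to the matrices.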
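The description above can be sketched with NumPy as follows. This is only an illustration, not the asker's generator: the function names, the 5x5 block size, and the noise model (flipping each entry independently with a small probability) are all assumptions.

```python
import numpy as np

def make_matrix(rng, size=20, sub_size=5, with_submatrix=True, noise=0.1):
    """Return a size x size 0/1 matrix, optionally containing a block of ones."""
    m = np.zeros((size, size), dtype=np.uint8)
    if with_submatrix:
        # Pick a random top-left corner so the block fits entirely inside.
        r, c = rng.integers(0, size - sub_size + 1, size=2)
        m[r:r + sub_size, c:c + sub_size] = 1
    # Assumed noise model: flip each entry independently with probability `noise`.
    flips = rng.random((size, size)) < noise
    return np.where(flips, 1 - m, m).astype(np.uint8)

def make_dataset(n=4000, seed=0, **kwargs):
    """Return (data, labels); label 1 means the matrix contains the submatrix."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    data = np.stack(
        [make_matrix(rng, with_submatrix=bool(y), **kwargs) for y in labels]
    )
    return data, labels

data, labels = make_dataset(n=8)
print(data.shape, labels.shape)  # (8, 20, 20) (8,)
```

Note that the result has three dimensions (samples, rows, columns), which is exactly the shape reported in the error messages above.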

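Here is a picture of a single noisy matrix; the black parts are 0s and the white parts are 1s. There is a 1:1 correspondence between the pixels of the image and the entries of the matrix.

I save and load them as text using numpy savetxt and loadtxt. The file then looks like this: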

#________________Array__Info:__(4000, 20, 20)__________
#________________Entry__Number__1________
0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 1
0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 1
0 0 1 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 1 0
0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1
0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 1 1 1
0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 1 1 0 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0
0 1 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1
1 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0
#________________Entry__Number__2________
0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0
1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1
1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 1 0 0
0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0
0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0
1 0 1 0 0 1 0 1 0 1 0 0 0 0 1 1 1 0 0 1
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0
0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1
0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0 0 0 1 1 1 1 1 0 1 0 0
0 0 0 0 0 0 0 1 1 0 1 1 1 1 1 1 0 0 0 1
0 1 0 0 0 0. . . . . . (and so on)
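A file in this layout could be written and read back roughly as shown below. This is a sketch, not the asker's `dsg.loadDataset`, whose implementation is not shown here; `save_matrices` and `load_matrices` are hypothetical helper names. It relies on `np.loadtxt` skipping lines that start with `#` by default.

```python
import os
import tempfile
import numpy as np

def save_matrices(path, data):
    """Write a (n, rows, cols) 0/1 array as text, with a commented header per entry."""
    with open(path, "w") as f:
        f.write(f"#________________Array__Info:__{data.shape}__________\n")
        for i, matrix in enumerate(data, start=1):
            f.write(f"#________________Entry__Number__{i}________\n")
            np.savetxt(f, matrix, fmt="%d")

def load_matrices(path, rows=20, cols=20):
    """Read the file back into a (n, rows, cols) array."""
    # Comment lines are skipped, so the result is one tall (n * rows, cols) array.
    flat = np.loadtxt(path, dtype=np.uint8)
    return flat.reshape(-1, rows, cols)

# Round trip on a small random stand-in array.
rng = np.random.default_rng(0)
original = rng.integers(0, 2, size=(3, 20, 20)).astype(np.uint8)
path = os.path.join(tempfile.mkdtemp(), "test_input.txt")
save_matrices(path, original)
restored = load_matrices(path)
print(restored.shape)  # (3, 20, 20)
```

The loaded array has no channel axis, which is why it cannot be fed to Conv2D without a further reshape.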

The complete dataset

The CNN code

On GitHub

Code (imports not included):

# data

inputData = dsg.loadDataset("test_input.txt")
outputData = dsg.loadDataset("test_output.txt")
print("the size of the dataset is: ", inputData.shape, " of type: ", type(inputData))


# parameters

# CNN

cnn = Sequential()

cnn.add(Conv2D(32, (3, 3), input_shape = inputData.shape, activation = 'relu'))

cnn.add(MaxPooling2D(pool_size = (2, 2)))

cnn.add(Flatten())

cnn.add(Dense(units=64, activation='relu'))

cnn.add(Dense(units=1, activation='sigmoid'))

cnn.compile(optimizer = "adam", loss = 'binary_crossentropy', metrics = ['accuracy'])

cnn.summary()

cnn.fit(inputData,
        outputData,
        epochs=100,
        validation_split=0.2)

The problem:

I get this output, ending in an error message:

Using TensorFlow backend.
the size of the dataset is:  (4000, 20, 20)  of type:  <class 'numpy.ndarray'>
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 3998, 18, 32)      5792      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 1999, 9, 32)       0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 575712)            0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                36845632  
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 65        
=================================================================
Total params: 36,851,489
Trainable params: 36,851,489
Non-trainable params: 0
_________________________________________________________________
Traceback (most recent call last):
  File "D:\GOOGLE DRIVE\School\sem-2-2018\BSP2\BiCS-BSP-2\CNN\matrixCNN.py", line 47, in <module>
    validation_split=0.2)
  File "C:\Code\Python\lib\site-packages\keras\models.py", line 963, in fit
    validation_steps=validation_steps)
  File "C:\Code\Python\lib\site-packages\keras\engine\training.py", line 1637, in fit
    batch_size=batch_size)
  File "C:\Code\Python\lib\site-packages\keras\engine\training.py", line 1483, in _standardize_user_data
    exception_prefix='input')
  File "C:\Code\Python\lib\site-packages\keras\engine\training.py", line 113, in _standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking input: expected conv2d_1_input to have 4 dimensions, but got array with shape (4000, 20, 20)

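I really don't know how to fix this. I looked at the Conv2D documentation, which says the input should be given in the form (batch, height, width, channels). In my case that would be (I think):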

input_shape=(4000, 20, 20, 1)

since I have 4000 matrices of size 20 * 20, containing only 1s and 0s.

But then I get this error message:

Using TensorFlow backend.
the size of the dataset is:  (4000, 20, 20)  of type:  <class 'numpy.ndarray'>
Traceback (most recent call last):
  File "D:\GOOGLE DRIVE\School\sem-2-2018\BSP2\BiCS-BSP-2\CNN\matrixCNN.py", line 30, in <module>
    cnn.add(Conv2D(32, (3, 3), input_shape = (4000, 12, 12, 1), activation = 'relu'))
  File "C:\Code\Python\lib\site-packages\keras\models.py", line 467, in add
    layer(x)
  File "C:\Code\Python\lib\site-packages\keras\engine\topology.py", line 573, in __call__
    self.assert_input_compatibility(inputs)
  File "C:\Code\Python\lib\site-packages\keras\engine\topology.py", line 472, in assert_input_compatibility
    str(K.ndim(x)))
ValueError: Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=5

In what exact shape should I pass my data to the CNN?

All the files are available here. Thank you for your time.

Answer 1

Your CNN expects input of shape (num_samples, 20, 20, 1), while your data has shape (num_samples, 20, 20).

Since you only have 1 channel, you can reshape your data to (4000, 20, 20, 1):

inputData = inputData.reshape(-1, 20, 20, 1)
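For reference, there are a few equivalent NumPy spellings for adding the trailing channel axis. The sketch below uses a zero-filled stand-in array in place of the real `inputData`.

```python
import numpy as np

# Stand-in for the real data, same shape as in the question.
inputData = np.zeros((4000, 20, 20), dtype=np.float32)

a = inputData.reshape(-1, 20, 20, 1)    # the form used in this answer
b = np.expand_dims(inputData, axis=-1)  # append a length-1 axis
c = inputData[..., np.newaxis]          # the same thing via indexing

print(a.shape, b.shape, c.shape)        # each is (4000, 20, 20, 1)
print(np.shares_memory(inputData, a))   # True: a view of the same data, not a copy
```

Using -1 for the first dimension lets NumPy infer the number of samples, so the same line works if the dataset size changes.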

If you would rather do the reshaping inside the model, just add a Reshape layer as the first layer:

model.add(Reshape(input_shape = (20, 20), target_shape=(20, 20, 1)))
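It may also help to see why the first attempt in the question built a model at all. Because input_shape describes a single sample, passing (4000, 20, 20) is read as one image with height 4000, width 20, and 20 channels. The plain-Python sketch below reproduces the numbers in the question's summary under that reading. The helper names are made up for illustration, and the formulas assume stride 1, 'valid' padding, and non-overlapping pooling, which are the Keras defaults used there.

```python
def conv2d_valid(h, w, kernel, filters):
    """Output (h, w, channels) of a stride-1 'valid' convolution."""
    kh, kw = kernel
    return h - kh + 1, w - kw + 1, filters

def max_pool(h, w, c, pool):
    """Output shape of non-overlapping max pooling (floor division)."""
    ph, pw = pool
    return h // ph, w // pw, c

def summarize(input_shape, filters=32, kernel=(3, 3), pool=(2, 2), dense_units=64):
    """Shapes and parameter counts for Conv2D -> MaxPooling2D -> Flatten -> Dense."""
    h, w, channels = input_shape
    conv = conv2d_valid(h, w, kernel, filters)
    # One kernel per (input channel, filter) pair, plus one bias per filter.
    conv_params = kernel[0] * kernel[1] * channels * filters + filters
    pooled = max_pool(*conv, pool)
    flat = pooled[0] * pooled[1] * pooled[2]
    dense_params = flat * dense_units + dense_units
    return conv, conv_params, pooled, flat, dense_params

# The shape from the question, misread as (height, width, channels):
print(summarize((4000, 20, 20)))
# ((3998, 18, 32), 5792, (1999, 9, 32), 575712, 36845632)

# The intended per-sample shape:
print(summarize((20, 20, 1)))
# ((18, 18, 32), 320, (9, 9, 32), 2592, 165952)
```

The first result matches the summary printed in the question, including the 36.8 million parameters in the first Dense layer, which is a useful sign that the input shape has been misinterpreted.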

Answer 2

Thanks to Primusa's help and another thread I found, I got it working. Here is what I added:

inputData = inputData.reshape(4000, 20, 20, 1)
outputData = outputData.reshape(4000, 1)

and the Conv2D layer is now (input_shape describes a single sample, without the batch dimension):

cnn.add(Conv2D(32, (3, 3), input_shape = (20, 20, 1), activation = 'relu'))
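Putting the pieces together, a minimal end-to-end sketch might look like this. It assumes the Keras bundled with TensorFlow 2 (the question used standalone Keras), declares the per-sample shape with `keras.Input` instead of the `input_shape` argument, and uses random stand-in arrays in place of the real files, so the training result itself is meaningless.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Random stand-in data; the real arrays come from the asker's text files.
rng = np.random.default_rng(0)
inputData = rng.integers(0, 2, size=(64, 20, 20)).astype("float32")
outputData = rng.integers(0, 2, size=(64,)).astype("float32")

# Add the channel axis to the inputs and make the labels a column vector.
inputData = inputData.reshape(-1, 20, 20, 1)
outputData = outputData.reshape(-1, 1)

cnn = keras.Sequential([
    keras.Input(shape=(20, 20, 1)),  # per-sample shape, no batch dimension
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
cnn.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# One short epoch just to confirm that the shapes line up.
cnn.fit(inputData, outputData, epochs=1, validation_split=0.2, verbose=0)
print(cnn.output_shape)
```

The batch dimension appears only in the data passed to fit, never in the shape given to the model.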

2 solutions

Solution 1
Score 2, accepted, 2018-04-15 16:41:34

Solution 2
Score 1, 2018-04-15 21:44:29
