Why do we reshape the MNIST training images to (60000,28,28,1) instead of using them directly as (60,28,28)?

Question

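This code trains a model for image classification on the MNIST dataset. What I don't understand is why we reshape the training images to (60000,28,28,1) instead of using them directly as (60,28,28).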

import numpy as np
from tensorflow import keras

num_classes = 10
input_shape = (28, 28, 1)


(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

#print(x_train[0])

x_train = x_train.astype("float32") / 255 

#print(x_train[0])

x_test = x_test.astype("float32") / 255

print(x_train.shape)
print(x_test.shape)

# Add a trailing channel axis: (N, 28, 28) -> (N, 28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
print("x_train shape:", x_train.shape)
print("x_test shape:", x_test.shape)
print(x_train.shape[0], "train samples")
print(x_test.shape[0], "test samples")

print()
print(y_train)

# One-hot encode the integer class labels
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

print()
print(y_train)

Answer 1

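In machine learning it is very important to understand your data, and this case is a good example. There are 60000 images for training and 10000 images for testing.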

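Each image is 28*28 pixels, i.e. 28 pixels high and 28 pixels wide, hence (28, 28, 1). The trailing 1 is the number of color channels: 1 means a grayscale (black-and-white) image, whereas an RGB image would have 3.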

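So using (60, 28, 28, 1) here is not possible. As for why we use (60000, 28, 28, 1): it is the shape of our data array, because we have 60000 images of 28*28 pixels each, and every pixel has one value in this array. keras.datasets.mnist.load_data() returns the images as (60000, 28, 28), without the channel axis, and np.expand_dims(x_train, -1) adds it so that each sample matches input_shape = (28, 28, 1).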
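As a rough illustration of the shapes discussed above, here is a minimal sketch using a small random stand-in array (6 images rather than 60000, which is an assumption made only to keep the example light). It suggests that adding the channel axis changes the shape but not the pixel data.

```python
import numpy as np

# Stand-in for the raw MNIST array: 6 images instead of 60000 (assumption);
# the values are random, not real digits.
x = np.random.randint(0, 256, size=(6, 28, 28), dtype=np.uint8)
print(x.shape)  # (6, 28, 28): images, height, width

# Append a channel axis of length 1 (grayscale).
x4 = np.expand_dims(x, -1)
print(x4.shape)  # (6, 28, 28, 1): images, height, width, channels

# No pixel values are added or changed; only the shape differs.
print(x.size == x4.size)              # True
print(np.array_equal(x, x4[..., 0]))  # True
```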

To simplify, suppose we had only one image: its shape would be (1, 28, 28, 1), and it could easily be written out as a 28*28 matrix.
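The single-image case can be sketched the same way. The array below is a hypothetical batch of 3 images filled with consecutive numbers (not MNIST data), used only to show how one sample relates to a plain 28*28 matrix.

```python
import numpy as np

# Hypothetical batch of 3 grayscale images with a channel axis.
batch = np.arange(3 * 28 * 28, dtype=np.float32).reshape(3, 28, 28, 1)

# Slicing with 0:1 keeps the batch axis: a batch containing one image.
one = batch[0:1]
print(one.shape)  # (1, 28, 28, 1)

# Indexing away the two length-1 axes leaves the plain 28x28 matrix.
matrix = one[0, :, :, 0]
print(matrix.shape)  # (28, 28)
```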

Answer 2

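Reshaping is also important for fitting other architectures. For example, InceptionV3 requires inputs of at least 75*75, which you can produce with tensorflow.image.resize: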

import tensorflow
# x_train and x_test need the channel axis here: shape (N, 28, 28, 1)
x_train = tensorflow.image.resize(x_train, [75,75])
x_test = tensorflow.image.resize(x_test, [75,75])
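The channel axis also matters for this resize call. The sketch below uses a small random stand-in array (4 images, an assumption for illustration) and relies on the documented behavior that tf.image.resize treats a 3-D input as a single image of shape (height, width, channels) and a 4-D input as a batch.

```python
import numpy as np
import tensorflow as tf

# Stand-in for a few normalized MNIST images (random values, not real digits).
x = np.random.rand(4, 28, 28).astype("float32")

# Without a channel axis, the 3-D array is read as ONE image of
# height 4, width 28 and 28 channels.
wrong = tf.image.resize(x, [75, 75])

# With the channel axis, each of the 4 images is resized separately.
right = tf.image.resize(np.expand_dims(x, -1), [75, 75])

print(wrong.shape)  # (75, 75, 28)
print(right.shape)  # (4, 75, 75, 1)
```

Only the 4-D version keeps one 75*75 image per sample, which is one practical reason to add the trailing 1 before resizing.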

Answer 1: score 1, accepted, 2021-02-22 19:55:34

Answer 2: score 0, 2022-05-26 07:40:58
