Keras denoising autoencoder - logits and labels must have the same first dimension, got logits shape [986624,38] and labels shape [32]

Question

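I am trying to build a denoising autoencoder for a face recognition project. For the initial tests I am using the cropped yalefaces dataset, with the training (noisy) images in one folder (with a separate subfolder for each class/person) and the test (regular) images in another folder with the same structure. But every time I run a test I get the following error:

InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [986624,38] and labels shape [32] [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at \Desktop\projetos\conv autoencoder teste.py:75) ]] [Op:__inference_train_function_8691]

Function call stack: train_function

I am using Keras 2.6.0.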

import tensorflow as tf
from tensorflow.keras import Input, Model

batch_size = 32
height = 168
width = 192
depth = 1
chanDim = -1
inputShape = (height, width, depth)

data_dir='C:\\Users\\User\\Desktop\\projetos\\Test1\\Data'
train_data_dir='C:\\Users\\User\\Desktop\\projetos\\Test1\\Test_Images\\sp_noise'
images_noisy = tf.keras.preprocessing.image_dataset_from_directory(directory=train_data_dir, labels='inferred', label_mode='int',class_names=None, color_mode='grayscale', batch_size=batch_size, image_size=(height,width),shuffle=True,seed=2457,interpolation='bilinear')
images_regular = tf.keras.preprocessing.image_dataset_from_directory(directory=data_dir, labels='inferred', label_mode='int',class_names=None, color_mode='grayscale', batch_size=batch_size, image_size=(height,width),shuffle=True,seed=2457,interpolation='bilinear')

datagen = tf.keras.preprocessing.image.ImageDataGenerator()
train_it = datagen.flow_from_directory(train_data_dir, class_mode='sparse', batch_size=32,target_size=(height, width),color_mode='grayscale')
val_it = datagen.flow_from_directory(data_dir, class_mode='sparse', batch_size=32,target_size=(height, width),color_mode='grayscale')

#input = tf.keras.layers.Input(shape=(inputShape))

Input_img = Input(shape=(168,192,1))  
#Input_img = Input(shape=(None))
    
#encoding architecture
#x1 = tf.keras.layers.Reshape((168, 192, 1), input_shape=(None, 168, 192, 1))(Input_img)
x1 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same')(Input_img)
x1 = tf.keras.layers.MaxPooling2D( (2, 2), padding='same')(x1)
x2 = tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x1)
x2 = tf.keras.layers.MaxPooling2D( (2, 2), padding='same')(x2)
x3 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', padding='same')(x2)
encoded    = tf.keras.layers.MaxPooling2D( (2, 2), padding='same')(x3)
    
# decoding architecture
x3 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', padding='same')(encoded)
x3 = tf.keras.layers.UpSampling2D((2, 2))(x3)
x2 = tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x3)
x2 = tf.keras.layers.UpSampling2D((2, 2))(x2)
x1 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(x2)
x1 = tf.keras.layers.UpSampling2D((2, 2))(x1)
decoded   = tf.keras.layers.Conv2D(38, (3, 3), activation='sigmoid', padding='same')(x1)

autoencoder = Model(Input_img, decoded)
autoencoder.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False))

history = autoencoder.fit(
    images_noisy,
    epochs=20,
    batch_size=32,
    shuffle=True,
    validation_data=(images_regular))

autoencoder.summary()

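At this point I frankly have no idea what is causing the problem. I have used datasets built with the image_dataset_from_directory function in a face recognition/classification CNN without any issues, but nothing seems to work here.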

Answer 1

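I was able to reproduce the error. In an autoencoder, the input and output dimensions need to be the same. Changing the decoder architecture as follows will help.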

#decoding architecture 
x3 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', padding='same')(encoded)
x3 = tf.keras.layers.UpSampling2D((2, 2))(x3)
x2 = tf.keras.layers.Conv2D(1, (3, 3), activation='relu', padding='same')(x3)
x1 = tf.keras.layers.UpSampling2D((2, 2))(x2)
decoded = tf.keras.layers.UpSampling2D((2, 2))(x1)
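
To see where the 986624 in the error message comes from, the layer shapes can be traced by hand. The following is only a sketch in plain Python, not TensorFlow code: the helper functions (conv2d, max_pool, upsample, encoder, question_decoder, answer_decoder) are hypothetical names written for this illustration. It assumes the usual Keras shape rules: padding='same' with stride 1 keeps height and width, the default padding='valid' shrinks each by kernel_size - 1, and 'same' pooling rounds up.

```python
import math

def conv2d(shape, filters, kernel=3, padding="same"):
    """Output shape of a stride-1 Conv2D layer applied to (height, width, channels)."""
    h, w, _ = shape
    if padding == "valid":  # Keras default when padding is not given
        h, w = h - kernel + 1, w - kernel + 1
    return (h, w, filters)

def max_pool(shape, pool=2):
    """Output shape of MaxPooling2D with padding='same' (sizes round up)."""
    h, w, c = shape
    return (math.ceil(h / pool), math.ceil(w / pool), c)

def upsample(shape, factor=2):
    """Output shape of UpSampling2D."""
    h, w, c = shape
    return (h * factor, w * factor, c)

def encoder(shape):
    # Three Conv2D + MaxPooling2D stages, as in the question
    for filters in (64, 32, 16):
        shape = max_pool(conv2d(shape, filters))
    return shape

def question_decoder(shape):
    shape = upsample(conv2d(shape, 16))
    shape = upsample(conv2d(shape, 32))
    # This Conv2D has no padding argument in the question, so it is 'valid'
    shape = upsample(conv2d(shape, 64, padding="valid"))
    return conv2d(shape, 38)

def answer_decoder(shape):
    shape = upsample(conv2d(shape, 16))
    shape = upsample(conv2d(shape, 1))
    return upsample(shape)

batch_size = 32
encoded = encoder((168, 192, 1))
q_out = question_decoder(encoded)
a_out = answer_decoder(encoded)

# The sparse cross-entropy loss flattens every axis except the last one
logit_rows = batch_size * q_out[0] * q_out[1]

print("encoded:", encoded)
print("question decoder output:", q_out)
print("answer decoder output:", a_out)
print("flattened logits rows:", logit_rows)
```

Under these assumptions the question's decoder ends at 164 x 188 x 38 rather than 168 x 192 x 1, and 32 * 164 * 188 matches the first dimension reported in the error, while the decoder above returns to 168 x 192 x 1. Matching the spatial dimensions is probably only part of the fix, though: with label_mode='int' the dataset yields one integer class id per image (hence labels shape [32]), whereas a denoising autoencoder is normally trained on (noisy image, clean image) pairs with a pixel-wise loss such as mean squared error. Loading both folders with label_mode=None and pairing them up is one possible way to do that, but it is not something the answer above confirms.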

Solution 1
0 2022-07-06 07:32:46