How to use Tensorflow dataset.cache() properly
My TensorFlow version is 2.6.0, and I am trying to cache my dataset on disk with dataset.cache(dir_1). But when I train my model with the cached dataset, the training-set accuracy reported by model.evaluate() differs from the one reported during model.fit().
all_data_dir = 'D:\\jupyterWorkingSpace\\image_data\\feiyan X'
all_data_dir = pathlib.Path(all_data_dir)
batch_size = 8
img_height = 512
img_width = 512
train_ds = tf.keras.utils.image_dataset_from_directory(
    'D:\\jupyterWorkingSpace\\image_data\\feiyan X\\train',
    seed=123,
    label_mode='binary',
    shuffle=True,
    image_size=(img_height, img_width),
    batch_size=batch_size)
val_ds = tf.keras.utils.image_dataset_from_directory(
    'D:\\jupyterWorkingSpace\\image_data\\feiyan X\\test',
    seed=123,
    label_mode='binary',
    shuffle=True,
    image_size=(img_height, img_width),
    batch_size=batch_size)
normalization_layer = keras.layers.Rescaling(1./255)
AUTOTUNE = tf.data.AUTOTUNE
dir_1 = './train_cache/a'
dir_2 = './val_cache/a'
normalized_train_ds_ = (train_ds
                        .cache(dir_1)
                        .shuffle(buffer_size=20)
                        .map(lambda x, y: (normalization_layer(x), y))
                        .prefetch(AUTOTUNE))
normalized_val_ds_ = (val_ds
                      .cache(dir_2)
                      .shuffle(buffer_size=20)
                      .map(lambda x, y: (normalization_layer(x), y))
                      .prefetch(AUTOTUNE))
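As an aside on ordering: tf.data caches whatever comes before the cache() call, so deterministic, expensive work (decoding, normalization) is usually placed before cache(), while per-epoch randomness (shuffle) goes after it. A minimal sketch with a toy dataset (the range data and the doubling map are placeholders, not the image pipeline above):

```python
import tensorflow as tf

ds = tf.data.Dataset.range(10)
pipeline = (ds
            .map(lambda x: x * 2)      # deterministic transform: gets cached
            .cache()                   # in-memory cache (pass a path for on-disk)
            .shuffle(buffer_size=10)   # re-shuffles every epoch, after the cache
            .prefetch(tf.data.AUTOTUNE))

# The cached elements are the mapped values 0, 2, ..., 18, yielded in a
# shuffled order; sorting recovers them.
print(sorted(int(x) for x in pipeline))
```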
...
loss = weightedBCE(normal_train_size, feiyan_train_size)
exponential_decay_fn = exponential_decay(lr0=0.0005, s=10)
lr_scheduler = keras.callbacks.LearningRateScheduler(exponential_decay_fn)
earlystop = keras.callbacks.EarlyStopping(monitor='val_accuracy', min_delta=0.001, patience=20, mode='max')
optimizer = keras.optimizers.SGD(learning_rate=0.00005, momentum=0.9, nesterov=True)
model.compile(loss=loss, optimizer=optimizer, metrics=['accuracy'])
savebestmodel = keras.callbacks.ModelCheckpoint('1.h5',
                                                monitor='val_accuracy',
                                                verbose=1,
                                                save_best_only=True,
                                                mode='auto')
history = model.fit(normalized_train_ds_,
                    epochs=1,
                    validation_data=normalized_val_ds_,
                    callbacks=[savebestmodel, earlystop, lr_scheduler])
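The exponential_decay factory used above is the asker's own helper and is not shown; a common version of this pattern returns a schedule that divides the rate by 10 every s epochs. A hypothetical sketch assuming that convention:

```python
def exponential_decay(lr0, s):
    """Return a schedule fn for LearningRateScheduler: lr0 * 0.1**(epoch / s)."""
    def exponential_decay_fn(epoch):
        return lr0 * 0.1 ** (epoch / s)
    return exponential_decay_fn

fn = exponential_decay(lr0=0.0005, s=10)
print(fn(0))   # 0.0005 at epoch 0
print(fn(10))  # one decade lower after s epochs: 5e-05
```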
The problem is solved. It was caused by incorrectly freezing ResNet50.
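For reference, a Keras backbone is frozen by setting trainable = False on the base model before compiling; toggling trainable after compile() without recompiling is a classic source of the train/evaluate metric mismatch described above. A minimal sketch (weights=None and the small input shape just keep the example light; the one-unit head is hypothetical, matching the binary labels in the question):

```python
import tensorflow as tf

# Backbone without pretrained weights, so nothing is downloaded.
base = tf.keras.applications.ResNet50(include_top=False, weights=None,
                                      input_shape=(64, 64, 3), pooling='avg')
base.trainable = False  # freeze every layer of the backbone

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(1, activation='sigmoid'),  # binary classification head
])
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])

# Only the Dense head's kernel and bias remain trainable.
print(len(base.trainable_weights))
print(len(model.trainable_weights))
```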