For an infinite dataset, is the data used in each epoch the same?
In TensorFlow, suppose I have a dataset coming from a generator:

dataset = tf.data.Dataset.from_generator(gen...)

and this generator produces an infinite stream of non-repeating data (like an infinite, non-recurring decimal).

model.fit(dataset, steps_per_epoch=10000, epochs=5)

Now, across these 5 training epochs, is the same data used each time, i.e. always the first 10000 items from the generator? Or does epoch 1 use items 0-9999, epoch 2 use items 10000-19999, and so on?

And what about the initial_epoch parameter? If I set it to 1, will the model start training from the 10000th item?

model.fit(dataset, steps_per_epoch=10000, epochs=5, initial_epoch=1)
Update: this simple test shows that the dataset is reset on each call to model.fit():

import numpy as np
import tensorflow as tf
from tensorflow.keras import Input, Model

def gen():
    i = 1
    while True:
        yield np.array([[i]]), np.array([[0]])
        i += 1

ds = tf.data.Dataset.from_generator(gen, output_types=(tf.int32, tf.int32)).batch(3)

x = Input(shape=(1, 1))
model = Model(inputs=x, outputs=x)
model.compile('adam', loss=lambda true, pred: tf.reduce_mean(pred))

for i in range(10):
    model.fit(ds, steps_per_epoch=5, epochs=1)
Output:
1/5 [=====>........................] - ETA: 0s - loss: 2.0000
5/5 [==============================] - 0s 9ms/step - loss: 8.0000
1/5 [=====>........................] - ETA: 0s - loss: 2.0000
5/5 [==============================] - 0s 2ms/step - loss: 8.0000
1/5 [=====>........................] - ETA: 0s - loss: 2.0000
5/5 [==============================] - 0s 2ms/step - loss: 8.0000
1/5 [=====>........................] - ETA: 0s - loss: 2.0000
5/5 [==============================] - 0s 2ms/step - loss: 8.0000
1/5 [=====>........................] - ETA: 0s - loss: 2.0000
5/5 [==============================] - 0s 2ms/step - loss: 8.0000
1/5 [=====>........................] - ETA: 0s - loss: 2.0000
5/5 [==============================] - 0s 2ms/step - loss: 8.0000
1/5 [=====>........................] - ETA: 0s - loss: 2.0000
5/5 [==============================] - 0s 2ms/step - loss: 8.0000
1/5 [=====>........................] - ETA: 0s - loss: 2.0000
5/5 [==============================] - 0s 2ms/step - loss: 8.0000
1/5 [=====>........................] - ETA: 0s - loss: 2.0000
5/5 [==============================] - 0s 2ms/step - loss: 8.0000
1/5 [=====>........................] - ETA: 0s - loss: 2.0000
5/5 [==============================] - 0s 2ms/step - loss: 8.0000
5 epochs in a single call:
model.fit(ds, steps_per_epoch=5, epochs=5)
Output:
Epoch 1/5
1/5 [=====>........................] - ETA: 0s - loss: 2.0000
5/5 [==============================] - 0s 9ms/step - loss: 8.0000
Epoch 2/5
1/5 [=====>........................] - ETA: 0s - loss: 17.0000
5/5 [==============================] - 0s 2ms/step - loss: 23.0000
Epoch 3/5
1/5 [=====>........................] - ETA: 0s - loss: 32.0000
5/5 [==============================] - 0s 2ms/step - loss: 38.0000
Epoch 4/5
1/5 [=====>........................] - ETA: 0s - loss: 47.0000
5/5 [==============================] - 0s 2ms/step - loss: 53.0000
Epoch 5/5
1/5 [=====>........................] - ETA: 0s - loss: 62.0000
5/5 [==============================] - 0s 2ms/step - loss: 68.0000
No, the data used is different. Keras uses steps_per_epoch to determine the length of each epoch (since the generator has no length), so it knows when an epoch ends (and when to invoke checkpoint callbacks and the like).
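The role of steps_per_epoch can be sketched in plain Python (no TensorFlow needed; `infinite_batches` below is a hypothetical stand-in for the generator dataset): each epoch simply draws a fixed number of items from an otherwise endless stream, which matches the advancing losses in the 5-epochs-in-one-call output above.

```python
import itertools

def infinite_batches():
    """Hypothetical stand-in for an endless, non-repeating generator dataset."""
    i = 1
    while True:
        yield i
        i += 1

steps_per_epoch = 5
stream = infinite_batches()
# Each "epoch" draws exactly steps_per_epoch items from the shared stream,
# so successive epochs within one run see fresh data.
epoch_1 = list(itertools.islice(stream, steps_per_epoch))  # [1, 2, 3, 4, 5]
epoch_2 = list(itertools.islice(stream, steps_per_epoch))  # [6, 7, 8, 9, 10]
```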
initial_epoch is only the number displayed for the epoch; it is useful when you want to resume training from a checkpoint (see the fit method documentation), and it has nothing to do with data iteration.
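To illustrate that initial_epoch only shifts the numbering, here is a minimal pure-Python sketch (the helper `epoch_labels` is hypothetical, not a Keras API): fit performs epochs - initial_epoch passes, labeled from initial_epoch upward, rather than skipping ahead in the data.

```python
def epoch_labels(epochs, initial_epoch=0):
    """Hypothetical helper mirroring how Keras numbers epochs: fit performs
    epochs - initial_epoch passes, labeled initial_epoch .. epochs - 1."""
    return list(range(initial_epoch, epochs))

epoch_labels(5)                   # [0, 1, 2, 3, 4] -> five passes
epoch_labels(5, initial_epoch=1)  # [1, 2, 3, 4]    -> only four passes
```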
If you pass the same dataset to the model.fit method, it is reset after every call (thanks to the OP for this information).
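That reset can be mimicked in plain Python: re-creating the iterator on every call, which is effectively what fit does with a Dataset, starts consumption over from the beginning, just as each of the ten fit calls above restarted at a first-batch loss of 2.0. `infinite_batches` is again a hypothetical stand-in for the generator:

```python
import itertools

def infinite_batches():
    """Hypothetical stand-in for the generator backing the Dataset."""
    i = 1
    while True:
        yield i
        i += 1

# A fresh iterator is built for each call, so consumption restarts from
# the first item every time -- the "reset" behavior observed above.
call_1 = list(itertools.islice(infinite_batches(), 5))  # [1, 2, 3, 4, 5]
call_2 = list(itertools.islice(infinite_batches(), 5))  # [1, 2, 3, 4, 5] again
```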