steps_per_epoch and validation_steps for infinite Dataset in Keras Model
I have a huge dataset of CSV files with a total volume of around 200 GB, and I don't know the total number of records in it. I'm using make_csv_dataset to create a PrefetchDataset generator.
I'm running into a problem: TensorFlow complains that steps_per_epoch and validation_steps must be specified for an infinite dataset.
How can I specify steps_per_epoch and validation_steps?
Can I pass these parameters as a percentage of the total dataset size?
Can I somehow avoid these parameters altogether, since I want my whole dataset to be iterated over in every epoch?
I think this SO thread answers the case where the total number of records is known in advance.
Here is a screenshot from the documentation, but I'm not understanding it properly. What does the last line mean?
I see no other option than iterating through your entire dataset once to count the number of steps.
import tensorflow as tf

# num_epochs=1 makes the dataset finite, so the loop terminates.
ds = tf.data.experimental.make_csv_dataset('myfile.csv', batch_size=16, num_epochs=1)

for ix, _ in enumerate(ds, 1):
    pass

print('The total number of steps is', ix)
Don't forget the num_epochs argument: by default, make_csv_dataset repeats the data indefinitely, and the loop would never terminate.
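Once the total step count is known, the "percentage of total dataset size" question can be handled by splitting that count yourself. A minimal sketch, assuming you want a fixed validation fraction (the split_steps helper and the 0.2 ratio are hypothetical choices, not anything Keras provides):

```python
import math

def split_steps(total_steps, val_fraction=0.2):
    # Split a known total step count into training and validation steps.
    # val_fraction is a user-chosen split ratio, applied to step counts.
    validation_steps = math.ceil(total_steps * val_fraction)
    steps_per_epoch = total_steps - validation_steps
    return steps_per_epoch, validation_steps

steps_per_epoch, validation_steps = split_steps(1000, val_fraction=0.2)
print(steps_per_epoch, validation_steps)  # 800 200
```

These values can then be passed to model.fit(..., steps_per_epoch=steps_per_epoch, validation_steps=validation_steps), with the train and validation datasets repeated so that each epoch consumes exactly the intended number of batches.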