Problem reading an S3 bucket when moving a TensorFlow model from a local machine to AWS SageMaker

Question

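When testing in Python on my local machine, I normally use the following to read a training set laid out as one subdirectory per class, each holding that class's image files: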

train_path = r"C:\temp\coins\PCGS - Gold\train"

train_batches = ImageDataGenerator().flow_from_directory(train_path, target_size=(100,100), classes=['0','1','2','3' etc...], batch_size=32)

Found 4100 images belonging to 22 classes.

But in a Jupyter notebook on AWS SageMaker, I am now pulling the files from an S3 bucket. I tried the following:

bucket = "coinpath"

train_path = 's3://{}/{}/train'.format(bucket, "v1")   #note that the directory structure is coinpath/v1/train where coinpath is the bucket

train_batches = ImageDataGenerator().flow_from_directory(train_path, target_size=(100,100), classes=['0','1','2','3' etc...], batch_size=32)

But I get: **Found 0 images belonging to 22 classes.**

Looking for some guidance on the right way to pull training data from S3.

Answer 1

From "Ideal way to read data in bucket stored batches for Keras ML training in Google Cloud Platform?": "ImageDataGenerator.flow_from_directory() currently does not allow you to stream data directly from a GCS bucket." The same limitation applies to an S3 bucket.
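A likely explanation for the "Found 0 images" message, sketched below. This assumes flow_from_directory() enumerates files with an ordinary walk of the local filesystem; that is my understanding of the Keras implementation, not something the quoted answer states. The helper name count_files is made up for illustration.

```python
import os


def count_files(directory):
    """Count the files an ordinary local directory walk can see."""
    return sum(len(files) for _, _, files in os.walk(directory))


# An S3 URI is not a path on the notebook instance's disk. os.walk silently
# ignores the missing directory, so nothing is found and no error is raised.
print(count_files("s3://coinpath/v1/train"))  # prints 0
```

Because the class names are passed in explicitly, the generator would still report 22 classes while finding no image files under them.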

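I had to download the images from S3 to the notebook instance first. That is also the best option for latency reasons, since training then reads from local disk rather than over the network.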
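One way the download step might look is sketched here with boto3. The helper name download_prefix and the local directory /tmp/coins/train are hypothetical; the bucket and prefix come from the question. The S3 client is passed in as an argument, so the function itself only relies on the client's get_paginator("list_objects_v2") and download_file calls.

```python
import os


def download_prefix(s3_client, bucket, prefix, local_dir):
    """Copy every object under s3://bucket/prefix into local_dir.

    The key structure below the prefix (one folder per class) is preserved.
    Returns the number of files downloaded.
    """
    prefix = prefix.rstrip("/") + "/"
    paginator = s3_client.get_paginator("list_objects_v2")
    downloaded = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip zero-byte "folder" placeholder objects
            relative = key[len(prefix):]
            target = os.path.join(local_dir, *relative.split("/"))
            os.makedirs(os.path.dirname(target), exist_ok=True)
            s3_client.download_file(bucket, key, target)
            downloaded += 1
    return downloaded


# Possible usage in the notebook (needs AWS credentials, so left commented out):
# import boto3
# from tensorflow.keras.preprocessing.image import ImageDataGenerator
# download_prefix(boto3.client("s3"), "coinpath", "v1/train", "/tmp/coins/train")
# train_batches = ImageDataGenerator().flow_from_directory(
#     "/tmp/coins/train", target_size=(100, 100), batch_size=32)
```

With the files on local disk, flow_from_directory() should behave as it did on the local machine. Leaving out the classes argument lets Keras infer the class names from the subdirectory names.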

Solution 1
0 Accepted 2020-05-18 02:23:32
