Change the model file save location on an AWS SageMaker training job

Question

I am trying to run a custom Python/sklearn SageMaker script on AWS, basically learning from this example: https://github.com/aws/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_randomforest/Sklearn_on_SageMaker_end2end.ipynb

Everything works fine if I define the arguments, train the model, and write the output file:

parser.add_argument('--model-dir', type=str, default=os.environ.get('SM_MODEL_DIR'))
parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
parser.add_argument('--test', type=str, default=os.environ.get('SM_CHANNEL_TEST'))

# train the model...

joblib.dump(model, os.path.join(args.model_dir, "model.joblib"))

And call the job with:

aws_sklearn.fit({'train': 's3://path/to/train', 'test': 's3://path/to/test'}, wait=False)

In this case the model gets stored in a different, auto-generated bucket, which I do not want. I want the output (.joblib file) in the same S3 bucket I took the data from. So I add the parameter model-dir:

aws_sklearn.fit({'train': 's3://path/to/train', 'test': 's3://path/to/test', 'model-dir': 's3://path/to/model'}, wait=False)

But it results in an error: FileNotFoundError: [Errno 2] No such file or directory: 's3://path/to/model/model.joblib'

The same happens if I hardcode the output path inside the training script.

So the main question: how can I get the output file in the bucket of my choice?

Answer 1

You can use the output_path parameter when you define the estimator. If you use model_dir, I guess you have to create that bucket beforehand, but you have the advantage that artifacts can be saved in real time during training (if the instance has rights on S3). You can take a look at my repo for this specific case.
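As a rough illustration of the output_path suggestion, here is a minimal sketch assuming SageMaker Python SDK v2 (where the estimator takes instance_count and instance_type). The script name train.py, the instance type, the framework version, and the helper names estimator_kwargs and launch_training are placeholders chosen for this example, not taken from the question. The training script itself can keep writing to the local directory given by SM_MODEL_DIR: joblib.dump writes to the local filesystem, which is why an s3:// URI fails there, and SageMaker uploads the contents of that directory once training ends.

```python
def estimator_kwargs(output_s3_uri, role_arn):
    """Build keyword arguments for an SKLearn estimator that stores
    its artifacts under output_s3_uri instead of the default bucket."""
    if not output_s3_uri.startswith("s3://"):
        raise ValueError("output_path must be an S3 URI such as s3://bucket/prefix")
    return {
        "entry_point": "train.py",       # hypothetical training script name
        "role": role_arn,                # IAM role with write access to the bucket
        "instance_count": 1,
        "instance_type": "ml.m5.large",  # assumed instance type
        "framework_version": "0.23-1",   # assumed scikit-learn container version
        "output_path": output_s3_uri,    # where SageMaker uploads the model artifact
    }


def launch_training(output_s3_uri, role_arn):
    """Start the training job; needs the sagemaker SDK and AWS credentials."""
    # Imported here so estimator_kwargs can be used without the SDK installed.
    from sagemaker.sklearn.estimator import SKLearn

    estimator = SKLearn(**estimator_kwargs(output_s3_uri, role_arn))
    # The dict passed to fit() names input data channels only.
    estimator.fit({"train": "s3://path/to/train", "test": "s3://path/to/test"}, wait=False)
    return estimator
```

With this setup the artifact should land at <output_path>/<training-job-name>/output/model.tar.gz, and model.joblib sits inside that archive rather than as a bare file in the bucket.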

Answer 1: 2, accepted, 2021-01-13 13:47:25
