Change model file save location on AWS SageMaker Training Job
I am trying to run a custom Python/sklearn SageMaker script on AWS, basically learning from this example: https://github.com/aws/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_randomforest/Sklearn_on_SageMaker_end2end.ipynb
Everything works fine if I define the arguments, train the model, and write the output file:
parser.add_argument('--model-dir', type=str, default=os.environ.get('SM_MODEL_DIR'))
parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
parser.add_argument('--test', type=str, default=os.environ.get('SM_CHANNEL_TEST'))
# train the model...
joblib.dump(model, os.path.join(args.model_dir, "model.joblib"))
And start the job with:
aws_sklearn.fit({'train': 's3://path/to/train', 'test': 's3://path/to/test'}, wait=False)
In this case the model gets stored in a different, auto-generated bucket, which I do not want. I want to get the output (.joblib file) in the same S3 bucket I took the data from. So I added the model-dir parameter:
aws_sklearn.fit({'train': 's3://path/to/train', 'test': 's3://path/to/test', 'model-dir': 's3://path/to/model'}, wait=False)
But it results in an error:
FileNotFoundError: [Errno 2] No such file or directory: 's3://path/to/model/model.joblib'
The same happens if I hardcode the output path inside the training script.
So the main question: how can I get the output file in the bucket of my choice?
You can use the output_path parameter when you define the estimator. If you use model_dir instead, I guess you have to create that bucket beforehand, but you have the advantage that artifacts can be saved in real time during training (if the instance has permissions on S3). You can take a look at my repo for this specific case.
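To make the output_path suggestion concrete, here is a minimal sketch, not a definitive setup. It assumes version 2 of the SageMaker Python SDK (where the arguments are named instance_count and instance_type). The script name, instance type, framework version, and the "model" prefix are illustrative assumptions, and model_output_path and build_estimator are hypothetical helpers, not part of the SDK. The training script itself can keep writing to the local SM_MODEL_DIR. The FileNotFoundError in the question arises because joblib.dump opens a path on the local filesystem, so an s3:// URI is treated as a local path that does not exist.

```python
from urllib.parse import urlparse


def model_output_path(train_uri: str, prefix: str = "model") -> str:
    """Return an s3:// URI in the same bucket as train_uri.

    Hypothetical helper: it only rewrites the URI and does not touch S3.
    """
    parsed = urlparse(train_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3:// URI, got {train_uri!r}")
    return f"s3://{parsed.netloc}/{prefix.strip('/')}"


def build_estimator(train_uri: str, role: str):
    """Create an SKLearn estimator whose artifacts go to the data bucket."""
    # Imported lazily so the helper above works without the SDK installed.
    from sagemaker.sklearn.estimator import SKLearn

    return SKLearn(
        entry_point="train.py",        # assumed name of the training script
        role=role,                     # IAM role ARN with access to the bucket
        instance_count=1,
        instance_type="ml.m5.large",   # assumed instance type
        framework_version="0.23-1",    # assumed scikit-learn container version
        output_path=model_output_path(train_uri),
    )


# With the question's placeholder URI, "path" plays the role of the bucket name.
print(model_output_path("s3://path/to/train"))
```

The estimator would then be fitted as in the question, passing only the train and test channels to fit(). Note that SageMaker packs the contents of SM_MODEL_DIR into a model.tar.gz under a job-specific prefix inside output_path, so the .joblib file ends up inside that archive rather than as a bare object.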