
How to save uncompressed outputs from a training job using the AWS SageMaker Python SDK?

I'm trying to upload training job artifacts to S3 in uncompressed form.

I am familiar with the output path (output_path) one can provide to a SageMaker Estimator; everything the job saves under /opt/ml/output is then uploaded compressed to that S3 output location.
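For reference, a minimal sketch of the setup described above (the image URI, role, bucket, and instance type are placeholders, not from the original post):

```python
# Sketch of the default behaviour: job outputs are archived and compressed
# before landing in the S3 output_path.
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="<training-image-uri>",      # placeholder training image
    role="<execution-role-arn>",           # placeholder IAM execution role
    instance_count=1,
    instance_type="ml.m5.xlarge",
    # Everything written under /opt/ml/output (and /opt/ml/model) is
    # tar'd and gzipped before being uploaded to this S3 location.
    output_path="s3://<my-bucket>/training-output/",
)

estimator.fit("s3://<my-bucket>/training-input/")
```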

I want the option to access a specific artifact without having to download and decompress the whole output every time. Is there a clean way to do this? If not, is there a workaround? The artifacts I'm interested in are small metadata files (.txt or .csv), while in my case the rest of the artifacts can be ~1 GB, so downloading and decompressing everything is quite excessive.

Any help would be appreciated.

I ended up using the checkpoint path, which by default is synced to the specified S3 location in uncompressed form.
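A minimal sketch of this workaround (bucket, role, and image are placeholders): files written to the checkpoint_local_path inside the container are synced to checkpoint_s3_uri as individual objects, without being compressed.

```python
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="<training-image-uri>",
    role="<execution-role-arn>",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://<my-bucket>/training-output/",      # still compressed
    # Files written here are synced to S3 as-is, uncompressed.
    checkpoint_s3_uri="s3://<my-bucket>/training-meta/",
    checkpoint_local_path="/opt/ml/checkpoints",           # default local path
)
```

Inside the training script, write the small metadata files to the checkpoint directory instead of /opt/ml/output, e.g. open("/opt/ml/checkpoints/metrics.csv", "w"), and they appear under the checkpoint S3 prefix as plain objects.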

I think you can simply specify an S3 location and save your artifact to it from within your training script. However, I'm not totally sure that instances created by SageMaker have permission to write directly to S3; they may also be network-isolated. I'm doing more or less what you describe in order to read TensorFlow logs in real time, but I'm using a custom image for the training. If you are interested you can take a look here.
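A rough sketch of that idea (bucket name, key, and local file path are placeholders). This only works if the job is not network-isolated (enable_network_isolation=False, the default) and its execution role allows s3:PutObject on the target bucket:

```python
# Upload a small artifact straight from the training script with boto3,
# bypassing the compressed output archive.
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    Filename="/opt/ml/output/data/metrics.csv",   # local file produced by training
    Bucket="<my-bucket>",                          # placeholder bucket
    Key="training-meta/metrics.csv",               # placeholder object key
)
```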


 