
How to save uncompressed outputs from a training job using the AWS SageMaker Python SDK?

I'm trying to upload training job artifacts to S3 in uncompressed form.

I am familiar with the output path (output_path) one can provide to a SageMaker Estimator; everything the job saves under /opt/ml/output is then uploaded compressed to that S3 output location.
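For reference, a minimal sketch of the setup described above (the image URI, role, bucket, and instance type are placeholders, not from the original post):

```python
# Sketch of the default behaviour: job outputs are archived and compressed
# before landing in the S3 output_path.
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="<training-image-uri>",      # placeholder training image
    role="<execution-role-arn>",           # placeholder IAM execution role
    instance_count=1,
    instance_type="ml.m5.xlarge",
    # Everything written under /opt/ml/output (and /opt/ml/model) is
    # tar'd and gzipped before being uploaded to this S3 location.
    output_path="s3://<my-bucket>/training-output/",
)

estimator.fit("s3://<my-bucket>/training-input/")
```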

I want the option to access a specific artifact without having to download and decompress the whole output every time. Is there a clean way to do this? If not, is there a workaround? The artifacts I'm interested in are small metadata files (.txt or .csv), while in my case the rest of the artifacts can be ~1 GB, so downloading and decompressing everything is quite excessive.

Any help would be appreciated.

I ended up using the checkpoint path, which by default is synced to the specified S3 location in uncompressed form.
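A minimal sketch of this workaround (bucket, role, and image are placeholders): files written to the checkpoint_local_path inside the container are synced to checkpoint_s3_uri as individual objects, without being compressed.

```python
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="<training-image-uri>",
    role="<execution-role-arn>",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://<my-bucket>/training-output/",      # still compressed
    # Files written here are synced to S3 as-is, uncompressed.
    checkpoint_s3_uri="s3://<my-bucket>/training-meta/",
    checkpoint_local_path="/opt/ml/checkpoints",           # default local path
)
```

Inside the training script, write the small metadata files to the checkpoint directory instead of /opt/ml/output, e.g. open("/opt/ml/checkpoints/metrics.csv", "w"), and they appear under the checkpoint S3 prefix as plain objects.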

I think you can simply specify an S3 location and save your artifact to it from within your training script. However, I'm not totally sure that instances created by SageMaker have permission to write directly to S3; they may also be network-isolated. I'm doing more or less what you describe in order to read TensorFlow logs in real time, but I'm using a custom image for the training. If you are interested you can take a look here.
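A rough sketch of that idea (bucket name, key, and local file path are placeholders). This only works if the job is not network-isolated (enable_network_isolation=False, the default) and its execution role allows s3:PutObject on the target bucket:

```python
# Upload a small artifact straight from the training script with boto3,
# bypassing the compressed output archive.
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    Filename="/opt/ml/output/data/metrics.csv",   # local file produced by training
    Bucket="<my-bucket>",                          # placeholder bucket
    Key="training-meta/metrics.csv",               # placeholder object key
)
```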


 