Pass a config file to a SageMaker training program

Question

Setup:

I chose the bring-your-own-container option for AWS SageMaker Training. In the Dockerfile, I set the SAGEMAKER_PROGRAM variable to tools/train.py, since I am working with the mmaction2 repo.

So a user is executing

from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    role='sagemaker_role',
    image_uri="path_in_ecr",
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    volume_size=40,
    output_path=f"s3://{bucket}/{prefix_output}/",
    sagemaker_session=sagemaker_session,
    max_run=3600 * 2,
)

estimator.fit()

on an EC2 machine where, say, the config is at /home/ubuntu/train_config_mmaction2.py

Problem: mmaction2 requires a config file as input, which specifies the training configuration. How can I pass a file to SageMaker Training so that it is copied from the calling EC2 instance to the training instance and used as a command-line argument for the SAGEMAKER_PROGRAM defined in the Dockerfile?

I tried the entry_point and source_dir arguments of the PyTorch class, placing both the entry point and the config in the source directory so that the config would be copied. However, this creates a dependency on having the entry point present locally for each run. Is there a way to do this without that dependency?

Answer 1

You can do either of the following:

Keep the config file in source_dir, alongside the entry point. This doesn't have to be local; it can also come from a Git repo, as indicated here: blog, demo
Or bring the config file in via S3, using SageMaker input or checkpoint channels (doc)
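
As a rough sketch of both options (not a tested recipe for mmaction2): the helpers below take an already constructed estimator and sagemaker_session, so nothing here depends on a local entry point. The channel name "config", the helper names, and the repo URL, branch and paths in the usage note are hypothetical. The sketch assumes Session.upload_data returns the S3 URI of the uploaded file and that SageMaker mounts each input channel under /opt/ml/input/data/<channel_name> in the training container.

```python
import posixpath

# Hypothetical channel name; SageMaker mounts each input channel under
# /opt/ml/input/data/<channel_name> inside the training container.
CONFIG_CHANNEL = "config"


def config_path_in_container(local_config, channel=CONFIG_CHANNEL):
    """Return the path where the uploaded config should appear during training."""
    return posixpath.join(
        "/opt/ml/input/data", channel, posixpath.basename(local_config)
    )


def fit_with_config(estimator, sagemaker_session, local_config, bucket, key_prefix):
    """Option 2: upload the config to S3 and pass it as an input channel."""
    # upload_data copies the local file to s3://<bucket>/<key_prefix>/<filename>
    # and returns that URI.
    s3_uri = sagemaker_session.upload_data(
        path=local_config, bucket=bucket, key_prefix=key_prefix
    )
    # The dict key becomes the channel name seen by the container.
    estimator.fit({CONFIG_CHANNEL: s3_uri})
    return s3_uri


def git_source_kwargs(repo_url, branch, entry_point, source_dir):
    """Option 1: extra estimator kwargs that pull code and config from Git."""
    return {
        "entry_point": entry_point,   # path relative to source_dir
        "source_dir": source_dir,     # path relative to the repo root
        "git_config": {"repo": repo_url, "branch": branch},
    }
```

With the estimator from the question, option 2 would be called as fit_with_config(estimator, sagemaker_session, "/home/ubuntu/train_config_mmaction2.py", bucket, "configs"), and the training program would then need to read the config from config_path_in_container(...), for example through a small wrapper baked into the image. For option 1, the dict from git_source_kwargs would be unpacked into the PyTorch(...) call, which assumes the config is committed to that repo.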

solution1
0 ACCEPTED 2022-12-12 08:05:32
