What is an optimal setting for a SageMaker batch job?

Question

Based on the AWS documentation, I've set up a batch inference job. However, once we choose the bare minimum settings (instance type and instance count), does SageMaker choose an optimal plan to process the job? For example, if there is more than one input file and resources are available, can those files be processed in parallel?

from sagemaker.transformer import Transformer

tr = Transformer(model_name='custom_model', instance_count=2, instance_type='ml.m4.xlarge')

Answer 1

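Batch Transform partitions the Amazon S3 objects in the input by key and maps those objects to instances, so with multiple input files, different instances can process different files in parallel. Please check out the AWS Batch Transform documentation for details.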

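When you have multiple records to process in each input file, you can set BatchStrategy to MultiRecord (together with SplitType set to Line) to speed up processing. SageMaker then packs as many records as fit within MaxPayloadInMB into each request instead of sending one record at a time.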
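As a rough sketch of how these settings map onto the SageMaker Python SDK, the snippet below collects them as keyword arguments for `Transformer(...)` and `Transformer.transform(...)`. The bucket name, prefixes, and the `text/csv` content type are hypothetical placeholders, and the payload size is only an example value. The SDK import is kept inside the function so the settings can be inspected without AWS credentials.

```python
# Keyword arguments for sagemaker.transformer.Transformer.
# model_name, instance_count and instance_type come from the question;
# output_path is a hypothetical placeholder.
TRANSFORMER_SETTINGS = {
    "model_name": "custom_model",
    "instance_count": 2,
    "instance_type": "ml.m4.xlarge",
    "strategy": "MultiRecord",   # BatchStrategy: "MultiRecord" or "SingleRecord"
    "max_payload": 6,            # MaxPayloadInMB, example value
    "assemble_with": "Line",     # how output records are joined
    "output_path": "s3://example-bucket/batch-output/",  # hypothetical
}

# Keyword arguments for Transformer.transform.
TRANSFORM_INPUT = {
    "data": "s3://example-bucket/batch-input/",  # hypothetical S3 prefix
    "content_type": "text/csv",                  # assumed input format
    "split_type": "Line",                        # SplitType: one record per line
}


def start_batch_job():
    """Create the transformer and start the job (needs AWS credentials)."""
    # Imported here so the settings above can be read without the SDK installed.
    from sagemaker.transformer import Transformer

    tr = Transformer(**TRANSFORMER_SETTINGS)
    tr.transform(**TRANSFORM_INPUT)
    return tr
```

Calling `start_batch_job()` submits the job. With the default `data_type` of `S3Prefix`, every object under the input prefix is treated as an input file.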

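A general guideline is that the number of files in S3 to process should be at least, and ideally a multiple of, the number of workers/instances. A single input file is processed by only one instance, so with one file and several instances the remaining instances sit idle. If MaxConcurrentTransforms is set to 0 or left unset, Amazon SageMaker checks the optional execution-parameters endpoint to determine the settings for your chosen algorithm. If that endpoint is not enabled, the default value is 1.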
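To sanity-check the relation between file count and instance count before launching a job, a small helper like the one below can be used. It is only a rule-of-thumb sketch that assumes each S3 object is handled by exactly one instance. It does not reproduce SageMaker's actual scheduling, and the function name is made up for illustration.

```python
def instance_utilization(num_files: int, instance_count: int) -> dict:
    """Estimate how many instances get work, assuming one file goes to one instance."""
    if num_files < 0 or instance_count < 1:
        raise ValueError("num_files must be >= 0 and instance_count must be >= 1")

    # An instance can only be busy if there is at least one file for it.
    busy = min(num_files, instance_count)
    return {
        "busy_instances": busy,
        "idle_instances": instance_count - busy,
        # True when the files can be spread evenly across all instances.
        "evenly_divisible": num_files > 0 and num_files % instance_count == 0,
    }


# One file on two instances leaves one instance idle.
print(instance_utilization(num_files=1, instance_count=2))
# Four files on two instances keeps both busy with two files each.
print(instance_utilization(num_files=4, instance_count=2))
```

If the input is a single large file, splitting it into several objects under the same S3 prefix is one way to give every instance something to process.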
