I'm designing the architecture of an ETL for an ML-related project.
When automating tasks in AWS, I generally use EventBridge, SQS, or S3 events as triggers for Lambda functions. In this case Lambda doesn't fit my needs, so I decided to go with Amazon SageMaker Processing Jobs. These jobs can scale by instance type and number of instances, and they have a timeout of 24 hours; these are the requirements that Lambda couldn't meet.
The architecture that I generally use:
But as you can imagine, that Lambda layer is only used for launching SageMaker Processing Jobs, so I would like to avoid it.
Questions:
Q1. Is there a better architecture with AWS services to automate/trigger/schedule SageMaker Processing Jobs?
Q2. Which services would be better suited to this task?
You can use SageMaker Pipelines to run a SageMaker Processing Job as a pipeline step. Pipelines has native integration with EventBridge, so an EventBridge rule can start the pipeline directly (for example, when an object is uploaded to a specific S3 bucket), with no Lambda in between. See the integration here: https://docs.aws.amazon.com/sagemaker/latest/dg/pipeline-eventbridge.html
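As a rough illustration of that integration, here is a minimal boto3 sketch that creates an EventBridge rule for S3 uploads and points it at an existing pipeline. The helper names, the rule name, the target Id, and all ARNs are placeholders I made up for the example. It also assumes EventBridge notifications are enabled on the bucket and that the role passed to the target is allowed to call sagemaker:StartPipelineExecution.

```python
import json


def s3_upload_pattern(bucket: str, prefix: str = "") -> dict:
    """Event pattern matching S3 'Object Created' events for one bucket."""
    detail = {"bucket": {"name": [bucket]}}
    if prefix:
        # Optionally restrict the rule to keys under a given prefix.
        detail["object"] = {"key": [{"prefix": prefix}]}
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": detail,
    }


def pipeline_target(pipeline_arn: str, role_arn: str) -> dict:
    """EventBridge target that starts a SageMaker pipeline execution."""
    # "start-etl-pipeline" is an arbitrary, hypothetical target Id.
    return {"Id": "start-etl-pipeline", "Arn": pipeline_arn, "RoleArn": role_arn}


def create_trigger(rule_name, bucket, prefix, pipeline_arn, role_arn):
    """Create the rule and attach the pipeline as its target (calls AWS)."""
    import boto3  # imported here so the helpers above work without boto3

    events = boto3.client("events")
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(s3_upload_pattern(bucket, prefix)),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[pipeline_target(pipeline_arn, role_arn)],
    )
```

For a schedule instead of an S3 trigger, the same put_rule call accepts ScheduleExpression (for example "rate(1 day)") in place of EventPattern.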
Here are a few samples to get started with Pipelines: https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-pipelines/index.html. You only pay for the jobs executed in the pipeline, not for the orchestration itself.
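To give an idea of what such a pipeline looks like, here is a minimal sketch of a one-step pipeline that wraps a Processing Job, registered through boto3's create_pipeline. The linked samples use the SageMaker Python SDK, which normally generates this definition for you; I am writing the JSON by hand only to keep the example dependency-free. The function names, step name, image URI, and role ARN are placeholders, and the definition layout is based on my understanding of the pipeline definition schema, so check it against the docs before relying on it.

```python
import json


def build_pipeline_definition(image_uri, role_arn,
                              instance_type="ml.m5.xlarge",
                              instance_count=1,
                              max_runtime_seconds=86400):
    """Return a pipeline definition with a single Processing step."""
    return {
        "Version": "2020-12-01",
        "Steps": [
            {
                "Name": "EtlProcessing",  # hypothetical step name
                "Type": "Processing",
                # Arguments mirror the CreateProcessingJob request fields.
                "Arguments": {
                    "AppSpecification": {"ImageUri": image_uri},
                    "ProcessingResources": {
                        "ClusterConfig": {
                            "InstanceType": instance_type,
                            "InstanceCount": instance_count,
                            "VolumeSizeInGB": 30,
                        }
                    },
                    "StoppingCondition": {
                        "MaxRuntimeInSeconds": max_runtime_seconds
                    },
                    "RoleArn": role_arn,
                },
            }
        ],
    }


def register_pipeline(name, image_uri, role_arn):
    """Create the pipeline in SageMaker (calls AWS)."""
    import boto3  # imported here so the builder above works without boto3

    sagemaker = boto3.client("sagemaker")
    return sagemaker.create_pipeline(
        PipelineName=name,
        PipelineDefinition=json.dumps(
            build_pipeline_definition(image_uri, role_arn)
        ),
        RoleArn=role_arn,
    )
```

The instance type, instance count, and maximum runtime are the knobs mentioned in the question, so they are exposed as arguments here; the 86400 second default corresponds to the 24 hour limit.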