
How can we orchestrate and automate data movement and data transformation in an AWS SageMaker pipeline?

I'm migrating our ML notebooks from Azure Databricks to an AWS environment using SageMaker and Step Functions. I have separate notebooks for data processing, feature engineering, and ML algorithms, which I want to run in sequence, each starting after the previous notebook completes. Can you point me to any resource that shows how to execute SageMaker notebooks in sequence using AWS Step Functions?

For this type of architecture you need to involve some other AWS services as well. A helpful combination is EventBridge (scheduled rules) to trigger a Lambda function, which in turn calls SageMaker to execute your notebooks.
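Here is a minimal sketch of the Lambda side of that pattern. It assumes a pre-created notebook instance (the name `feature-engineering-nb` is a placeholder) whose on-start lifecycle configuration actually runs the notebook, for example via papermill, and stops the instance afterwards; the EventBridge scheduled rule simply invokes this handler.

```python
import boto3

sagemaker = boto3.client("sagemaker")

def lambda_handler(event, context):
    # Invoked by an EventBridge scheduled rule. The handler only starts the
    # stopped notebook instance; executing the notebook and shutting the
    # instance down again is delegated to its lifecycle configuration script.
    instance_name = "feature-engineering-nb"  # placeholder instance name
    sagemaker.start_notebook_instance(NotebookInstanceName=instance_name)
    return {"started": instance_name}
```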

A new feature allows you to operationalize your Amazon SageMaker Studio notebooks as scheduled notebook jobs. Unfortunately, there is no way yet to tie them together into a pipeline.
The other alternative is to convert your notebooks into processing and training jobs, and use something like AWS Step Functions or SageMaker Pipelines to run them as a pipeline, as sketched below.
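As a rough illustration of the SageMaker Pipelines alternative, the sketch below assumes the data-processing/feature-engineering notebook has been rewritten as a script (`feature_engineering.py`) and the ML notebook as a training job; script names, the S3 output path, and the XGBoost container are placeholders, not your actual code.

```python
import sagemaker
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.processing import ProcessingOutput
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep, TrainingStep

role = sagemaker.get_execution_role()
session = sagemaker.Session()

# Step 1: data processing / feature engineering (converted from a notebook).
processor = SKLearnProcessor(
    framework_version="1.2-1",
    role=role,
    instance_type="ml.m5.xlarge",
    instance_count=1,
)
process_step = ProcessingStep(
    name="FeatureEngineering",
    processor=processor,
    code="feature_engineering.py",  # placeholder script converted from the notebook
    outputs=[ProcessingOutput(output_name="train", source="/opt/ml/processing/train")],
)

# Step 2: model training (converted from the ML notebook), consuming step 1's output.
estimator = Estimator(
    image_uri=sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.7-1"),
    role=role,
    instance_type="ml.m5.xlarge",
    instance_count=1,
    output_path="s3://my-bucket/model-artifacts/",  # placeholder bucket
)
train_step = TrainingStep(
    name="TrainModel",
    estimator=estimator,
    inputs={
        "train": TrainingInput(
            s3_data=process_step.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri
        )
    },
)

# Chain the steps into a pipeline; SageMaker infers the order from the data dependency.
pipeline = Pipeline(name="NotebookMigrationPipeline", steps=[process_step, train_step])
pipeline.upsert(role_arn=role)
pipeline.start()
```

The same two scripts could instead be chained with AWS Step Functions (for example via the Step Functions Data Science SDK), which is useful if you also need to coordinate non-SageMaker services in the same workflow.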

