
How to create a pipeline in sagemaker with pytorch

I am dealing with a text classification problem in SageMaker. First I fit and transform the text data into a structured format (say, using TF-IDF in scikit-learn), then I store the result in an S3 bucket and use it to train my PyTorch model, whose code I have written in my entry point script.

If you notice, by the end of the above process I have two models:

  1. a scikit-learn TF-IDF model
  2. the actual PyTorch model

So every time I need to predict on new text data, I first have to separately process (transform) it with the TF-IDF model that I created during training.

How can I create a pipeline in SageMaker that combines scikit-learn's TF-IDF model and my PyTorch model?

If I fit and transform the text data using TF-IDF in the main method of my entry point and then train my PyTorch model there, I can return only one model, which will be used in model_fn().
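For concreteness, the two-step workflow described above can be sketched as follows (the dataset, the linear classifier, and the file names are toy placeholders, not from the original post):

```python
# Sketch of the two-step workflow: fit TF-IDF, then train PyTorch on the result.
import joblib
import torch
import torch.nn as nn
from sklearn.feature_extraction.text import TfidfVectorizer

texts = ["good movie", "bad movie", "great film", "awful film"]
labels = torch.tensor([1, 0, 1, 0])

# Step 1: fit and transform the text with TF-IDF (model #1, scikit-learn).
tfidf = TfidfVectorizer()
features = torch.tensor(tfidf.fit_transform(texts).toarray(), dtype=torch.float32)

# Step 2: train a PyTorch classifier on the transformed features (model #2).
model = nn.Linear(features.shape[1], 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()

# Two separate artifacts result -- both are needed again at inference time.
joblib.dump(tfidf, "tfidf.joblib")
torch.save(model.state_dict(), "model.pt")
```

This makes the problem visible: any new text must pass through `tfidf.transform()` before it can reach the PyTorch model.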

First, check out the MNIST example here:

https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/pytorch_mnist/pytorch_mnist.ipynb

With script mode, you can run the code (in mnist.py) using the estimator below.

from sagemaker.pytorch import PyTorch

estimator = PyTorch(entry_point='mnist.py',
                    role=role,
                    framework_version='1.1.0',
                    train_instance_count=2,
                    train_instance_type='ml.c4.xlarge',
                    hyperparameters={
                        'epochs': 6,
                        'backend': 'gloo'
                    })

Simply update the mnist.py script to include the TF-IDF pipeline. Hope this helps.
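One way to adapt the entry point is to save the fitted TF-IDF vectorizer alongside the PyTorch weights in the model directory during training, so that `model_fn()` can load and return both together. The serving functions below are an illustrative sketch only (the file names, JSON request shape, and linear classifier are assumptions, not part of the original answer):

```python
# Sketch of serving functions for the entry point, assuming training saved
# both tfidf.joblib and model.pt into the SageMaker model directory, so
# both end up packaged in the same model.tar.gz.
import json
import os

import joblib
import torch
import torch.nn as nn


def model_fn(model_dir):
    # Load both artifacts and return them together as one "model" object.
    tfidf = joblib.load(os.path.join(model_dir, "tfidf.joblib"))
    net = nn.Linear(len(tfidf.vocabulary_), 2)
    net.load_state_dict(torch.load(os.path.join(model_dir, "model.pt")))
    net.eval()
    return {"tfidf": tfidf, "net": net}


def input_fn(request_body, content_type="application/json"):
    # Expect a JSON payload like {"texts": ["some document", ...]}.
    return json.loads(request_body)["texts"]


def predict_fn(texts, model):
    # Apply the TF-IDF transform before the PyTorch forward pass.
    features = torch.tensor(
        model["tfidf"].transform(texts).toarray(), dtype=torch.float32
    )
    with torch.no_grad():
        return model["net"](features).argmax(dim=1).tolist()
```

With this layout a single endpoint handles raw text end to end, at the cost of bundling the scikit-learn dependency into the PyTorch container.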

Apparently, we need to use inference pipelines.

An inference pipeline is an Amazon SageMaker model that is composed of a linear sequence of two to five containers that process requests for inferences on data. You use an inference pipeline to define and deploy any combination of pretrained Amazon SageMaker built-in algorithms and your own custom algorithms packaged in Docker containers. You can use an inference pipeline to combine preprocessing, predictions, and post-processing data science tasks. Inference pipelines are fully managed.

One can read the docs here:

https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipelines.html

Example:

https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_inference_pipeline/Inference%20Pipeline%20with%20Scikit-learn%20and%20Linear%20Learner.ipynb
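Following that example, a deployment combining the two models might look roughly like this. It is a sketch in SageMaker Python SDK v1 style (to match the estimator in the other answer); the S3 paths, entry-point file names, pipeline name, and instance type are all placeholders:

```python
# Sketch: deploy a scikit-learn TF-IDF container and a PyTorch container
# as one inference pipeline endpoint (all paths/names are illustrative).
import sagemaker
from sagemaker.pipeline import PipelineModel
from sagemaker.pytorch import PyTorchModel
from sagemaker.sklearn import SKLearnModel

role = sagemaker.get_execution_role()

# Container 1: loads the fitted TfidfVectorizer and transforms raw text.
sklearn_model = SKLearnModel(
    model_data="s3://my-bucket/tfidf/model.tar.gz",
    role=role,
    entry_point="tfidf_transform.py",
)

# Container 2: runs the trained PyTorch classifier on the TF-IDF features.
pytorch_model = PyTorchModel(
    model_data="s3://my-bucket/pytorch/model.tar.gz",
    role=role,
    entry_point="inference.py",
    framework_version="1.1.0",
)

# Requests flow through the containers in order: TF-IDF first, then the
# classifier. The whole pipeline deploys behind a single endpoint.
pipeline_model = PipelineModel(
    name="tfidf-pytorch-pipeline",
    role=role,
    models=[sklearn_model, pytorch_model],
)
predictor = pipeline_model.deploy(
    initial_instance_count=1, instance_type="ml.m4.xlarge"
)
```

Each container's entry point still needs its own input/output handling so that the output format of the TF-IDF container matches the input format the PyTorch container expects.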
