
Azure model deployment (Real-time Endpoints vs Compute Inference Cluster)

Reaching out for some help here.

ManagedOnlineDeployment vs KubernetesOnlineDeployment

Goal: Host a large number of distinct models on Azure ML.

Description: After thorough investigation, I found that there are two ways to host a pre-trained real-time model (i.e., run inference) on Azure ML.

Details:

What I tried: I have 4 running VMs as a result of creating 4 real-time endpoints. Those endpoints use the curated environments provided by Microsoft.

VMs

[screenshot: running VMs]

Real-time endpoints deployed

[screenshot: deployed real-time endpoints]

Issues

  1. When I want to create a custom environment from a Dockerfile and then use it as the base image for a certain endpoint, it is a long process:

Build Image > Push Image to Container Registry > Create Custom Environment in Azure ML > Create and Deploy Endpoint

If something goes wrong, it only shows up after the whole pipeline finishes. It just doesn't feel like the correct way to deploy a model. This process is needed whenever I cannot use one of the curated environments, because I need some dependency that cannot be installed through the conda.yml file.

For example:

RUN apt-get update -y && apt-get install -y build-essential cmake pkg-config
RUN python setup.py build_ext --inplace
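One way to shorten that loop is to hand Azure ML a local Docker build context and let it build the image for you, skipping the manual push to a container registry; you can also run docker build on the same folder locally so failures surface before any deployment step. A minimal sketch using an Azure ML CLI v2 environment YAML (the names my-custom-env and docker-context are placeholders, not from the original post):

```yaml
# environment.yml -- assumes a local folder "docker-context" containing the Dockerfile
$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: my-custom-env
version: "1"
build:
  path: docker-context          # Azure ML builds this context server-side
  dockerfile_path: Dockerfile   # relative to the build context
```

Created with az ml environment create -f environment.yml; testing docker build docker-context locally first catches Dockerfile errors without going through the full deploy pipeline.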

  2. Although I'm using 1 instance per endpoint (instance count = 1), each endpoint creates its own dedicated VM, which will cost me a lot in the long run (i.e., when I have lots of endpoints); it is currently costing me around $20 per day.

Note: Each endpoint has a distinct set of dependencies/versions...


Questions:

1- Am I following best practice, or do I need to drastically change my deployment strategy (move from ManagedOnlineDeployment to KubernetesOnlineDeployment, or even another option that I don't know of)?

2- Is there a way to host all the endpoints on a single VM, rather than creating a VM for each endpoint, to make it affordable?

3- Is there a way to host the endpoints and get charged per transaction?
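On question 2, one direction worth sketching: with KubernetesOnlineDeployment, multiple endpoints share the nodes of a single attached AKS (or Arc-enabled) cluster instead of each endpoint provisioning its own managed VM; each deployment requests a pod-sized instance type defined on the cluster. A hedged CLI v2 YAML sketch (the endpoint, model, and environment names are placeholders; defaultinstancetype is the instance type Azure ML creates on an attached cluster if you define none):

```yaml
# deployment.yml -- assumes a Kubernetes cluster already attached as compute
$schema: https://azuremlschemas.azureedge.net/latest/kubernetesOnlineDeployment.schema.json
name: blue
endpoint_name: my-endpoint
model: azureml:my-model:1
environment: azureml:my-custom-env:1
instance_type: defaultinstancetype   # a pod-level slice of a node, not a whole VM
instance_count: 1
```

Several such deployments can land on the same node pool, so the per-endpoint cost becomes a share of the cluster rather than a dedicated VM each.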


General recommendations and clarification questions are more than welcome.

Thank you!

  • For this requirement, the standard procedure is to use batch endpoint creation.

[screenshot]

  • Batch endpoints let you reuse the same procedure for any number of models running in the current environment. You can call the registered models, or, without registering a model, fetch its artifacts directly: get the artifacts, download the model, and deploy it using a procedure like the one shown below.
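The register-then-deploy step described above can be written down as a batch deployment config (CLI v2). This is a sketch under assumptions, not the answerer's exact setup: the model, endpoint, compute, and scoring-script names are all placeholders:

```yaml
# batch-deployment.yml -- assumes a registered model and a compute cluster "cpu-cluster"
$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
name: my-batch-deployment
endpoint_name: my-batch-endpoint
model: azureml:my-model:1
code_configuration:
  code: src                      # folder holding the scoring script
  scoring_script: batch_score.py
environment: azureml:my-custom-env:1
compute: azureml:cpu-cluster     # shared cluster, billed only while jobs run
resources:
  instance_count: 1
mini_batch_size: 10
output_action: append_row
```

Because batch endpoints run as jobs on a compute cluster that can scale to zero, this also partially addresses the per-transaction cost question for workloads that tolerate asynchronous scoring.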

[screenshot]

[screenshot]

  • Use a custom environment and pack as many features as possible into a single batch endpoint. Note that for this approach there is no way to create an environment other than Docker or conda: if a predefined Docker configuration is available, use it; otherwise build one with Docker itself. Avoid conda specifications.

[screenshot]

[screenshot]

  • From the available options, choose the one most useful for your application and create it. Then you can deploy.

  • This procedure reduces the configuration burden and improves functionality.
