
Azure model deployment (Real-time Endpoints vs Compute Inference Cluster)

Reaching out for some help here.

ManagedOnlineDeployment vs KubernetesOnlineDeployment

Goal: Host a large number of distinct models on Azure ML.

Description: After thorough investigation, I found that there are two ways to host a pre-trained real-time model (i.e., run inference) on Azure ML.

Details:

What I tried: I have 4 running VMs as a result of creating 4 real-time endpoints. Those endpoints use the curated environments provided by Microsoft.

VMs

[screenshot: running VMs]

Real-time endpoints deployed

[screenshot: deployed real-time endpoints]

Issues

  1. When I want to create a custom environment from a Dockerfile and then use it as the base image for a certain endpoint, it is a long process:

Build Image > Push Image to Container Registry > Create Custom Environment in Azure ML > Create and Deploy Endpoint

If something goes wrong, it only shows up after the whole pipeline finishes. It just doesn't feel like the correct way to deploy a model. This process is needed whenever I cannot use one of the curated environments, because I need some dependency that cannot be installed through the conda.yml file.

For example:

RUN apt-get update -y && apt-get install -y build-essential cmake pkg-config
RUN python setup.py build_ext --inplace
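One way to shorten that loop is to hand Azure ML a local Docker build context and let it build the image for you, skipping the manual push to a container registry; you can also run docker build on the same folder locally so failures surface before any deployment step. A minimal sketch using an Azure ML CLI v2 environment YAML (the names my-custom-env and docker-context are placeholders, not from the original post):

```yaml
# environment.yml -- assumes a local folder "docker-context" containing the Dockerfile
$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: my-custom-env
version: "1"
build:
  path: docker-context          # Azure ML builds this context server-side
  dockerfile_path: Dockerfile   # relative to the build context
```

Created with az ml environment create -f environment.yml; testing docker build docker-context locally first catches Dockerfile errors without going through the full deploy pipeline.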

  2. Although I'm using 1 instance per endpoint (instance count = 1), each endpoint creates its own dedicated VM, which will cost me a lot in the long run (i.e., when I have lots of endpoints); it is currently costing me around $20 per day.

Note: Each endpoint has a distinct set of dependencies/versions...


Questions:

1- Am I following best practice, or do I need to drastically change my deployment strategy (move from ManagedOnlineDeployment to KubernetesOnlineDeployment, or even another option that I don't know of)?

2- Is there a way to host all the endpoints on a single VM, rather than creating a VM for each endpoint, to make it affordable?

3- Is there a way to host the endpoints and get charged per transaction?
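On question 2, one direction worth sketching: with KubernetesOnlineDeployment, multiple endpoints share the nodes of a single attached AKS (or Arc-enabled) cluster instead of each endpoint provisioning its own managed VM; each deployment requests a pod-sized instance type defined on the cluster. A hedged CLI v2 YAML sketch (the endpoint, model, and environment names are placeholders; defaultinstancetype is the instance type Azure ML creates on an attached cluster if you define none):

```yaml
# deployment.yml -- assumes a Kubernetes cluster already attached as compute
$schema: https://azuremlschemas.azureedge.net/latest/kubernetesOnlineDeployment.schema.json
name: blue
endpoint_name: my-endpoint
model: azureml:my-model:1
environment: azureml:my-custom-env:1
instance_type: defaultinstancetype   # a pod-level slice of a node, not a whole VM
instance_count: 1
```

Several such deployments can land on the same node pool, so the per-endpoint cost becomes a share of the cluster rather than a dedicated VM each.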


General recommendations and clarification questions are more than welcome.

Thank you!

  • For this requirement, the standard procedure is to use batch endpoint creation.

[screenshot]

  • Batch endpoints let you reuse the same procedure for any number of models running in the current environment. You can call the registered models, or, without registering a model, fetch its artifacts directly: get the artifacts, download the model, and deploy it using a procedure like the one shown below.
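The register-then-deploy step described above can be written down as a batch deployment config (CLI v2). This is a sketch under assumptions, not the answerer's exact setup: the model, endpoint, compute, and scoring-script names are all placeholders:

```yaml
# batch-deployment.yml -- assumes a registered model and a compute cluster "cpu-cluster"
$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
name: my-batch-deployment
endpoint_name: my-batch-endpoint
model: azureml:my-model:1
code_configuration:
  code: src                      # folder holding the scoring script
  scoring_script: batch_score.py
environment: azureml:my-custom-env:1
compute: azureml:cpu-cluster     # shared cluster, billed only while jobs run
resources:
  instance_count: 1
mini_batch_size: 10
output_action: append_row
```

Because batch endpoints run as jobs on a compute cluster that can scale to zero, this also partially addresses the per-transaction cost question for workloads that tolerate asynchronous scoring.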

[screenshot]

[screenshot]

  • Use a custom environment and pack as many features as possible into a single batch endpoint. Note that for this approach there is no way to create an environment other than Docker or conda: if a predefined Docker configuration is available, use it; otherwise build one with Docker itself. Avoid conda specifications.

[screenshot]

[screenshot]

  • From the available options, choose the one most useful for your application and create it. Then you can deploy.

  • This procedure reduces the configuration burden and improves functionality.
