
ValueError while deploying TensorFlow model to Amazon SageMaker

I want to deploy my trained TensorFlow model to Amazon SageMaker. I am following the official guide at https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/ and deploying the model from a Jupyter notebook.

But when I run this code:

predictor = sagemaker_model.deploy(initial_instance_count=1, instance_type='ml.t2.medium')

It gives me the following error message:

ValueError: Error hosting endpoint sagemaker-tensorflow-2019-08-07-22-57-59-547: Failed Reason: The image '520713654638.dkr.ecr.us-west-1.amazonaws.com/sagemaker-tensorflow:1.12-cpu-py3 ' does not exist.

The tutorial does not tell me to create an image, as far as I can see, and I do not know what to do. Here is my full code:

import boto3, re
from sagemaker import get_execution_role

role = get_execution_role()

# make a tar ball of the model data files
import tarfile
with tarfile.open('model.tar.gz', mode='w:gz') as archive:
    archive.add('export', recursive=True)

# create a new s3 bucket and upload the tarball to it
import sagemaker

sagemaker_session = sagemaker.Session()
inputs = sagemaker_session.upload_data(path='model.tar.gz', key_prefix='model')

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '1.12',
                                  entry_point = 'train.py',
                                  py_version='py3')

%%time
#here I fail to deploy the model and get the error message
predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')

https://github.com/aws/sagemaker-python-sdk/issues/912#issuecomment-510226311

As mentioned in the issue:

Python 3 isn't supported using the TensorFlowModel object, as the container uses the TensorFlow serving api library in conjunction with the GRPC client to handle making inferences, however the TensorFlow serving api isn't supported in Python 3 officially, so there are only Python 2 versions of the containers when using the TensorFlowModel object.

If you need Python 3 then you will need to use the Model object defined in #2 above. The inference script format will change if you need to handle pre and post processing. See https://github.com/aws/sagemaker-tensorflow-serving-container#prepost-processing
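For illustration, here is a minimal sketch of what switching to the Model object might look like. It assumes SageMaker Python SDK 1.x, where that class is importable as sagemaker.tensorflow.serving.Model; later SDK releases reorganized these classes, so check the import path for your version. The helper name deploy_tf_serving_model and its defaults are hypothetical, chosen to match the question's code, and the model archive may also need the layout the TensorFlow Serving container expects.

```python
def deploy_tf_serving_model(model_data, role, framework_version="1.12",
                            instance_type="ml.m4.xlarge"):
    """Deploy a SavedModel archive with the TensorFlow Serving based Model class.

    Hypothetical helper, not part of the SDK. Unlike the legacy
    TensorFlowModel, this class takes no py_version argument, so the
    missing '-py3' image from the question is never requested.
    """
    # Assumption: SageMaker Python SDK 1.x import path. Imported lazily so
    # the helper can be defined even where the SDK is not installed.
    from sagemaker.tensorflow.serving import Model

    model = Model(model_data=model_data,
                  role=role,
                  framework_version=framework_version)

    # Creates the endpoint and returns a predictor bound to it.
    return model.deploy(initial_instance_count=1,
                        instance_type=instance_type)


# Usage in the notebook from the question (not executed here):
# predictor = deploy_tf_serving_model(
#     's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
#     role)
```

The other route the quoted comment implies is to keep TensorFlowModel and drop py_version='py3', since only Python 2 containers exist for that class.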

I am also getting this error: Failure reason CannotStartContainerError. Please ensure the model container for variant AllTraffic starts correctly when invoked with 'docker run serve'.
