How do I debug invocation timeout errors in SageMaker batch transform?

Question

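I am experimenting with SageMaker, using one of the containers listed at https://github.com/aws/deep-learning-containers/blob/master/available_images.md to run my model. I override the model_fn and predict_fn functions in inference.py to load the model and produce predictions, as shown in this example (https://github.com/PacktPublishing/Learn-Amazon-SageMaker-second-edition/blob/main/Chapter%2007/huggingface/src/torchserve-predictor.py). I keep getting an invocation timeout error: "Model server did not respond to /invocations request within 3600 secs". Am I missing something in my inference.py code, such as something needed to respond to the ping/health check?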

File: inference.py

import json
import torch
from transformers import AutoConfig, AutoTokenizer, DistilBertForSequenceClassification

JSON_CONTENT_TYPE = 'application/json'

def model_fn(model_dir):
    config_path = '{}/config.json'.format(model_dir)
    model_path =  '{}/pytorch_model.bin'.format(model_dir)
    config = AutoConfig.from_pretrained(config_path)
    ...

def predict_fn(input_data, model):
    # return predictions
    ...

Answer 1

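The problem is not the health check. The container is not responding to the /invocations request within the time limit, most likely because the model takes longer than expected to produce predictions from the input data.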
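One way to check whether slow prediction is the cause is to time predict_fn locally on a small sample and extrapolate to the size of one real payload. The following is a minimal sketch, not part of the original post: the helper names time_batches and estimate_total_seconds are hypothetical, and it assumes your predict_fn accepts a list of records, which may not match your actual input format.

```python
import time


def time_batches(predict_fn, model, records, batch_size=8):
    """Run predict_fn on small batches; return (record_count, seconds) per batch."""
    results = []
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        began = time.perf_counter()
        predict_fn(batch, model)
        results.append((len(batch), time.perf_counter() - began))
    return results


def estimate_total_seconds(results, total_records):
    """Extrapolate the measured per-record latency to a full payload."""
    measured_records = sum(count for count, _ in results)
    measured_seconds = sum(seconds for _, seconds in results)
    return measured_seconds / measured_records * total_records


if __name__ == "__main__":
    # Hypothetical stand-in for the real predict_fn and model; replace with your own.
    def fake_predict_fn(input_data, model):
        time.sleep(0.001 * len(input_data))
        return [0] * len(input_data)

    sample = ["some text"] * 20
    results = time_batches(fake_predict_fn, model=None, records=sample, batch_size=8)
    print(estimate_total_seconds(results, total_records=100_000))
```

If the estimate for a single payload approaches or exceeds the 3600 seconds named in the error message, the timeout is probably caused by inference time rather than by anything missing in inference.py. In that case it may help to send smaller payloads per request (for example by splitting the input files) or to use a faster instance type.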
