
How to debug an invocation timeout error in SageMaker batch transform?

I am experimenting with SageMaker, using a container from the list at https://github.com/aws/deep-learning-containers/blob/master/available_images.md to run my model, and overriding the model_fn and predict_fn functions in inference.py for model loading and prediction, as shown here: https://github.com/PacktPublishing/Learn-Amazon-SageMaker-second-edition/blob/main/Chapter%2007/huggingface/src/torchserve-predictor.py. I keep getting an invocation timeout error: "Model server did not respond to /invocations request within 3600 seconds". Am I missing anything in my inference.py code, such as something that responds to the ping/health check?

File: inference.py

import json
import torch
from transformers import AutoConfig, AutoTokenizer, DistilBertForSequenceClassification

JSON_CONTENT_TYPE = 'application/json'

def model_fn(model_dir):
    config_path = '{}/config.json'.format(model_dir)
    model_path =  '{}/pytorch_model.bin'.format(model_dir)
    config = AutoConfig.from_pretrained(config_path)
    ...

def predict_fn(input_data, model):
    # return predictions
    ...
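For context, the SageMaker inference toolkit also lets you override input_fn and output_fn alongside model_fn and predict_fn. A minimal sketch of that serialization pair is below; it assumes JSON in and JSON out (the content-type handling here is illustrative, not the asker's actual code):

```python
import json

JSON_CONTENT_TYPE = 'application/json'

def input_fn(serialized_input_data, content_type=JSON_CONTENT_TYPE):
    # Deserialize the request body; fail fast on unsupported content types
    # so a bad request errors out instead of hanging in predict_fn
    if content_type == JSON_CONTENT_TYPE:
        return json.loads(serialized_input_data)
    raise ValueError('Unsupported content type: {}'.format(content_type))

def output_fn(prediction, accept=JSON_CONTENT_TYPE):
    # Serialize predictions back to JSON for the transform job output
    if accept == JSON_CONTENT_TYPE:
        return json.dumps(prediction), accept
    raise ValueError('Unsupported accept type: {}'.format(accept))
```

Note that none of these handlers are involved in the /ping health check, which the model server answers on its own.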

The issue is not with the health checks. It is with the container not responding to the /invocations request, which can happen when the model takes longer than expected to produce predictions for the input data. Check the transform job's CloudWatch logs to confirm that predict_fn is actually running and how long each request takes; if it is simply slow, reduce how much data each /invocations call carries (smaller payloads, fewer concurrent requests, one record at a time) so each request finishes within the timeout.
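As a concrete starting point for that tuning, the settings below follow the field names of the CreateTransformJob API; the job and model names are placeholders, and the values are debugging defaults rather than a recommendation:

```python
# Hypothetical job settings for isolating a slow predict_fn.
# Field names follow the CreateTransformJob API; names/values are assumptions.
transform_config = {
    'TransformJobName': 'distilbert-batch-debug',  # placeholder name
    'ModelName': 'distilbert-model',               # placeholder name
    'MaxPayloadInMB': 6,           # smaller payloads -> less work per request
    'MaxConcurrentTransforms': 1,  # one in-flight request while debugging
    'BatchStrategy': 'SingleRecord',  # one record per /invocations call
    # TransformInput / TransformOutput / TransformResources omitted here
}
```

Once single records complete well under the timeout, the payload size and concurrency can be raised back up incrementally.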
