Invoke endpoint error - detectron2 on AWS Sagemaker: ValueError: Type [application/x-npy] not support this type yet

I have been following this guide for implementing a Detectron2 model on SageMaker. Everything works fine, both on the training and the batch transform side.

However, when I tried to tweak the code a bit to create an endpoint that can be invoked by sending a payload, I ran into some trouble.

At the end of this notebook, after creating the SageMaker model object:

model = PyTorchModel(
    name="d2-sku110k-model",
    model_data=training_job_artifact,
    role=role,
    sagemaker_session=sm_session,
    entry_point="predict_sku110k.py",
    source_dir="container_serving",
    image_uri=serve_image_uri,
    framework_version="1.6.0",
    code_location=f"s3://{bucket}/{prefix_code}",
)

I added the following code:

predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge')

And I can see that the model has been successfully deployed.

However, when I try to predict an image with:

predictor.predict(input)

I get the following error:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from primary with message "Type [application/x-npy] not support this type yet
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 126, in transform
    result = self._transform_fn(self._model, input_data, content_type, accept)
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 215, in _default_transform_fn
    data = self._input_fn(input_data, content_type)
  File "/opt/ml/model/code/predict_sku110k.py", line 98, in input_fn
    raise ValueError(err_msg)
ValueError: Type [application/x-npy] not support this type yet

I tried a bunch of different input types: a byte-encoded image (created with cv2.imencode('.jpg', cv_img)[1].tobytes()), a numpy array, a BytesIO object (created with the io module), and a dictionary of the form {'input': image} where image is any of the previous (I used this format because it was expected by a TensorFlow endpoint I created some time ago).
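For reference, this is roughly how I built those payloads (a sketch; test.jpg is a placeholder path):

import io

import cv2

cv_img = cv2.imread("test.jpg")  # placeholder path to a test image

payload_bytes = cv2.imencode(".jpg", cv_img)[1].tobytes()  # byte-encoded image
payload_array = cv_img                                     # raw numpy array
payload_buffer = io.BytesIO(payload_bytes)                 # BytesIO object
payload_dict = {"input": payload_bytes}                    # format from my old TF endpoint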

As I think it might be relevant, I also paste here the inference script used as the entry point:

"""Code used for sagemaker batch transform jobs"""
from typing import BinaryIO, Mapping
import json
import logging
import sys
from pathlib import Path

import numpy as np
import cv2
import torch

from detectron2.engine import DefaultPredictor
from detectron2.config import CfgNode

##############
# Macros
##############

LOGGER = logging.Logger("InferenceScript", level=logging.INFO)
HANDLER = logging.StreamHandler(sys.stdout)
HANDLER.setFormatter(logging.Formatter("%(levelname)s | %(name)s | %(message)s"))
LOGGER.addHandler(HANDLER)

##########
# Deploy
##########
def _load_from_bytearray(request_body: BinaryIO) -> np.ndarray:
    """Decode the raw request bytes into an OpenCV BGR image array."""
    npimg = np.frombuffer(request_body, np.uint8)
    return cv2.imdecode(npimg, cv2.IMREAD_COLOR)


def model_fn(model_dir: str) -> DefaultPredictor:
    r"""Load trained model

    Parameters
    ----------
    model_dir : str
        S3 location of the model directory

    Returns
    -------
    DefaultPredictor
        PyTorch model created by using Detectron2 API
    """
    path_cfg, path_model = None, None
    for p_file in Path(model_dir).iterdir():
        if p_file.suffix == ".json":
            path_cfg = p_file
        if p_file.suffix == ".pth":
            path_model = p_file

    LOGGER.info(f"Using configuration specified in {path_cfg}")
    LOGGER.info(f"Using model saved at {path_model}")

    if path_model is None:
        err_msg = "Missing model PTH file"
        LOGGER.error(err_msg)
        raise RuntimeError(err_msg)
    if path_cfg is None:
        err_msg = "Missing configuration JSON file"
        LOGGER.error(err_msg)
        raise RuntimeError(err_msg)

    with open(str(path_cfg)) as fid:
        cfg = CfgNode(json.load(fid))

    cfg.MODEL.WEIGHTS = str(path_model)
    cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

    return DefaultPredictor(cfg)


def input_fn(request_body: BinaryIO, request_content_type: str) -> np.ndarray:
    r"""Parse input data

    Parameters
    ----------
    request_body : BinaryIO
        encoded input image
    request_content_type : str
        type of content

    Returns
    -------
    np.ndarray
        input image

    Raises
    ------
    ValueError
        ValueError if the content type is not `application/x-image`
    """
    if request_content_type == "application/x-image":
        np_image = _load_from_bytearray(request_body)
    else:
        err_msg = f"Type [{request_content_type}] not support this type yet"
        LOGGER.error(err_msg)
        raise ValueError(err_msg)
    return np_image


def predict_fn(input_object: np.ndarray, predictor: DefaultPredictor) -> Mapping:
    r"""Run Detectron2 prediction

    Parameters
    ----------
    input_object : np.ndarray
        input image
    predictor : DefaultPredictor
        Detectron2 default predictor (see Detectron2 documentation for details)

    Returns
    -------
    Mapping
        a dictionary that contains: the image shape (`image_height`, `image_width`), the predicted
        bounding boxes in format x1y1x2y2 (`pred_boxes`), the confidence scores (`scores`) and the
        labels associated with the bounding boxes (`pred_classes`)
    """
    LOGGER.info(f"Prediction on image of shape {input_object.shape}")
    outputs = predictor(input_object)
    fmt_out = {
        "image_height": input_object.shape[0],
        "image_width": input_object.shape[1],
        "pred_boxes": outputs["instances"].pred_boxes.tensor.tolist(),
        "scores": outputs["instances"].scores.tolist(),
        "pred_classes": outputs["instances"].pred_classes.tolist(),
    }
    LOGGER.info(f"Number of detected boxes: {len(fmt_out['pred_boxes'])}")
    return fmt_out


# pylint: disable=unused-argument
def output_fn(predictions, response_content_type):
    r"""Serialize the prediction result into the desired response content type"""
    return json.dumps(predictions)

Can anyone point out the correct format for invoking the model (or how to tweak the code to use the endpoint)? I am thinking of changing the request_content_type to 'application/json', but I am not sure it will help much.

Edit: I tried a solution inspired by this SO thread but it did not work for my case.

It's been a while since you asked this, so I hope you found a solution already, but for people seeing this in the future...

The error appears to be because you are sending the request with the default content type (you neither specified a content type in the request nor set a serializer), but your code is written so that it only responds to requests that come with the content type "application/x-image".

As the traceback shows, the default content type here is "application/x-npy": the SageMaker PyTorchPredictor serializes inputs as numpy arrays by default.

You have two options here: either amend your code to handle the default content type, or add a content-type header with the right value when you invoke the endpoint (two sketches follow the example below). For the latter, change the predict call as below:

instead of:

predictor.predict(input)

try:

predictor.predict(input, initial_args={"ContentType":"application/x-image"})
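For the first option, here is a minimal sketch of how input_fn could be extended; it assumes the sagemaker_inference toolkit (the one that raised the error in your traceback) is available in the serving container:

from sagemaker_inference import decoder

# Amended input_fn for predict_sku110k.py; _load_from_bytearray is the helper
# already defined in that script.
def input_fn(request_body, request_content_type):
    if request_content_type == "application/x-image":
        np_image = _load_from_bytearray(request_body)
    elif request_content_type == "application/x-npy":
        # Decode the numpy payload sent by the default PyTorchPredictor serializer.
        np_image = decoder.decode(request_body, request_content_type)
    else:
        err_msg = f"Type [{request_content_type}] not support this type yet"
        raise ValueError(err_msg)
    return np_image

And, as a sketch assuming SageMaker Python SDK v2, you can also attach a serializer and deserializer to the predictor once, so that every plain predict() call sends the right content type:

from sagemaker.serializers import IdentitySerializer
from sagemaker.deserializers import JSONDeserializer

# Send raw bytes labelled as application/x-image; parse the JSON response.
predictor.serializer = IdentitySerializer(content_type="application/x-image")
predictor.deserializer = JSONDeserializer()

with open("test.jpg", "rb") as f:  # placeholder path to a test image
    predictions = predictor.predict(f.read())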
