Predict batch of images with a SageMaker model

Question

Thanks in advance for your help with this issue. I trained a model on SageMaker. It is a TensorFlow estimator that takes images as input, computes high-level features (i.e. bottlenecks) with InceptionV3, then uses a dense layer to predict new classes.

It kinda works: I can train it, serve it, and predict new images ONE AFTER ANOTHER.

Now I'd like to predict a whole batch of images at once, in a single HTTP call / predict() call. How?

Here is what I do:

import os
import numpy as np
from IPython.display import Image
from keras.preprocessing import image
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(entry_point=..., ...)
estimator.fit(train_data_location)
predictor = estimator.deploy(initial_instance_count=1,
                         instance_type='ml.m4.xlarge')

image_list = [
    'e9bfa679-31bb-464e-9d9f-3bdb0ef9c121.jpeg',  # 131
    'b27880e1-6de8-43cf-a684-bb02aef1e44b.jpeg',  # 170
]
directory = '/path/to/dir/'
images = np.empty((len(image_list), 299, 299, 3), dtype=np.float32)
# for filename in image_list:
for i,filename in enumerate(image_list):
    path = os.path.join(directory, filename)
    Image(path)
    img = image.load_img(path, target_size=(299, 299))
    x = image.img_to_array(img)
    images[i] = x

print(images.shape)
# to_send = images[-1]  # works for a unique image
to_send = images  # doesn't work for a batch of images
# some other attempts that did not work
# to_send = images.tolist()
# to_send = [images[0].tolist(), images[1].tolist()]
print(np.shape(to_send))

predict_response = predictor.predict(to_send)
print('The model predicted the following classes: \n{}'.format(
    predict_response['outputs']['classes']['int64Val']))

This produces the following output:

(2, 299, 299, 3)

(2, 299, 299, 3) # Note the shape of what I send. So why does it complain about the shape [1,2,299,299,3] in the logs below?

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "". See https://eu-west-1.console.aws.amazon.com/cloudwatch/home?region=eu-west-1#logEventViewer:group=/aws/sagemaker/Endpoints/sagemaker-tensorflow-py2-cpu-2018-02-05-16-48-38-496 in account 047562184710 for more information

So here are the logs from AWS:

# [2018-02-06 09:29:20,937] ERROR in serving: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="input must be 4-dimensional[1,2,299,299,3]
# #011 [[Node: ResizeBilinear = ResizeBilinear[T=DT_FLOAT, _output_shapes=[[1,299,299,3]], align_corners=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ExpandDims, ResizeBilinear/size)]]")
# Traceback (most recent call last):
# File "/opt/amazon/lib/python2.7/site-packages/container_support/serving.py", line 161, in _invoke
# self.transformer.transform(content, input_content_type, requested_output_content_type)
# File "/opt/amazon/lib/python2.7/site-packages/tf_container/serve.py", line 255, in transform
# return self.transform_fn(data, content_type, accepts), accepts
# File "/opt/amazon/lib/python2.7/site-packages/tf_container/serve.py", line 180, in f
# prediction = self.predict_fn(input)
# File "/opt/amazon/lib/python2.7/site-packages/tf_container/serve.py", line 195, in predict_fn
# return self.proxy_client.request(data)
# File "/opt/amazon/lib/python2.7/site-packages/tf_container/proxy_client.py", line 51, in request
# return request_fn(data)
# File "/opt/amazon/lib/python2.7/site-packages/tf_container/proxy_client.py", line 79, in predict
# result = stub.Predict(request, self.request_timeout)
# File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 310, in __call__
# self._request_serializer, self._response_deserializer)
# File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 196, in _blocking_unary_unary
# raise _abortion_error(rpc_error_call)
# AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="input must be 4-dimensional[1,2,299,299,3]
# #011 [[Node: ResizeBilinear = ResizeBilinear[T=DT_FLOAT, _output_shapes=[[1,299,299,3]], align_corners=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ExpandDims, ResizeBilinear/size)]]")
# 2018-02-06 09:29:20,937 ERROR - model server - AbortionError(code=StatusCode.INVALID_ARGUMENT, details="input must be 4-dimensional[1,2,299,299,3]
# #011 [[Node: ResizeBilinear = ResizeBilinear[T=DT_FLOAT, _output_shapes=[[1,299,299,3]], align_corners=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ExpandDims, ResizeBilinear/size)]]")
# Traceback (most recent call last):
# File "/opt/amazon/lib/python2.7/site-packages/container_support/serving.py", line 161, in _invoke
# self.transformer.transform(content, input_content_type, requested_output_content_type)
# File "/opt/amazon/lib/python2.7/site-packages/tf_container/serve.py", line 255, in transform
# return self.transform_fn(data, content_type, accepts), accepts
# File "/opt/amazon/lib/python2.7/site-packages/tf_container/serve.py", line 180, in f
# prediction = self.predict_fn(input)
# File "/opt/amazon/lib/python2.7/site-packages/tf_container/serve.py", line 195, in predict_fn
# return self.proxy_client.request(data)
# File "/opt/amazon/lib/python2.7/site-packages/tf_container/proxy_client.py", line 51, in request
# return request_fn(data)
# File "/opt/amazon/lib/python2.7/site-packages/tf_container/proxy_client.py", line 79, in predict
# result = stub.Predict(request, self.request_timeout)
# File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 310, in __call__
# self._request_serializer, self._response_deserializer)
# File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 196, in _blocking_unary_unary
# raise _abortion_error(rpc_error_call)
# AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="input must be 4-dimensional[1,2,299,299,3]
# #011 [[Node: ResizeBilinear = ResizeBilinear[T=DT_FLOAT, _output_shapes=[[1,299,299,3]], align_corners=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ExpandDims, ResizeBilinear/size)]]")
# [2018-02-06 09:29:20,956] ERROR in serving: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="input must be 4-dimensional[1,2,299,299,3]
# #011 [[Node: ResizeBilinear = ResizeBilinear[T=DT_FLOAT, _output_shapes=[[1,299,299,3]], align_corners=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ExpandDims, ResizeBilinear/size)]]")
# 2018-02-06 09:29:20,956 ERROR - model server - AbortionError(code=StatusCode.INVALID_ARGUMENT, details="input must be 4-dimensional[1,2,299,299,3]
# #011 [[Node: ResizeBilinear = ResizeBilinear[T=DT_FLOAT, _output_shapes=[[1,299,299,3]], align_corners=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ExpandDims, ResizeBilinear/size)]]")
# 10.32.0.1 - - [06/Feb/2018:09:29:20 +0000] "POST /invocations HTTP/1.1" 500 0 "-" "AHC/2.0"
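
One possible reading of this log, offered as an assumption rather than a confirmed diagnosis: the ResizeBilinear node takes its input from an ExpandDims op, so the serving graph seems to add a batch axis itself to what it expects to be a single image of shape (299, 299, 3). Sending a batch of shape (2, 299, 299, 3) would then yield the 5-D shape quoted in the error. The sketch below mimics only the shape arithmetic in plain Python; `expand_dims_shape` is a hypothetical helper, not a TensorFlow call.

```python
def expand_dims_shape(shape, axis=0):
    """Return the shape obtained by inserting a length-1 axis at `axis`.

    Hypothetical helper that mirrors only the shape effect of an
    ExpandDims op; it does not touch any real tensor.
    """
    shape = list(shape)
    shape.insert(axis, 1)
    return tuple(shape)


single_image = (299, 299, 3)      # what the graph appears to expect
batch_of_two = (2, 299, 299, 3)   # what the client sent

# A single image becomes 4-D, which ResizeBilinear accepts.
print(expand_dims_shape(single_image))
# A batch becomes 5-D, matching the shape quoted in the error message.
print(expand_dims_shape(batch_of_two))
```

If this reading is right, the client-side array is fine and the extra dimension comes from the exported graph, which would explain why reshaping or calling tolist() on the client does not help.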

BTW, I experience the same problem if I load the images with PIL instead of Keras:

image = Image.open(path)
image_array = np.array(image)

And here is the code on the server side:

def serving_input_fn(params):
    """ See https://www.tensorflow.org/programmers_guide/
    saved_model#using_savedmodel_with_estimators
    and
    See https://github.com/aws/sagemaker-python-sdk#creating-a-serving_input_fn
    and https://docs.aws.amazon.com/sagemaker/latest/dg/
    tf-training-inference-code-template.html
    """
    # Download InceptionV3 if need be, in order to
    # compute high level features (called bottleneck here),
    # which are then fed into the model
    model_dir = './pretrained_model/'
    maybe_download_and_extract(params['data_url'],
                               dest_directory=model_dir)
    model_path = os.path.join(model_dir, params['model_file_name'])
    with tf.gfile.FastGFile(model_path, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        bottleneck_tensor, resized_input_tensor, input_tensor = (
            tf.import_graph_def(
                graph_def,
                name='',
                input_map=None,
                return_elements=[
                    params['bottleneck_tensor_name'],
                    params['resized_input_tensor_name'],
                    'DecodeJpeg:0',
                ]))
    return tf.estimator.export.ServingInputReceiver(bottleneck_tensor, {
        INPUT_TENSOR_NAME: input_tensor
    })
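
Since the question states that single-image requests already work against this serving function, one fallback is to call predict() once per image and collect the responses. This does not achieve the single HTTP call asked for; it is only a minimal sketch of the per-image loop. `predict_each` and `EchoShapePredictor` are hypothetical names: the latter is a stand-in so the loop can run without a deployed endpoint.

```python
def predict_each(predictor, images):
    """Call predictor.predict once per image and collect the responses.

    `predictor` is any object with a predict(data) method, for example
    the one returned by estimator.deploy(...) in the question.
    """
    return [predictor.predict(image) for image in images]


class EchoShapePredictor:
    """Stand-in used only to exercise the loop without an endpoint."""

    def predict(self, data):
        # Report the number of rows instead of running a real model.
        return {'rows': len(data)}


# Two tiny fake "images" (4 rows x 3 values), not real 299x299x3 data.
fake_images = [[[0.0] * 3] * 4, [[0.0] * 3] * 4]
print(predict_each(EchoShapePredictor(), fake_images))
```

With the real predictor, each element of the returned list would be one response of the kind indexed by ['outputs']['classes']['int64Val'] in the client code.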

Answer 1

SageMaker now supports batch predictions, which would probably be an easier way to get this done. More info at: https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-batch.html

In a nutshell:

To perform a batch transform, create a transform job, which includes the following information:

The path to the S3 bucket where you've stored the data to transform.
The compute resources that you want Amazon SageMaker to use for the transform job. Compute resources are ML compute instances that are managed by Amazon SageMaker.
The path to the S3 bucket where you want to store the output of the job.
The name of the model that you want to use in the transform job.
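
The four items above map onto the request parameters of the boto3 SageMaker client's create_transform_job call. The following is a minimal sketch under stated assumptions, not a tested recipe for this particular model: the job name, model name, and bucket paths are placeholders, and the content type 'application/x-image' is a guess that must match whatever the model's serving code actually accepts. `build_transform_job_request` is a hypothetical helper that only assembles the request dictionary.

```python
def build_transform_job_request(job_name, model_name, input_s3_uri,
                                output_s3_uri,
                                instance_type='ml.m4.xlarge',
                                instance_count=1,
                                content_type='application/x-image'):
    """Assemble keyword arguments for create_transform_job.

    Hypothetical helper; content_type is an assumption and must match
    what the deployed model's serving code accepts.
    """
    return {
        'TransformJobName': job_name,
        # Name of the model to use in the transform job.
        'ModelName': model_name,
        # S3 location of the data to transform.
        'TransformInput': {
            'DataSource': {
                'S3DataSource': {
                    'S3DataType': 'S3Prefix',
                    'S3Uri': input_s3_uri,
                }
            },
            'ContentType': content_type,
        },
        # S3 location where the job output is stored.
        'TransformOutput': {'S3OutputPath': output_s3_uri},
        # ML compute instances managed by SageMaker.
        'TransformResources': {
            'InstanceType': instance_type,
            'InstanceCount': instance_count,
        },
    }


request = build_transform_job_request(
    job_name='inception-batch-demo',              # placeholder
    model_name='my-sagemaker-model',              # placeholder
    input_s3_uri='s3://my-bucket/images/',        # placeholder
    output_s3_uri='s3://my-bucket/predictions/',  # placeholder
)
print(request)

# To submit (requires AWS credentials and an existing SageMaker model):
# import boto3
# boto3.client('sagemaker').create_transform_job(**request)
```

The submission lines are left commented out so the sketch runs without AWS access; the results of a real job are written under the output S3 path rather than returned to the caller.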

solution1
1 2018-07-23 21:46:49
