I deployed a Hugging Face Transformers model to SageMaker using MLflow's sagemaker.deploy().
When logging the model I used infer_signature(np.array(test_example), loaded_model.predict(test_example)) to infer the input and output signatures.
The model deploys successfully, but when I try to query it I get a ModelError (full traceback below).
To query the model, I am using exactly the same test_example that I used for infer_signature():
test_example = [['This is the subject', 'This is the body']]
The only difference is that when querying the deployed model I do not wrap the test example in np.array(), because that is not JSON-serializable.
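For context, a minimal sketch of why the np.array() wrapping matters here: NumPy stores strings in a fixed-width unicode dtype (the <U... family, the same family as the <U0 in the error message below), so the array passed to infer_signature carried dtype information that a plain JSON list does not:

```python
import numpy as np

test_example = [['This is the subject', 'This is the body']]

# Wrapping the nested list in np.array() produces a fixed-width
# unicode dtype (kind 'U'), which is what infer_signature saw.
arr = np.array(test_example)
print(arr.dtype.kind)  # 'U'
print(arr.dtype)       # <U19 (width of the longest string)
```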
To query the model I tried two different approaches:
import json

import boto3
import pandas as pd

SAGEMAKER_REGION = 'us-west-2'
MODEL_NAME = '...'
client = boto3.client("sagemaker-runtime", region_name=SAGEMAKER_REGION)
# Approach 1
client.invoke_endpoint(
EndpointName=MODEL_NAME,
Body=json.dumps(test_example),
ContentType="application/json",
)
# Approach 2
client.invoke_endpoint(
EndpointName=MODEL_NAME,
Body=pd.DataFrame(test_example).to_json(orient="split"),
ContentType="application/json; format=pandas-split",
)
Both result in the same error.
I would be grateful for any suggestions. Thank you!
Note: I am using Python 3, so all strings are Unicode.
---------------------------------------------------------------------------
ModelError Traceback (most recent call last)
<ipython-input-89-d09862a5f494> in <module>
2 EndpointName=MODEL_NAME,
3 Body=test_example,
----> 4 ContentType="application/json; format=pandas-split",
5 )
~/anaconda3/envs/amazonei_tensorflow_p36/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
393 "%s() only accepts keyword arguments." % py_operation_name)
394 # The "self" in this scope is referring to the BaseClient.
--> 395 return self._make_api_call(operation_name, kwargs)
396
397 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/amazonei_tensorflow_p36/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
723 error_code = parsed_response.get("Error", {}).get("Code")
724 error_class = self.exceptions.from_code(error_code)
--> 725 raise error_class(parsed_response, operation_name)
726 else:
727 return parsed_response
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{"error_code": "BAD_REQUEST", "message": "dtype of input object does not match expected dtype <U0"}". See https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logEventViewer:group=/aws/sagemaker/Endpoints/bec-sagemaker-model-test-app in account 543052680787 for more information.
Environment info:
{'channels': ['defaults', 'conda-forge', 'pytorch'],
'dependencies': ['python=3.6.10',
'pip==21.3.1',
'pytorch=1.10.2',
'cudatoolkit=10.2',
{'pip': ['mlflow==1.22.0',
'transformers==4.17.0',
'datasets==1.18.4',
'cloudpickle==1.3.0']}],
'name': 'bert_bec_test_env'}
As a workaround, I encoded the strings into numbers before sending them to the model, and added code inside the model wrapper that decodes the numbers back into strings. This worked without issues.
To my understanding, this may indicate a problem with MLflow's type checking for strings.
I have filed an issue here: https://github.com/mlflow/mlflow/issues/5474
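A minimal sketch of that workaround (the helper names encode_strings and decode_strings are hypothetical, not the actual code used): each string is mapped to a fixed-width row of integer code points before the request, and the model wrapper reverses the mapping:

```python
import numpy as np

# Hypothetical helper: turn each string into a padded row of integer
# code points, so the payload has a numeric dtype instead of a string one.
def encode_strings(strings, width=64, pad=0):
    rows = [[ord(c) for c in s][:width] for s in strings]
    return np.array([r + [pad] * (width - len(r)) for r in rows])

# Hypothetical counterpart inside the model wrapper: rebuild the strings,
# dropping the padding values.
def decode_strings(encoded, pad=0):
    return [''.join(chr(c) for c in row if c != pad) for row in encoded]

payload = encode_strings(['This is the subject', 'This is the body'])
print(payload.dtype)  # an integer dtype, which JSON-serializes cleanly
assert decode_strings(payload.tolist()) == ['This is the subject',
                                            'This is the body']
```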