How do I predict from a SageMaker endpoint using a pandas dataframe?
I'm trying to use models that I created with Autopilot in SageMaker Studio, but I keep getting different errors. Ultimately I want it to be simple: take a pandas dataframe and predict an output from it. Here's what I have so far, followed by the error I'm getting.
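import sagemaker, boto3, os
import pandas as pd

bucket = sagemaker.Session().default_bucket()
# Attach to the already-deployed endpoint by name
model = sagemaker.predictor.Predictor('Predict-Low', sagemaker_session=sagemaker.Session())
df = pd.read_csv('s3://sagemaker-studio-xxx/Sagemaker Data Predict Low.csv')
# Separate the target column from the features
y = df['Low']
del df['Low']
y_hat = model.predict(df)  # raises the ParamValidationError shown below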
---------------------------------------------------------------------------
ParamValidationError Traceback (most recent call last)
<ipython-input-43-18ff980cf441> in <module>
----> 1 y_hat = model.predict(df)
/opt/conda/lib/python3.7/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model, target_variant, inference_id)
134 data, initial_args, target_model, target_variant, inference_id
135 )
--> 136 response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
137 return self._handle_response(response)
138
/opt/conda/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
384 "%s() only accepts keyword arguments." % py_operation_name)
385 # The "self" in this scope is referring to the BaseClient.
--> 386 return self._make_api_call(operation_name, kwargs)
387
388 _api_call.__name__ = str(py_operation_name)
/opt/conda/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
676 }
677 request_dict = self._convert_to_request_dict(
--> 678 api_params, operation_model, context=request_context)
679
680 service_id = self._service_model.service_id.hyphenize()
/opt/conda/lib/python3.7/site-packages/botocore/client.py in _convert_to_request_dict(self, api_params, operation_model, context)
724 api_params, operation_model, context)
725 request_dict = self._serializer.serialize_to_request(
--> 726 api_params, operation_model)
727 if not self._client_config.inject_host_prefix:
728 request_dict.pop('host_prefix', None)
/opt/conda/lib/python3.7/site-packages/botocore/validate.py in serialize_to_request(self, parameters, operation_model)
317 operation_model.input_shape)
318 if report.has_errors():
--> 319 raise ParamValidationError(report=report.generate_report())
320 return self._serializer.serialize_to_request(parameters,
321 operation_model)
ParamValidationError: Parameter validation failed:
Invalid type for parameter Body
To me it seems like it wants a string of bytes to do the prediction, so that's what I tried: I converted the dataframe to a string of bytes and still got an error. Does anyone know what I'm doing wrong?
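For reference, a conversion of this kind might look like the sketch below. It assumes the endpoint expects CSV rows with no header and no index column, which is the usual input format for Autopilot endpoints. The helper name dataframe_to_csv_bytes and the tiny sample frame are made up for illustration.

```python
import pandas as pd


def dataframe_to_csv_bytes(df: pd.DataFrame) -> bytes:
    """Encode a DataFrame as CSV bytes without the header row or the index."""
    return df.to_csv(header=False, index=False).encode("utf-8")


# Small stand-in frame; the column names mirror a few of the real ones.
sample = pd.DataFrame(
    {"High": [4.38, 4.21], "Open": [4.21, 3.95], "Volume": [62400, 80800]}
)
payload = dataframe_to_csv_bytes(sample)
print(payload)
```

Bytes on their own may not be enough. If no serializer is set, the Predictor likely sends the body with a generic binary content type rather than text/csv, and the model container may reject it for that reason.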
By the way, this is all being done in SageMaker Studio. Here is the data.
Date Company High Low Open Close Volume Adj Close \
0 7/13/2020 LIFE 4.380 3.880 4.21 3.88 62400 3.88
1 7/14/2020 LIFE 4.210 3.721 3.95 4.16 80800 4.16
2 7/15/2020 LIFE 4.550 4.053 4.17 4.50 212500 4.50
3 7/16/2020 LIFE 4.550 4.350 4.40 4.51 44600 4.51
4 7/17/2020 LIFE 5.170 4.410 4.54 5.09 257700 5.09
.. ... ... ... ... ... ... ... ...
255 7/16/2021 LIFE 4.590 4.440 4.46 4.50 156300 4.50
256 7/19/2021 LIFE 4.490 4.220 4.36 4.22 211700 4.22
257 7/20/2021 LIFE 4.546 4.230 4.23 4.47 212500 4.47
258 7/21/2021 LIFE 4.800 4.369 4.45 4.48 487500 4.48
259 7/22/2021 LIFE 4.510 4.260 4.44 4.45 235200 4.45
Sector Specifics \
0 Health Care Biotechnology: Biological Products (No Diagnos...
1 Health Care Biotechnology: Biological Products (No Diagnos...
2 Health Care Biotechnology: Biological Products (No Diagnos...
3 Health Care Biotechnology: Biological Products (No Diagnos...
4 Health Care Biotechnology: Biological Products (No Diagnos...
.. ... ...
255 Health Care Biotechnology: Biological Products (No Diagnos...
256 Health Care Biotechnology: Biological Products (No Diagnos...
257 Health Care Biotechnology: Biological Products (No Diagnos...
258 Health Care Biotechnology: Biological Products (No Diagnos...
259 Health Care Biotechnology: Biological Products (No Diagnos...
Open Difference from Yesterday Yesterday Open to Low \
0 0.00 0.000
1 -0.26 0.330
2 0.22 0.229
3 0.23 0.117
4 0.14 0.050
.. ... ...
255 0.01 0.080
256 -0.10 0.020
257 -0.13 0.140
258 0.22 0.000
259 -0.01 0.081
Yesterday Open to High Yesterday Open to Adj Close
0 0.000 0.00
1 0.170 -0.33
2 0.260 0.21
3 0.380 0.33
4 0.150 0.11
.. ... ...
255 0.100 0.00
256 0.130 0.04
257 0.130 -0.14
258 0.316 0.24
259 0.350 0.03
So I found out that you need to specify a serializer for your model in order to make predictions. Adding this code before model.predict(...) will do it.
from sagemaker.serializers import CSVSerializer
model.serializer = CSVSerializer()
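Putting it together, here is a minimal sketch of a helper that takes a dataframe and returns predictions. It rests on several assumptions: the predictor already has CSVSerializer set as above, the serializer passes a ready-made CSV string through unchanged, and the endpoint replies with one prediction per line as CSV text. The function name predict_dataframe is hypothetical.

```python
import io

import pandas as pd


def predict_dataframe(predictor, features: pd.DataFrame) -> pd.DataFrame:
    """Send feature rows as header-less CSV and parse the CSV reply."""
    # Drop the header and index so only raw feature values are sent.
    payload = features.to_csv(header=False, index=False)
    raw = predictor.predict(payload)
    # Without a deserializer set, the response usually arrives as bytes.
    text = raw.decode("utf-8") if isinstance(raw, bytes) else str(raw)
    return pd.read_csv(io.StringIO(text), header=None)


# Usage against the real endpoint (not run here, needs AWS credentials):
#   from sagemaker.predictor import Predictor
#   from sagemaker.serializers import CSVSerializer
#   model = Predictor('Predict-Low', serializer=CSVSerializer())
#   y_hat = predict_dataframe(model, df)
```

Converting the dataframe to CSV text yourself keeps the payload format explicit and does not rely on how the serializer treats a DataFrame object, which may differ between SDK versions.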