
How do I predict from a SageMaker endpoint using a pandas dataframe?

I'm trying to use the models that I created with Autopilot in SageMaker Studio, but I keep getting different errors. Ultimately I want it to be simple: take a pandas dataframe and predict an output from it. Here's what I have so far, followed by the error I am getting.

import sagemaker, boto3, os
import pandas as pd  # needed for pd.read_csv below
bucket = sagemaker.Session().default_bucket()

model = sagemaker.predictor.Predictor('Predict-Low', sagemaker_session=sagemaker.Session())

df = pd.read_csv('s3://sagemaker-studio-xxx/Sagemaker Data Predict Low.csv')

y = df['Low']
del df['Low']

y_hat = model.predict(df)

---------------------------------------------------------------------------
ParamValidationError                      Traceback (most recent call last)
<ipython-input-43-18ff980cf441> in <module>
----> 1 y_hat = model.predict(df)

/opt/conda/lib/python3.7/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model, target_variant, inference_id)
    134             data, initial_args, target_model, target_variant, inference_id
    135         )
--> 136         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
    137         return self._handle_response(response)
    138 

/opt/conda/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    384                     "%s() only accepts keyword arguments." % py_operation_name)
    385             # The "self" in this scope is referring to the BaseClient.
--> 386             return self._make_api_call(operation_name, kwargs)
    387 
    388         _api_call.__name__ = str(py_operation_name)

/opt/conda/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    676         }
    677         request_dict = self._convert_to_request_dict(
--> 678             api_params, operation_model, context=request_context)
    679 
    680         service_id = self._service_model.service_id.hyphenize()

/opt/conda/lib/python3.7/site-packages/botocore/client.py in _convert_to_request_dict(self, api_params, operation_model, context)
    724             api_params, operation_model, context)
    725         request_dict = self._serializer.serialize_to_request(
--> 726             api_params, operation_model)
    727         if not self._client_config.inject_host_prefix:
    728             request_dict.pop('host_prefix', None)

/opt/conda/lib/python3.7/site-packages/botocore/validate.py in serialize_to_request(self, parameters, operation_model)
    317                                                     operation_model.input_shape)
    318             if report.has_errors():
--> 319                 raise ParamValidationError(report=report.generate_report())
    320         return self._serializer.serialize_to_request(parameters,
    321                                                      operation_model)

ParamValidationError: Parameter validation failed:
Invalid type for parameter Body

To me it seems like it wants a string of bytes to do the prediction, so that's what I tried: I converted the dataframe to a string of bytes and still got an error. Does anyone know what I'm doing wrong?
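For context on the "Invalid type for parameter Body" message: the request body has to be raw data such as bytes (or a file-like object), not a DataFrame. Below is a minimal, standard-library-only sketch of what a header-less CSV body could look like. The names rows_to_csv_body and sample_rows are hypothetical, the rows are only loosely modelled on the data in this question, and whether a given endpoint expects exactly this layout is an assumption.

```python
import csv
import io


def rows_to_csv_body(rows):
    """Serialize an iterable of rows into header-less CSV bytes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


# Two illustrative feature rows (the target column 'Low' is left out).
sample_rows = [
    ["7/13/2020", "LIFE", 4.38, 4.21, 3.88, 62400],
    ["7/14/2020", "LIFE", 4.21, 3.95, 4.16, 80800],
]

body = rows_to_csv_body(sample_rows)
print(body)
```

With pandas, a comparable payload can be produced with df.to_csv(header=False, index=False).encode("utf-8"). A bytes body on its own may still be rejected if the endpoint is not told that the content is CSV.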

By the way, this is all being done in SageMaker Studio. Here is the data.

     Date         Company   High    Low  Open  Close  Volume  Adj Close  \
0    7/13/2020    LIFE  4.380  3.880  4.21   3.88   62400       3.88   
1    7/14/2020    LIFE  4.210  3.721  3.95   4.16   80800       4.16   
2    7/15/2020    LIFE  4.550  4.053  4.17   4.50  212500       4.50   
3    7/16/2020    LIFE  4.550  4.350  4.40   4.51   44600       4.51   
4    7/17/2020    LIFE  5.170  4.410  4.54   5.09  257700       5.09   
..         ...     ...    ...    ...   ...    ...     ...        ...   
255  7/16/2021    LIFE  4.590  4.440  4.46   4.50  156300       4.50   
256  7/19/2021    LIFE  4.490  4.220  4.36   4.22  211700       4.22   
257  7/20/2021    LIFE  4.546  4.230  4.23   4.47  212500       4.47   
258  7/21/2021    LIFE  4.800  4.369  4.45   4.48  487500       4.48   
259  7/22/2021    LIFE  4.510  4.260  4.44   4.45  235200       4.45   

          Sector                                          Specifics  \
0    Health Care  Biotechnology: Biological Products (No Diagnos...   
1    Health Care  Biotechnology: Biological Products (No Diagnos...   
2    Health Care  Biotechnology: Biological Products (No Diagnos...   
3    Health Care  Biotechnology: Biological Products (No Diagnos...   
4    Health Care  Biotechnology: Biological Products (No Diagnos...   
..           ...                                                ...   
255  Health Care  Biotechnology: Biological Products (No Diagnos...   
256  Health Care  Biotechnology: Biological Products (No Diagnos...   
257  Health Care  Biotechnology: Biological Products (No Diagnos...   
258  Health Care  Biotechnology: Biological Products (No Diagnos...   
259  Health Care  Biotechnology: Biological Products (No Diagnos...   

     Open Difference from Yesterday  Yesterday Open to Low  \
0                              0.00                  0.000   
1                             -0.26                  0.330   
2                              0.22                  0.229   
3                              0.23                  0.117   
4                              0.14                  0.050   
..                              ...                    ...   
255                            0.01                  0.080   
256                           -0.10                  0.020   
257                           -0.13                  0.140   
258                            0.22                  0.000   
259                           -0.01                  0.081   

     Yesterday Open to High  Yesterday Open to Adj Close  
0                     0.000                         0.00  
1                     0.170                        -0.33  
2                     0.260                         0.21  
3                     0.380                         0.33  
4                     0.150                         0.11  
..                      ...                          ...  
255                   0.100                         0.00  
256                   0.130                         0.04  
257                   0.130                        -0.14  
258                   0.316                         0.24  
259                   0.350                         0.03 

I found out that you need to specify a serializer for your model in order to make predictions. Adding this code before model.predict(...) will do it.

from sagemaker.serializers import CSVSerializer
model.serializer = CSVSerializer()
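Putting the pieces together, here is a hedged sketch of the whole flow, assuming version 2 of the SageMaker Python SDK and an endpoint that accepts header-less CSV without the target column. The helper names make_csv_predictor and predict_dataframe are hypothetical. Converting the dataframe to a CSV string explicitly is a design choice here, so the example does not depend on how the serializer treats a DataFrame object.

```python
def make_csv_predictor(endpoint_name):
    """Build a Predictor that sends and receives CSV.

    Requires the sagemaker SDK (v2) and AWS credentials, so the imports
    are kept local to this function.
    """
    from sagemaker.deserializers import CSVDeserializer
    from sagemaker.predictor import Predictor
    from sagemaker.serializers import CSVSerializer

    return Predictor(
        endpoint_name,
        serializer=CSVSerializer(),      # sets the request content type to text/csv
        deserializer=CSVDeserializer(),  # parses the CSV response into rows
    )


def predict_dataframe(predictor, features):
    """Send a dataframe of features as header-less CSV and return the response."""
    # Assumption: the endpoint expects no header row and no index column.
    payload = features.to_csv(header=False, index=False)
    return predictor.predict(payload)


# Usage in SageMaker Studio (not executed here; needs a live endpoint):
#   import pandas as pd
#   df = pd.read_csv('s3://sagemaker-studio-xxx/Sagemaker Data Predict Low.csv')
#   model = make_csv_predictor('Predict-Low')
#   y_hat = predict_dataframe(model, df.drop(columns=['Low']))
```

With CSVDeserializer the predictions come back as rows of strings, so they would still need converting to floats before comparing against y.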
