SageMaker Model Deployment Error, ClientError: An error occurred (ValidationException) when calling the CreateModel operation
I am trying to deploy a model with AWS SageMaker using SKLearn, and I am getting this error:
---------------------------------------------------------------------------
ClientError Traceback (most recent call last)
<ipython-input-145-29a1d3175b01> in <module>
----> 1 deployment = model.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/estimator.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, use_compiled_model, wait, model_name, kms_key, data_capture_config, tags, serverless_inference_config, async_inference_config, **kwargs)
1254 kms_key=kms_key,
1255 data_capture_config=data_capture_config,
-> 1256 serverless_inference_config=serverless_inference_config,
1257 async_inference_config=async_inference_config,
1258 )
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, **kwargs)
1001 self._base_name = "-".join((self._base_name, compiled_model_suffix))
1002
-> 1003 self._create_sagemaker_model(
1004 instance_type, accelerator_type, tags, serverless_inference_config
1005 )
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/model.py in _create_sagemaker_model(self, instance_type, accelerator_type, tags, serverless_inference_config)
548 container_def,
549 vpc_config=self.vpc_config,
--> 550 enable_network_isolation=enable_network_isolation,
551 tags=tags,
552 )
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/session.py in create_model(self, name, role, container_defs, vpc_config, enable_network_isolation, primary_container, tags)
2670
2671 try:
-> 2672 self.sagemaker_client.create_model(**create_model_request)
2673 except ClientError as e:
2674 error_code = e.response["Error"]["Code"]
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
413 "%s() only accepts keyword arguments." % py_operation_name)
414 # The "self" in this scope is referring to the BaseClient.
--> 415 return self._make_api_call(operation_name, kwargs)
416
417 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
743 error_code = parsed_response.get("Error", {}).get("Code")
744 error_class = self.exceptions.from_code(error_code)
--> 745 raise error_class(parsed_response, operation_name)
746 else:
747 return parsed_response
ClientError: An error occurred (ValidationException) when calling the CreateModel operation: Could not find model data at s3://sagemaker-us-east-2-978433479050/sagemaker-scikit-learn-2022-04-28-22-33-14-817/output/model.tar.gz.
The code I am running is:
from sagemaker import Session, get_execution_role
from sagemaker.sklearn.estimator import SKLearn

sagemaker_session = Session()
role = get_execution_role()

train_input = sagemaker_session.upload_data("TSLA.csv")

model = SKLearn(entry_point='lr.py',
                train_instance_type='ml.m4.xlarge',
                role=role, framework_version='0.23-1',
                sagemaker_session=sagemaker_session)

model.fit({'train': train_input})

deployment = model.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")
And train_input is: s3://sagemaker-us-east-2-978433479050/data/TSLA.csv
The training job completed, but for some reason the model is not deploying.
Please advise. Thank you.
The logs indicate that your trained model artifact was not captured properly. Please run:
model.model_data  # model is the estimator you trained
This returns the S3 URI where the training job was expected to write its artifact (model.tar.gz), so you can check whether that file was actually created.
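To confirm whether an object really exists at the reported location, one option is to ask S3 directly. The following is a minimal sketch, assuming boto3 is available in the notebook and the execution role is allowed to read the bucket; the helper names split_s3_uri and artifact_exists are made up for illustration and are not part of the SageMaker SDK.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into a (bucket, key) tuple."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def artifact_exists(uri):
    """Return True if an object exists at the given S3 URI (hypothetical helper)."""
    # boto3 is imported here so that split_s3_uri can be used without it.
    import boto3
    from botocore.exceptions import ClientError

    bucket, key = split_s3_uri(uri)
    try:
        boto3.client("s3").head_object(Bucket=bucket, Key=key)
        return True
    except ClientError as err:
        # A missing object is reported as a 404; anything else
        # (for example a 403 permission error) is re-raised.
        if err.response["Error"]["Code"] in ("404", "NoSuchKey", "NotFound"):
            return False
        raise
```

With the estimator from the question, artifact_exists(model.model_data) returning False would suggest that the training job finished without uploading model.tar.gz, which matches the "Could not find model data" message.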
Here is an example of training and deploying a sklearn model: https://github.com/RamVegiraju/SageMaker-Deployment/tree/master/RealTime/Script-Mode/Sklearn/Regression
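A common reason for a missing model.tar.gz is that the training script never writes anything into the model directory: SageMaker builds the archive from whatever the script saves under the path in the SM_MODEL_DIR environment variable. The lr.py from the question is not shown, so the following is only a sketch of what such a script might contain. The file name model.joblib and the column names Open and Close are assumptions made for illustration, not details taken from the question.

```python
import argparse
import os


def parse_args(argv=None):
    """Read the directories that SageMaker exposes to the training container."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model-dir",
        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"),
    )
    parser.add_argument(
        "--train",
        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"),
    )
    # Ignore any extra hyperparameters passed on the command line.
    return parser.parse_known_args(argv)[0]


def train(train_dir, model_dir):
    """Fit a model and save it where SageMaker will pick it up."""
    # Heavy imports are kept local to the functions that need them.
    import joblib
    import pandas as pd
    from sklearn.linear_model import LinearRegression

    df = pd.read_csv(os.path.join(train_dir, "TSLA.csv"))
    # Assumed column names; adjust to the actual CSV header.
    X, y = df[["Open"]], df["Close"]
    model = LinearRegression().fit(X, y)

    # Without this dump, model_dir stays empty and no artifact is produced.
    path = os.path.join(model_dir, "model.joblib")
    joblib.dump(model, path)
    return path


def model_fn(model_dir):
    """Called by the scikit-learn serving container to load the model."""
    import joblib

    return joblib.load(os.path.join(model_dir, "model.joblib"))
```

In a real script, the entry point would call parse_args() and then train(args.train, args.model_dir) under an if __name__ == "__main__": guard, so that training runs when the script is executed but not when the serving container imports it to find model_fn.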