Sagemaker: read-only file system: /opt/ml/models/../config.json when invoking endpoint
I'm trying to create a multi-model endpoint with SageMaker, doing the following:
import boto3
import sagemaker
from sagemaker.huggingface import HuggingFaceModel
from sagemaker.multidatamodel import MultiDataModel

boto_session = boto3.session.Session(region_name='us-east-1')
sess = sagemaker.Session(boto_session=boto_session)
iam = boto3.client('iam')
role = iam.get_role(RoleName='sagemaker-role')['Role']['Arn']

huggingface_model = HuggingFaceModel(model_data='s3://bucket/path/model.tar.gz',
                                     transformers_version="4.12.3",
                                     pytorch_version="1.9.1",
                                     py_version='py38',
                                     role=role,
                                     sagemaker_session=sess)

mme = MultiDataModel(name='model-name',
                     model_data_prefix='s3://bucket/path/',
                     model=huggingface_model,
                     sagemaker_session=sess)

predictor = mme.deploy(initial_instance_count=1, instance_type="ml.t2.medium")
If I try to predict:
predictor.predict({"inputs": "test"}, target_model="model.tar.gz")
I get the following error:
{ModelError}An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "[Errno 30] Read-only file system: '/opt/ml/models/d8379026esds430426d32321a85878f6b/model/config.json'"
}
If I instead deploy a single model through `HuggingFaceModel`:
huggingface_model = HuggingFaceModel(model_data='s3://bucket/path/model.tar.gz',
                                     transformers_version="4.12.3",
                                     pytorch_version="1.9.1",
                                     py_version='py38',
                                     role=role,
                                     sagemaker_session=sess)

predictor = huggingface_model.deploy(initial_instance_count=1, instance_type="ml.t2.medium")
then `predict` works normally with no error.
So I was wondering: what could be the reason that I get 'read-only' on the MultiDataModel deploy?

Thanks in advance.
Hey Mpizos, do you have any logs from CloudWatch? Also, one thing I noticed: for the MultiDataModel you are specifying a specific model.tar.gz, as shown in the following code.
huggingface_model = HuggingFaceModel(model_data='s3://bucket/path/model.tar.gz',
                                     transformers_version="4.12.3",
                                     pytorch_version="1.9.1",
                                     py_version='py38',
                                     role=role,
                                     sagemaker_session=sess)
For MME, the model data needs to be a `bucket/prefix/` (or just a `bucket/`) that contains the multiple model.tar.gz archives for the different models. Maybe adjust this so the path covers all the models, and let me know if that resolves your issue. Another option is utilizing Boto3 for MME deployment; this is lower level and gives more granularity into any issues. Please see the following example: https://github.com/RamVegiraju/SageMaker-Deployment/tree/master/RealTime/Multi-Model-Endpoint/Pre-Trained-Deployment
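If you go the Boto3 route, the `create_model` request for a multi-model endpoint differs from a single-model one mainly in the container's `Mode` field and in `ModelDataUrl` pointing at an S3 prefix rather than one archive. Here is a minimal sketch of building that request; the helper name and the placeholder bucket, image URI, and role ARN are illustrative, not from the question:

```python
def build_mme_model_request(model_name, image_uri, model_data_prefix, role_arn):
    """Build a boto3 CreateModel request for a multi-model endpoint.

    model_data_prefix must be an S3 prefix (ending in '/') that holds
    the individual model.tar.gz archives, not a single archive.
    """
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,
            "ModelDataUrl": model_data_prefix,
            "Mode": "MultiModel",  # single-model endpoints use "SingleModel"
        },
        "ExecutionRoleArn": role_arn,
    }

# You would then create the model, endpoint config, and endpoint, e.g.:
#   sm_client = boto3.client("sagemaker")
#   sm_client.create_model(**build_mme_model_request(
#       "model-name",
#       "<hf-inference-container-image-uri>",
#       "s3://bucket/path/",
#       role))
```

At invoke time, the `TargetModel` field of `invoke_endpoint` (or `target_model` in the SDK's `predict`) names the specific archive under that prefix, which the endpoint loads on demand.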