I have trained a fastai (v1) model and exported it as a .pkl file. Now I want to deploy this model for inference in Amazon SageMaker.
I am following the SageMaker documentation for deploying a PyTorch model: https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#write-an-inference-script
Steps taken:

Folder structure:

    Sagemaker/
        export.pkl
        code/
            inference.py
            requirement.txt

requirement.txt:

    spacy==2.3.4
    torch==1.4.0
    torchvision==0.5.0
    fastai==1.0.60
    numpy
Command I used to create the archive:

    cd Sagemaker/
    tar -czvf /tmp/model.tar.gz ./export.pkl ./code
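As a sanity check, the packaging step above can be sketched with Python's standard tarfile module. The paths here are illustrative stand-ins, not the real project files; note that the SageMaker PyTorch serving container looks for a file named requirements.txt (plural) inside code/, so that spelling is used in the sketch:

```python
import os
import tarfile
import tempfile

# Recreate the layout with placeholder (empty) files for illustration.
workdir = tempfile.mkdtemp()
os.makedirs(os.path.join(workdir, "code"))
open(os.path.join(workdir, "export.pkl"), "wb").close()
open(os.path.join(workdir, "code", "inference.py"), "w").close()
# The serving container installs code/requirements.txt (plural);
# a file named requirement.txt is likely to be ignored.
open(os.path.join(workdir, "code", "requirements.txt"), "w").close()

archive = os.path.join(workdir, "model.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    # export.pkl must sit at the archive root, with code/ next to it,
    # mirroring the tar command above.
    tar.add(os.path.join(workdir, "export.pkl"), arcname="export.pkl")
    tar.add(os.path.join(workdir, "code"), arcname="code")

with tarfile.open(archive, "r:gz") as tar:
    names = tar.getnames()
print(names)
```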
This generates a model.tar.gz file, which I upload to an S3 bucket.
To deploy it, I used the SageMaker Python SDK:
    from sagemaker.pytorch import PyTorchModel

    role = "sagemaker-role-arn"
    model_path = "s3 key for the model.tar.gz file that I created above"
    pytorch_model = PyTorchModel(model_data=model_path,
                                 role=role,
                                 entry_point='inference.py',
                                 framework_version="1.4.0",
                                 py_version="py3")
    predictor = pytorch_model.deploy(instance_type='ml.c5.large', initial_instance_count=1)
After executing the above code, I can see that the model is created and deployed in SageMaker, but I get an error when running inference:
botocore.errorfactory.ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from primary with message "No module named 'fastai'
Traceback (most recent call last):
File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 110, in transform
self.validate_and_initialize(model_dir=model_dir)
File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 157, in validate_and_initialize
self._validate_user_module_and_set_functions()
File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 170, in _validate_user_module_and_set_functions
user_module = importlib.import_module(user_module_name)
File "/opt/conda/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 678, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/opt/ml/model/code/inference.py", line 2, in <module>
from fastai.basic_train import load_learner, DatasetType, Path
ModuleNotFoundError: No module named 'fastai'
Clearly the fastai module is not being installed. What is the cause of this, and what am I doing wrong?
To troubleshoot such issues, check the CloudWatch logs for the endpoint. The logs will show whether the requirements.txt was found and installed, and whether there were any dependency errors.
For packaging the model and your inference scripts, it's recommended to have two files:

- model.tar.gz, which has the model and the model files.
- sourcedir.tar.gz, which has the inference scripts.

Use the SageMaker environment variable SAGEMAKER_SUBMIT_DIRECTORY to point to the file location on S3 (s3://bucket/prefix/sourcedir.tar.gz), and set SAGEMAKER_PROGRAM to the entry-point file name, inference.py.

Note: when you use source_dir in PyTorchModel, the SDK will package the source_dir, upload it to S3, and define SAGEMAKER_SUBMIT_DIRECTORY for you.
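A minimal sketch of this two-artifact layout, using only the standard library (the file contents, bucket, and prefix are placeholder assumptions, not real values):

```python
import os
import tarfile
import tempfile

# Placeholder files standing in for the real model and scripts.
workdir = tempfile.mkdtemp()
open(os.path.join(workdir, "export.pkl"), "wb").close()
open(os.path.join(workdir, "inference.py"), "w").close()
open(os.path.join(workdir, "requirements.txt"), "w").close()

# model.tar.gz: model artifacts only.
with tarfile.open(os.path.join(workdir, "model.tar.gz"), "w:gz") as tar:
    tar.add(os.path.join(workdir, "export.pkl"), arcname="export.pkl")

# sourcedir.tar.gz: the inference script plus its requirements.
with tarfile.open(os.path.join(workdir, "sourcedir.tar.gz"), "w:gz") as tar:
    tar.add(os.path.join(workdir, "inference.py"), arcname="inference.py")
    tar.add(os.path.join(workdir, "requirements.txt"), arcname="requirements.txt")

# After uploading both archives to S3, the container is pointed at the code
# via these environment variables (bucket/prefix are illustrative):
env = {
    "SAGEMAKER_SUBMIT_DIRECTORY": "s3://bucket/prefix/sourcedir.tar.gz",
    "SAGEMAKER_PROGRAM": "inference.py",
}
print(env)
```

If you create the model without source_dir, a dictionary like env above can be passed through the env argument of PyTorchModel (inherited from sagemaker.model.Model) so the container knows where to fetch the code.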