I have trained a fastai (v1) model and exported it as a .pkl file. Now I want to deploy this model for inference in Amazon SageMaker.
I am following the SageMaker documentation for deploying a PyTorch model: https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#write-an-inference-script
Steps taken:

Folder structure:

    Sagemaker/
        export.pkl
        code/
            inference.py
            requirement.txt

requirement.txt:

    spacy==2.3.4
    torch==1.4.0
    torchvision==0.5.0
    fastai==1.0.60
    numpy
Command I used to create the archive:

    cd Sagemaker/
    tar -czvf /tmp/model.tar.gz ./export.pkl ./code
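As a sanity check, the packaging step above can be sketched with Python's standard tarfile module. The paths here are illustrative stand-ins, not the real project files; note that the SageMaker PyTorch serving container looks for a file named requirements.txt (plural) inside code/, so that spelling is used in the sketch:

```python
import os
import tarfile
import tempfile

# Recreate the layout with placeholder (empty) files for illustration.
workdir = tempfile.mkdtemp()
os.makedirs(os.path.join(workdir, "code"))
open(os.path.join(workdir, "export.pkl"), "wb").close()
open(os.path.join(workdir, "code", "inference.py"), "w").close()
# The serving container installs code/requirements.txt (plural);
# a file named requirement.txt is likely to be ignored.
open(os.path.join(workdir, "code", "requirements.txt"), "w").close()

archive = os.path.join(workdir, "model.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    # export.pkl must sit at the archive root, with code/ next to it,
    # mirroring the tar command above.
    tar.add(os.path.join(workdir, "export.pkl"), arcname="export.pkl")
    tar.add(os.path.join(workdir, "code"), arcname="code")

with tarfile.open(archive, "r:gz") as tar:
    names = tar.getnames()
print(names)
```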
This generates a model.tar.gz file, which I upload to an S3 bucket.
To deploy it, I used the SageMaker Python SDK:
    from sagemaker.pytorch import PyTorchModel

    role = "sagemaker-role-arn"
    model_path = "s3 key for the model.tar.gz file that I created above"
    pytorch_model = PyTorchModel(model_data=model_path,
                                 role=role,
                                 entry_point='inference.py',
                                 framework_version="1.4.0",
                                 py_version="py3")
    predictor = pytorch_model.deploy(instance_type='ml.c5.large', initial_instance_count=1)
After executing the above code, I can see that the model is created and deployed in SageMaker, but I get an error when running inference:
botocore.errorfactory.ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from primary with message "No module named 'fastai'
Traceback (most recent call last):
File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 110, in transform
self.validate_and_initialize(model_dir=model_dir)
File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 157, in validate_and_initialize
self._validate_user_module_and_set_functions()
File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 170, in _validate_user_module_and_set_functions
user_module = importlib.import_module(user_module_name)
File "/opt/conda/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 678, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/opt/ml/model/code/inference.py", line 2, in <module>
from fastai.basic_train import load_learner, DatasetType, Path
ModuleNotFoundError: No module named 'fastai'
Clearly the fastai module is not being installed. What is the cause of this, and what am I doing wrong?
To troubleshoot such issues, check the CloudWatch logs for the endpoint. The logs will show whether the requirements.txt was found and installed, and whether there were any dependency errors.
For packaging the model and your inference scripts, it's recommended to have two files:

- model.tar.gz, which has the model and the model files.
- sourcedir.tar.gz, which has the inference scripts.

Use the SageMaker environment variable SAGEMAKER_SUBMIT_DIRECTORY to point to the file location on S3 (s3://bucket/prefix/sourcedir.tar.gz), and set SAGEMAKER_PROGRAM to the entry-point file name, inference.py.

Note: when you use source_dir in PyTorchModel, the SDK will package the source_dir, upload it to S3, and define SAGEMAKER_SUBMIT_DIRECTORY for you.
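A minimal sketch of this two-artifact layout, using only the standard library (the file contents, bucket, and prefix are placeholder assumptions, not real values):

```python
import os
import tarfile
import tempfile

# Placeholder files standing in for the real model and scripts.
workdir = tempfile.mkdtemp()
open(os.path.join(workdir, "export.pkl"), "wb").close()
open(os.path.join(workdir, "inference.py"), "w").close()
open(os.path.join(workdir, "requirements.txt"), "w").close()

# model.tar.gz: model artifacts only.
with tarfile.open(os.path.join(workdir, "model.tar.gz"), "w:gz") as tar:
    tar.add(os.path.join(workdir, "export.pkl"), arcname="export.pkl")

# sourcedir.tar.gz: the inference script plus its requirements.
with tarfile.open(os.path.join(workdir, "sourcedir.tar.gz"), "w:gz") as tar:
    tar.add(os.path.join(workdir, "inference.py"), arcname="inference.py")
    tar.add(os.path.join(workdir, "requirements.txt"), arcname="requirements.txt")

# After uploading both archives to S3, the container is pointed at the code
# via these environment variables (bucket/prefix are illustrative):
env = {
    "SAGEMAKER_SUBMIT_DIRECTORY": "s3://bucket/prefix/sourcedir.tar.gz",
    "SAGEMAKER_PROGRAM": "inference.py",
}
print(env)
```

If you create the model without source_dir, a dictionary like env above can be passed through the env argument of PyTorchModel (inherited from sagemaker.model.Model) so the container knows where to fetch the code.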