
Pytesseract on AWS Lambda: tesseract is not installed or it's not in your PATH

AWS Lambda runtime: Python 3.8

Layer ARNs tested:

  • arn:aws:lambda:eu-west-3:770693421928:layer:Klayers-python38-pytesseract:15
  • arn:aws:lambda:eu-west-3:770693421928:layer:Klayers-python38-pytesseract:16
  • arn:aws:lambda:eu-west-3:770693421928:layer:Klayers-python38-pytesseract:17

AWS Lambda runtime: Python 3.7

Layer ARNs tested:

  • arn:aws:lambda:eu-west-3:113088814899:layer:Klayers-python37-pytesseract:13

lambda_function.py:

import json
import pytesseract
from PIL import Image


def lambda_handler(event, context):
    try:
        body = {
            "text": pytesseract.image_to_string(Image.open('random_text.png')),
        }
    except Exception as e:
        body = str(e)

    response = {
        "statusCode": 200,
        "body": json.dumps(body)
    }

    return response

There is no error when I import everything; the error occurs when I actually call pytesseract.image_to_string().

Because the imports work on Python 3.7/3.8 with the corresponding ARNs, I assume the error is specific to pytesseract itself. But why isn't it handled by the ARN? I have seen other posts pointing to tutorials, but I always get stuck on this exact error: tesseract is not installed or it's not in your PATH

But why isn't it handled by the ARN?

The layer only provides pytesseract, the Python wrapper around the actual tesseract binary. It does not include the tesseract program itself.
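The difference between the wrapper and the binary can be checked from inside the function. The following is a minimal diagnostic sketch, not part of the original question: the helper name diagnose is hypothetical, and it only uses the standard library to report whether the Python package is importable and whether an executable named tesseract is on PATH.

```python
import importlib.util
import shutil


def diagnose(package="pytesseract", binary="tesseract"):
    """Report separately on the Python wrapper and the native executable.

    `diagnose` is a hypothetical helper for illustration only.
    """
    return {
        # True when the Python package can be imported (e.g. provided by a layer).
        "python_package": importlib.util.find_spec(package) is not None,
        # Full path to the executable if it is on PATH, otherwise None.
        "binary": shutil.which(binary),
    }


if __name__ == "__main__":
    print(diagnose())
```

In the situation described in the question, one would expect the package to be reported as available while the binary entry is None, which matches the "not installed or it's not in your PATH" message.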

So you have to build the tesseract binary yourself for the Lambda environment and bundle it with your Lambda function. One way to do this is shown here.
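Once a binary is bundled, pytesseract still has to be told where it is unless it already sits on PATH. Below is a hedged sketch of that step. The candidate paths are assumptions: Lambda unpacks layers under /opt and function code under /var/task, but the bin/tesseract layout depends on how you package the build. The helper names find_tesseract and configure_pytesseract are hypothetical; pytesseract.pytesseract.tesseract_cmd is the wrapper's setting for the executable path.

```python
import os
import shutil

# Assumed locations; adjust to wherever your layer or package puts the binary.
CANDIDATE_PATHS = ["/opt/bin/tesseract", "/var/task/bin/tesseract"]


def find_tesseract(candidates=CANDIDATE_PATHS):
    """Return the first executable tesseract path found, or None."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    # Fall back to a PATH lookup; returns None if nothing is found there.
    return shutil.which("tesseract")


def configure_pytesseract():
    """Point pytesseract at the bundled binary, or fail with a clear message."""
    path = find_tesseract()
    if path is None:
        raise RuntimeError(
            "tesseract binary not found; bundle it in a layer or the deployment package"
        )
    import pytesseract  # imported lazily so the path lookup works without it

    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Calling configure_pytesseract() once at module level, before lambda_handler runs, would apply the setting on cold start. The tesseract binary also needs its language data files; if they are not in its default location, the TESSDATA_PREFIX environment variable can point to them, and that path again depends on your packaging.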
