
Error: tesseract is not installed or it's not in your PATH

I am new to pytesseract and OCR. From what I found online, these are the tools used to extract text from images, but I have no prior experience with them. Right now I am getting this error: tesseract is not installed or it's not in your PATH. See README file for more information.
I don't know how to resolve this. I tried various solutions that I found on the internet, but unfortunately none of them worked.

The error traceback:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
~/.local/lib/python3.9/site-packages/pytesseract/pytesseract.py in run_tesseract(input_filename, output_filename_base, extension, lang, config, nice, timeout)
    254     try:
--> 255         proc = subprocess.Popen(cmd_args, **subprocess_args())
    256     except OSError as e:

/opt/conda/lib/python3.9/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, user, group, extra_groups, encoding, errors, text, umask)
    950 
--> 951             self._execute_child(args, executable, preexec_fn, close_fds,
    952                                 pass_fds, cwd, env,

/opt/conda/lib/python3.9/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, gid, gids, uid, umask, start_new_session)
   1822                         err_msg = os.strerror(errno_num)
-> 1823                     raise child_exception_type(errno_num, err_msg, err_filename)
   1824                 raise child_exception_type(err_msg)

FileNotFoundError: [Errno 2] No such file or directory: 'tesseract'

During handling of the above exception, another exception occurred:

TesseractNotFoundError                    Traceback (most recent call last)
<ipython-input-7-96e86f1cd397> in <module>
      1 img = cv2.imread("Zähler NSHV KTL-Durchlaufanlage-1.jpg")
----> 2 data = pytesseract.image_to_string(img)
      3 print(data)
      4 # plt.imshow(img)

~/.local/lib/python3.9/site-packages/pytesseract/pytesseract.py in image_to_string(image, lang, config, nice, output_type, timeout)
    407     args = [image, 'txt', lang, config, nice, timeout]
    408 
--> 409     return {
    410         Output.BYTES: lambda: run_and_get_output(*(args + [True])),
    411         Output.DICT: lambda: {'text': run_and_get_output(*args)},

~/.local/lib/python3.9/site-packages/pytesseract/pytesseract.py in <lambda>()
    410         Output.BYTES: lambda: run_and_get_output(*(args + [True])),
    411         Output.DICT: lambda: {'text': run_and_get_output(*args)},
--> 412         Output.STRING: lambda: run_and_get_output(*args),
    413     }[output_type]()
    414 

~/.local/lib/python3.9/site-packages/pytesseract/pytesseract.py in run_and_get_output(image, extension, lang, config, nice, timeout, return_bytes)
    285         }
    286 
--> 287         run_tesseract(**kwargs)
    288         filename = kwargs['output_filename_base'] + extsep + extension
    289         with open(filename, 'rb') as output_file:

~/.local/lib/python3.9/site-packages/pytesseract/pytesseract.py in run_tesseract(input_filename, output_filename_base, extension, lang, config, nice, timeout)
    257         if e.errno != ENOENT:
    258             raise e
--> 259         raise TesseractNotFoundError()
    260 
    261     with timeout_manager(proc, timeout) as error_string:

TesseractNotFoundError: tesseract is not installed or it's not in your PATH. See README file for more information.

Corresponding code:

!pip install tesseract
import pytesseract
import cv2
from PIL import Image
import matplotlib.pyplot as plt
img = cv2.imread("meter.jpg")
data = pytesseract.image_to_string(img)
print(data)
# plt.imshow(img)

I should mention that I am using JupyterHub, through an account on my university's JupyterHub server. I also read online that the problem can be resolved from 'cmd'. If so, please explain how to do that, or tell me whether I have to contact the university admin to solve this problem. Any help is appreciated!

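A likely cause of this error is that you installed pytesseract with pip without installing the Tesseract binary itself. pytesseract is only a Python wrapper that launches the tesseract executable, which is why the traceback ends in "No such file or directory: 'tesseract'". If that is the case, you can install the binary as follows: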

On Linux (Debian/Ubuntu):

sudo apt update
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev

On Windows: download and run the Tesseract installer, then set the path to the binary in your code:

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe'

On Mac:

brew install tesseract
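
Before or after installing, it can help to check from inside the notebook whether the binary is actually visible to Python. The following is a minimal sketch using only the standard library; the helper name tesseract_status is made up for illustration and is not part of pytesseract.

```python
import shutil
import subprocess


def tesseract_status(binary="tesseract"):
    """Return (path, version_line) for the binary, or (None, None) if it is not on PATH.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    path = shutil.which(binary)  # same PATH lookup that launching the program relies on
    if path is None:
        return None, None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some Tesseract builds print the version to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    first_line = output.splitlines()[0] if output else ""
    return path, first_line


path, version = tesseract_status()
if path is None:
    print("tesseract is not on PATH for this Python process")
else:
    print(f"found {path}: {version}")
```

If this reports that nothing was found on a shared JupyterHub server, the binary most likely has to be installed on the server itself, which usually means asking the admin, since the apt commands above need sudo.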

On Windows, if Tesseract was installed for the current user only, the binary is in the user folder instead, e.g. C:\Users\<User.Name>\AppData\Local\Tesseract-OCR\tesseract.exe

Using that path in the code works fine:

pytesseract.pytesseract.tesseract_cmd = r'C:\Users\John.Doe\AppData\Local\Tesseract-OCR\tesseract.exe'
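
Since the install location differs between a system-wide and a per-user install, one option is to try PATH first and then a short list of candidate paths. This is only a sketch: resolve_tesseract and WINDOWS_CANDIDATES are hypothetical names, and using %LOCALAPPDATA% for the per-user folder assumes it points at C:\Users\<User.Name>\AppData\Local, which is the usual default.

```python
import os
import shutil

# Candidate locations taken from the two answers above; adjust to your install.
# %LOCALAPPDATA% is an assumption standing in for C:\Users\<User.Name>\AppData\Local.
WINDOWS_CANDIDATES = [
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
    r"%LOCALAPPDATA%\Tesseract-OCR\tesseract.exe",
]


def resolve_tesseract(candidates=(), binary="tesseract"):
    """Return the first usable path to the binary, or None if none is found.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    found = shutil.which(binary)  # prefer whatever is already on PATH
    if found:
        return found
    for candidate in candidates:
        expanded = os.path.expandvars(candidate)  # expands %VAR% on Windows
        if os.path.isfile(expanded):
            return expanded
    return None


tesseract_path = resolve_tesseract(WINDOWS_CANDIDATES)
print(tesseract_path or "tesseract binary not found")
```

If a path is returned, assign it to pytesseract.pytesseract.tesseract_cmd before calling pytesseract.image_to_string.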
