
Tesseract OCR Python Error

I am trying to install tesseract, but even after following all the steps I still get an error.

Traceback (most recent call last):
  File "C:\Users\julian\Documents\Schikka\tesseract.py", line 3, in <module>
    print(pytesseract.image_to_string(Image.open('images/test.jpg')))
  File "C:\Users\julian\AppData\Local\Programs\Python\Python35\lib\site-packages\pytesseract\pytesseract.py", line 122, in image_to_string
    config=config)
  File "C:\Users\julian\AppData\Local\Programs\Python\Python35\lib\site-packages\pytesseract\pytesseract.py", line 46, in run_tesseract
    proc = subprocess.Popen(command, stderr=subprocess.PIPE)
  File "C:\Users\julian\AppData\Local\Programs\Python\Python35\lib\subprocess.py", line 676, in __init__
    restore_signals, start_new_session)
  File "C:\Users\julian\AppData\Local\Programs\Python\Python35\lib\subprocess.py", line 957, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2]  the system cannot find the file specified. 

I have also tried installing it on Linux, and I still get an error.

I currently have Python 3.6.

This is the code I tried:

from PIL import Image
import pytesseract
print(pytesseract.image_to_string(Image.open('images/test.jpg')))

Please help me.

It seems you either haven't installed tesseract or haven't configured the paths correctly. You can try the steps below.

  1. Download and install Tesseract 4.0 alpha for 64-bit Windows.
  2. Add C:\Program Files\Tesseract 4.0.0 to the Windows environment variable PATH.
  3. Set the Windows environment variable TESSDATA_PREFIX to C:\Program Files\Tesseract 4.0.0\tessdata.
  4. Type tesseract -v in a command prompt to verify that it works. It should print the Tesseract version information.

Note that the paths above are examples. Set them to match your actual installation path.
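The check in step 4 can also be done from Python before calling pytesseract. The following is a minimal sketch, not part of the original answer: the helper names locate_executable and describe_setup are made up for illustration, and the extra directory is the example path from the steps above, which may differ on your machine.

```python
import os
import shutil


def locate_executable(name="tesseract", extra_dirs=()):
    """Return the full path to `name`, or None if it cannot be found.

    Looks on PATH first, then in any extra directories supplied by the caller.
    """
    found = shutil.which(name)
    if found:
        return found
    for directory in extra_dirs:
        found = shutil.which(name, path=directory)
        if found:
            return found
    return None


def describe_setup(name="tesseract", extra_dirs=()):
    """Summarise whether the executable and TESSDATA_PREFIX look configured."""
    path = locate_executable(name, extra_dirs)
    if path is None:
        return f"{name} not found; check the installation and the PATH variable"
    prefix = os.environ.get("TESSDATA_PREFIX", "<not set>")
    return f"{name} found at {path}; TESSDATA_PREFIX={prefix}"


if __name__ == "__main__":
    # Example install directory taken from the steps above (an assumption).
    print(describe_setup(extra_dirs=[r"C:\Program Files\Tesseract 4.0.0"]))
```

If this reports that tesseract is not found, pytesseract will most likely fail with the same FileNotFoundError, because it launches the tesseract executable as a subprocess.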

Then you can modify your pytesseract call as shown below and try again.

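# `image` is a PIL image, e.g. Image.open('images/test.jpg').
# --psm 10 treats the image as a single character; the whitelist limits output to digits.
ocr = pytesseract.image_to_string(image, lang='eng',
        config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')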
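If editing PATH is not convenient, pytesseract can also be told where the executable lives through its tesseract_cmd setting. Below is a rough sketch under some assumptions: build_config and ocr_digits are hypothetical helper names, the image path and install path are placeholders, and Pillow and pytesseract are assumed to be installed.

```python
def build_config(psm=10, oem=3, whitelist=None):
    """Assemble the Tesseract command-line options passed via `config`."""
    parts = [f"--psm {psm}", f"--oem {oem}"]
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def ocr_digits(image_path, tesseract_cmd=None):
    """Run OCR on one image, optionally pointing pytesseract at the binary."""
    # Imported here so build_config can be used without these packages.
    from PIL import Image
    import pytesseract

    if tesseract_cmd:
        # Full path to the tesseract executable, used instead of a PATH lookup.
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(
            image, lang="eng", config=build_config(whitelist="0123456789")
        )


if __name__ == "__main__":
    print(build_config(whitelist="0123456789"))
    # Example call; both paths are placeholders for your own files:
    # ocr_digits("images/test.jpg",
    #            tesseract_cmd=r"C:\Program Files\Tesseract 4.0.0\tesseract.exe")
```

Setting tesseract_cmd only affects the current Python process, so it is handy for testing whether a path problem is the cause before changing system-wide environment variables.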

Hope this helps.
