
Unable to extract text from image

I've been working on a project where I use Tesseract to extract text from an image. I'm using Python 3.7.7, but I'm getting an error that I can't solve.

import pytesseract as tess
from PIL import Image

tess.pytesseract.tesseract_cmd = r'C:\\Program Files (x86)\\Tesseract-OCR\\tess1\\eng.traineddata'

img = Image.open('C:\\Users\\USER\\PycharmProjects\\selenium\\automation\\screenshot.png')
text = tess.image_to_string(img, lang='eng')

When I run this, I get the following error:

Traceback (most recent call last):
  File "C:/Users/USER/PycharmProjects/selenium/automation/open.py", line 8, in <module>
    text = tess.image_to_string(img, lang='eng')
  File "C:\Users\USER\PycharmProjects\selenium\venv\lib\site-packages\pytesseract\pytesseract.py", line 360, in image_to_string
    }[output_type]()
  File "C:\Users\USER\PycharmProjects\selenium\venv\lib\site-packages\pytesseract\pytesseract.py", line 359, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "C:\Users\USER\PycharmProjects\selenium\venv\lib\site-packages\pytesseract\pytesseract.py", line 270, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\USER\PycharmProjects\selenium\venv\lib\site-packages\pytesseract\pytesseract.py", line 241, in run_tesseract
    raise e
  File "C:\Users\USER\PycharmProjects\selenium\venv\lib\site-packages\pytesseract\pytesseract.py", line 238, in run_tesseract
    proc = subprocess.Popen(cmd_args, **subprocess_args())
  File "C:\Python37\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "C:\Python37\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)
OSError: [WinError 193] %1 is not a valid Win32 application

Please provide a suitable solution.

You need to have the Tesseract software installed on your system in order to use pytesseract. pytesseract is just a library that calls the Tesseract OCR engine internally.
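Because pytesseract launches Tesseract as a separate program, `tesseract_cmd` has to name the Tesseract executable itself. In the question it points at `eng.traineddata`, a language data file, which is consistent with the `WinError 193` raised from `subprocess.Popen`. The following is a minimal sketch of a corrected setup, not a definitive fix: the helper names `is_plausible_tesseract_cmd` and `extract_text` are hypothetical, the `.exe` check assumes Windows, and the actual install location on your machine may differ.

```python
from pathlib import PureWindowsPath


def is_plausible_tesseract_cmd(path: str) -> bool:
    """Return True if the path names an .exe file rather than, say, a data file.

    Hypothetical helper for illustration; it only inspects the file extension.
    """
    return PureWindowsPath(path).suffix.lower() == ".exe"


def extract_text(image_path: str, tesseract_cmd: str, lang: str = "eng") -> str:
    """Run OCR on one image using the Tesseract executable at tesseract_cmd."""
    # Reject obviously wrong paths (e.g. eng.traineddata) before launching anything.
    if not is_plausible_tesseract_cmd(tesseract_cmd):
        raise ValueError(
            f"tesseract_cmd should point to tesseract.exe, got: {tesseract_cmd}"
        )

    # Imported here so the path check above works even without these packages.
    import pytesseract
    from PIL import Image

    # Tell pytesseract which executable to launch.
    pytesseract.pytesseract.tesseract_cmd = tesseract_cmd
    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=lang)
```

A call might look like `extract_text(r'C:\Users\USER\PycharmProjects\selenium\automation\screenshot.png', r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe')`, where the second argument is an assumed install location that should be replaced with wherever `tesseract.exe` actually lives.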

Tesseract Installation

For Windows

Adding the path to the PATH variable
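Once the Tesseract install directory is on PATH, pytesseract can usually find the executable without `tesseract_cmd` being set at all. As a rough way to confirm this from Python, the sketch below uses the standard library's `shutil.which`; the function name `find_tesseract` is a hypothetical helper, not part of pytesseract.

```python
import shutil
from typing import Optional


def find_tesseract(name: str = "tesseract") -> Optional[str]:
    """Return the full path of the executable found on PATH, or None if absent."""
    return shutil.which(name)


path = find_tesseract()
if path is None:
    print("tesseract was not found on PATH; set tesseract_cmd explicitly instead")
else:
    print(f"tesseract found at: {path}")
```

If this prints that Tesseract was not found right after you edited PATH, restarting the terminal or IDE typically helps, since already-running processes keep their old environment.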
