Python error when importing image_to_string from tesseract

I recently started using Tesseract OCR with Python, and I keep getting an error when I try to import image_to_string from tesseract.

Code causing the problem:

# Perform OCR using tesseract-ocr library
from tesseract import image_to_string
image = Image.open('input-NEAREST.tif')
print image_to_string(image)

Error caused by the above code:

Traceback (most recent call last):
  File "./captcha.py", line 52, in <module>
    from tesseract import image_to_string
ImportError: cannot import name image_to_string

I've verified that the tesseract module is installed:

digital_alchemy@roaming-gnome /home $ pydoc modules | grep 'tesseract'
Hdf5StubImagePlugin _tesseract          gzip                sipconfig
ORBit               cairo               mako                tesseract

I believe I've installed all the required packages, but I'm stuck at this point. It appears that the function is not in the module.

Any help is greatly appreciated.

Another possibility that seems to have worked for me is to modify pytesseract so that it uses from PIL import Image instead of import Image.

Code that works in PyCharm after modifying pytesseract:

from pytesseract import image_to_string
from PIL import Image

im = Image.open(r'C:\Users\<user>\Downloads\dashboard-test.jpeg')
print(im)

print(image_to_string(im))

I installed pytesseract through the package manager built into PyCharm.

Is your syntax correct for the module you have installed? That image_to_string function looks like it is from PyTesser, per the usage example on this page: https://code.google.com/p/pytesser/

Your import looks like it is for python-tesseract, which has a more complicated usage example listed: https://code.google.com/p/python-tesseract/
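One way to check which installed module actually provides image_to_string is to try importing each candidate and look for the attribute. The following is only a rough sketch: the helper name modules_exposing is made up for illustration, and the module names are simply the ones mentioned in this thread, any of which may be missing on a given machine.

```python
import importlib


def modules_exposing(attr, module_names):
    """Return the names from module_names that import cleanly and define attr."""
    found = []
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Not installed, or it failed while importing (for example an old
            # Python 2 only package); either way it cannot provide the name.
            continue
        if hasattr(module, attr):
            found.append(name)
    return found


# Module names taken from this thread; the result depends on what is installed.
print(modules_exposing("image_to_string", ["tesseract", "pytesseract", "pytesser"]))
```

If tesseract is not in the printed list, the ImportError from the question is expected, and the import should target whichever module is listed instead.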

On Windows, I followed the steps below:

pip3 install pytesseract 
pip3 install pillow

Installing tesseract-ocr is also required (https://github.com/tesseract-ocr/tesseract/wiki); otherwise you will get an error saying that tesseract is not on your path.

Python code:

from PIL import Image
from pytesseract import image_to_string

print(image_to_string(Image.open('test.tif'), lang='eng'))
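The "not on path" error mentioned above can be checked for before running any OCR. This is a minimal sketch using only the standard library; the function name tesseract_on_path is hypothetical, and it assumes the executable is called tesseract (tesseract.exe on Windows is resolved automatically).

```python
import shutil


def tesseract_on_path(executable="tesseract"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


location = tesseract_on_path()
if location is None:
    print("tesseract was not found on PATH; install Tesseract-OCR or add its folder to PATH")
else:
    print("tesseract found at", location)
```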

What works for me:

After installing Tesseract from tesseract-ocr-setup-3.05.02-20180621.exe, I added the line pytesseract.pytesseract.tesseract_cmd="C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe" and used the code from the answer above. This is all the code:

import pytesseract
from PIL import Image

# Point pytesseract at the installed Tesseract executable
pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe"
im = Image.open("C:\\Users\\<user>\\Desktop\\ro\\capt.png")
print(pytesseract.image_to_string(im, lang='eng'))

I am using Windows 10 with PyCharm Community Edition 2018.2.3 x64.
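Since the install folder can differ between machines, the hard-coded path could be replaced by a small lookup over a few likely locations. This is only a sketch: first_existing is a made-up helper, and the second candidate path is an assumption about where a 64-bit installer might place the files, not something confirmed in this thread.

```python
import os

# Candidate install locations. The first is the path used in the answer above;
# the second is a guess and may not exist on your machine.
CANDIDATE_PATHS = [
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
]


def first_existing(paths):
    """Return the first path that points to an existing file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


tesseract_path = first_existing(CANDIDATE_PATHS)
print(tesseract_path)
```

If the result is not None, it could be assigned to pytesseract.pytesseract.tesseract_cmd in place of the literal string used above.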
