Problem using Tesseract-OCR with Python

Question

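I'm new to programming and I'm trying to use Tesseract OCR to read the text in an image, but I can't get it to work! I have installed tesseract_OCR, pytesseract and Pillow in my environment. Does anyone have a tip?

Input: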

from PIL import Image
import pytesseract

print(pytesseract.image_to_string(Image.open('phrase.jpg')))

Output:

C:\Anaconda2\envs\ambiente36\python.exe C:/Users/Simone/Desktop/curso_programacao/Ler_imagens/ler_imagens

Traceback (most recent call last):
  File "C:\Anaconda2\envs\ambiente36\lib\site-packages\pytesseract\pytesseract.py", line 194, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Anaconda2\envs\ambiente36\lib\site-packages\pytesseract\pytesseract.py", line 165, in run_tesseract
    proc = subprocess.Popen(command, **subprocess_args())
  File "C:\Anaconda2\envs\ambiente36\lib\subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "C:\Anaconda2\envs\ambiente36\lib\subprocess.py", line 997, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] O sistema não pode encontrar o arquivo especificado

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/Simone/Desktop/curso_programacao/Ler_imagens/ler_imagens", line 6, in <module>
    phrase = pytesseract.image_to_string(Image.open('phrase.jpg'))
  File "C:\Anaconda2\envs\ambiente36\lib\site-packages\pytesseract\pytesseract.py", line 286, in image_to_string
    return run_and_get_output(image, 'txt', lang, config, nice)
  File "C:\Anaconda2\envs\ambiente36\lib\site-packages\pytesseract\pytesseract.py", line 201, in run_and_get_output
    raise TesseractNotFoundError()
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path

Answer 1

These are the steps you should follow to configure Tesseract in your environment.

First install Python and pip, then install Pillow and pytesseract.

from PIL import Image
import pytesseract

image_file = "FULL/PATH/TO/YOUR/IMAGE/image.png"

# Run OCR on an image that has already been opened with Pillow.
im = Image.open(image_file)
text = pytesseract.image_to_string(im)

# image_to_string also accepts the path to the image file directly.
text = pytesseract.image_to_string(image_file)

print("=====output=======\n")
print(text)

A complete example is available on the pytesseract download page.

Answer 2

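It looks like Tesseract is not installed correctly, or the path to tesseract does not point to where it is actually installed.

pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path

I suggest you first check your installation against the official documentation.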
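One quick way to check the installation from Python itself is to ask whether the tesseract executable is visible on PATH, which is what the error message is complaining about. The following is only a minimal sketch using the standard library; the helper names locate_tesseract and tesseract_version are made up for illustration and are not part of pytesseract.

```python
import shutil
import subprocess


def locate_tesseract(executable="tesseract"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


def tesseract_version(path):
    """Run `<path> --version` and return the first line of its output."""
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some builds print the version to stderr
        universal_newlines=True,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    path = locate_tesseract()
    if path is None:
        # This is the situation that makes pytesseract raise
        # TesseractNotFoundError when tesseract_cmd is left at its default.
        print("tesseract was not found on PATH")
    else:
        print("found:", path)
        print(tesseract_version(path))
```

If this prints that tesseract was not found, then either the Tesseract program itself is missing (pip only installs the Python wrapper) or its install folder has not been added to PATH.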

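I recently wrote a very simple guide to Tesseract. It should be enough to get you writing your first OCR script, and it clears up some of the hurdles I ran into where the documentation was not very clear.

If you want to take a look, here are the links: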

Answer 3

You need to install Tesseract using the Windows installer available here. Then install the Python wrapper with:

pip install pytesseract

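Then, after importing the pytesseract library, you should also set the tesseract path in your script as shown below (don't forget that the installation path may be different in your case!):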

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

Note: tested on Anaconda3 without any problems.
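Since the install path varies between machines, one option is to try a few likely locations instead of hard-coding a single one. This is just a sketch under assumptions: the entries in CANDIDATE_PATHS are guesses at common install folders (edit them to match your machine), and first_existing, find_tesseract and configure_pytesseract are hypothetical helper names, not pytesseract functions. Only the final assignment to pytesseract.pytesseract.tesseract_cmd is the actual pytesseract setting.

```python
import os
import shutil

# Hypothetical candidate locations; adjust them to match your installation.
CANDIDATE_PATHS = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
]


def first_existing(candidates):
    """Return the first path in candidates that is an existing file, else None."""
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def find_tesseract(candidates=CANDIDATE_PATHS):
    """Prefer an executable already on PATH, then fall back to the candidates."""
    return shutil.which("tesseract") or first_existing(candidates)


def configure_pytesseract(candidates=CANDIDATE_PATHS):
    """Point pytesseract at the executable located by find_tesseract()."""
    import pytesseract  # imported here so the helpers above work without it

    path = find_tesseract(candidates)
    if path is None:
        raise FileNotFoundError(
            "tesseract executable not found; check the installation path"
        )
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Call configure_pytesseract() once at the top of your script, before the first image_to_string call. If it raises FileNotFoundError, add your real install folder to CANDIDATE_PATHS.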
