Image to text Python 3.6 error using tesseract and pytesseract

I'm trying to use the image_to_string function from pytesseract, but I can't get it to work. I've already installed the pytesseract module and the tesseract module, but the latter doesn't seem to work. I have the following code:

import argparse
import cv2
import os
import time
import sys
from PIL import Image
import pytesseract
A=Image.open("C:/Users/Martin/Python/Python36/Tickets/2.jpg")
pytesseract.image_to_string(A)

When I run this, I get the following error message:

Traceback (most recent call last):
  File "C:/Users/Martin/Python/Python36/cosa.py", line 9, in <module>
    pytesseract.image_to_string(A)
  File "C:\Users\Martin\Python\Python36\lib\site-packages\pytesseract\pytesseract.py", line 193, in image_to_string
    return run_and_get_output(image, 'txt', lang, config, nice)
  File "C:\Users\Martin\Python\Python36\lib\site-packages\pytesseract\pytesseract.py", line 140, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\Martin\Python\Python36\lib\site-packages\pytesseract\pytesseract.py", line 111, in run_tesseract
    proc = subprocess.Popen(command, stderr=subprocess.PIPE)
  File "C:\Users\Martin\Python\Python36\lib\subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "C:\Users\Martin\Python\Python36\lib\subprocess.py", line 997, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] El sistema no puede encontrar el archivo especificado

So I tried running import tesseract, and this shows up:

Traceback (most recent call last):
  File "<pyshell#53>", line 1, in <module>
    import tesseract
  File "C:\Users\Martin\Python\Python36\lib\site-packages\tesseract\__init__.py", line 34
    print 'Creating user config file: {}'.format(_config_file_usr)
                                    ^
SyntaxError: invalid syntax

I guess it's a compatibility problem (I'm using Python 3.6.5, where print is a function, so parentheses are expected), but when I run pip install --upgrade tesseract it reports that the package is already up to date, so I don't know how to make this work. I'm working on 64-bit Windows 7. Any help is greatly appreciated.

Tesseract is not installed on your system.

The tesseract package that you installed with pip is a different Python package, unrelated to the Tesseract OCR engine.
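To illustrate why the first traceback ends in FileNotFoundError: it shows pytesseract calling subprocess.Popen, which means the OCR engine is launched as an external program. The sketch below only reproduces that failure mode with the standard library. The helper name can_launch and the made-up program name are assumptions for illustration, not part of pytesseract.

```python
import subprocess
import sys


def can_launch(command):
    """Return True if the program named in command[0] can be started.

    subprocess.Popen raises FileNotFoundError when the executable does
    not exist or is not on PATH, the same exception type that appears
    at the bottom of the traceback in the question.
    """
    try:
        proc = subprocess.Popen(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.PIPE,
        )
    except FileNotFoundError:
        return False
    proc.communicate()  # wait for the child process to finish
    return True


# The running Python interpreter certainly exists, so it can be launched.
print(can_launch([sys.executable, "--version"]))

# A deliberately made-up program name stands in for a missing engine.
print(can_launch(["no-such-program-xyz", "--version"]))
```

Calling can_launch(["tesseract", "--version"]) on the affected machine would presumably return False until the engine itself is installed.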

You have to install Tesseract by following these instructions. Then you can use pytesseract.
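If the engine is installed but pytesseract still cannot find it, a possible approach is to locate the executable yourself and assign it to pytesseract.pytesseract.tesseract_cmd. The following is a minimal sketch under assumptions: the helper names find_executable and configure_pytesseract are hypothetical, and the Windows path in the usage comment is only a commonly seen install location, so adjust it to wherever Tesseract actually lives on your machine.

```python
import os
import shutil


def find_executable(name, candidates):
    """Return a path to the executable, or None if it cannot be found.

    PATH is searched first; after that, each explicit candidate path is
    checked in order and the first existing file is returned.
    """
    found = shutil.which(name)
    if found:
        return found
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract(candidates):
    """Point pytesseract at the Tesseract executable and return its path."""
    import pytesseract  # imported here so find_executable works without it

    exe = find_executable("tesseract", candidates)
    if exe is None:
        raise RuntimeError("Tesseract executable not found; install the engine first")
    pytesseract.pytesseract.tesseract_cmd = exe
    return exe


# Example usage (the path is an assumption, not taken from the question):
# configure_pytesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])
```

Searching PATH first means the explicit candidates are used only as a fallback, so the same script would keep working if the install directory is later added to PATH.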

I'm not entirely sure this solves your problem, since you're on Windows and the error isn't in English, but for others who arrive here from a search: if you encounter

pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path

the OCR engine needs to be installed separately from the Python package installed with pip:

sudo apt install tesseract-ocr

This installs it on your PATH.
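To confirm the install worked, you could ask the engine for its version from Python. This is a sketch under assumptions: the helper name tesseract_version is made up for illustration, and stderr is merged into stdout in case a given release prints its version banner there.

```python
import subprocess


def tesseract_version(command="tesseract"):
    """Return the first line printed by `<command> --version`.

    Returns None when the executable cannot be found, which is the
    situation the error message above describes.
    """
    try:
        result = subprocess.run(
            [command, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge streams; the banner may be on either
            universal_newlines=True,
        )
    except FileNotFoundError:
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


version = tesseract_version()
if version is None:
    print("tesseract was not found on PATH")
else:
    print("Found:", version)
```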
