How to hide the console window when I run tesseract with pytesseract with CREATE_NO_WINDOW

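I am using tesseract to perform OCR on screengrabs. My app uses a tkinter window and calls self.after in the initialization of my class to scrape images continuously and update labels and other values in the tkinter window. I have searched for several days and cannot find a specific example of how to use CREATE_NO_WINDOW with Python 3.6 on Windows when calling tesseract through pytesseract.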

This is related to this question:

How can I hide the console window when I run tesseract with pytesser

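I have only been programming in Python for 2 weeks and do not understand how to perform the steps described in the question above. I opened the pytesseract.py file and found the line proc = subprocess.Popen(command, stderr=subprocess.PIPE), but when I tried editing it I got a bunch of errors that I could not figure out.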

#!/usr/bin/env python

'''
Python-tesseract. For more information: https://github.com/madmaze/pytesseract

'''

try:
    import Image
except ImportError:
    from PIL import Image

import os
import sys
import subprocess
import tempfile
import shlex


# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
tesseract_cmd = 'tesseract'

__all__ = ['image_to_string']


def run_tesseract(input_filename, output_filename_base, lang=None, boxes=False,
                  config=None):
    '''
    runs the command:
        `tesseract_cmd` `input_filename` `output_filename_base`

    returns the exit status of tesseract, as well as tesseract's stderr output

    '''
    command = [tesseract_cmd, input_filename, output_filename_base]

    if lang is not None:
        command += ['-l', lang]

    if boxes:
        command += ['batch.nochop', 'makebox']

    if config:
        command += shlex.split(config)

    proc = subprocess.Popen(command, stderr=subprocess.PIPE)
    status = proc.wait()
    error_string = proc.stderr.read()
    proc.stderr.close()
    return status, error_string


def cleanup(filename):
    ''' tries to remove the given filename. Ignores non-existent files '''
    try:
        os.remove(filename)
    except OSError:
        pass


def get_errors(error_string):
    '''
    returns all lines in the error_string that start with the string "error"

    '''

    error_string = error_string.decode('utf-8')
    lines = error_string.splitlines()
    error_lines = tuple(line for line in lines if line.find(u'Error') >= 0)
    if len(error_lines) > 0:
        return u'\n'.join(error_lines)
    else:
        return error_string.strip()


def tempnam():
    ''' returns a temporary file-name '''
    tmpfile = tempfile.NamedTemporaryFile(prefix="tess_")
    return tmpfile.name


class TesseractError(Exception):
    def __init__(self, status, message):
        self.status = status
        self.message = message
        self.args = (status, message)


def image_to_string(image, lang=None, boxes=False, config=None):
    '''
    Runs tesseract on the specified image. First, the image is written to disk,
    and then the tesseract command is run on the image. Tesseract's result is
    read, and the temporary files are erased.

    Also supports boxes and config:

    if boxes=True
        "batch.nochop makebox" gets added to the tesseract call

    if config is set, the config gets appended to the command.
        ex: config="-psm 6"
    '''

    if len(image.split()) == 4:
        # In case we have 4 channels, lets discard the Alpha.
        # Kind of a hack, should fix in the future some time.
        r, g, b, a = image.split()
        image = Image.merge("RGB", (r, g, b))

    input_file_name = '%s.bmp' % tempnam()
    output_file_name_base = tempnam()
    if not boxes:
        output_file_name = '%s.txt' % output_file_name_base
    else:
        output_file_name = '%s.box' % output_file_name_base
    try:
        image.save(input_file_name)
        status, error_string = run_tesseract(input_file_name,
                                             output_file_name_base,
                                             lang=lang,
                                             boxes=boxes,
                                             config=config)
        if status:
            errors = get_errors(error_string)
            raise TesseractError(status, errors)
        f = open(output_file_name, 'rb')
        try:
            return f.read().decode('utf-8').strip()
        finally:
            f.close()
    finally:
        cleanup(input_file_name)
        cleanup(output_file_name)


def main():
    if len(sys.argv) == 2:
        filename = sys.argv[1]
        try:
            image = Image.open(filename)
            if len(image.split()) == 4:
                # In case we have 4 channels, lets discard the Alpha.
                # Kind of a hack, should fix in the future some time.
                r, g, b, a = image.split()
                image = Image.merge("RGB", (r, g, b))
        except IOError:
            sys.stderr.write('ERROR: Could not open file "%s"\n' % filename)
            exit(1)
        print(image_to_string(image))
    elif len(sys.argv) == 4 and sys.argv[1] == '-l':
        lang = sys.argv[2]
        filename = sys.argv[3]
        try:
            image = Image.open(filename)
        except IOError:
            sys.stderr.write('ERROR: Could not open file "%s"\n' % filename)
            exit(1)
        print(image_to_string(image, lang=lang))
    else:
        sys.stderr.write('Usage: python pytesseract.py [-l lang] input_file\n')
        exit(2)


if __name__ == '__main__':
    main()

The code I am using is similar to the example in the related question:

def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)
    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # Write image after removed noise
    cv2.imwrite(src_path + "removed_noise.png", img)
    #  Apply threshold to get image with only black and white
    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)
    # Recognize text with tesseract for python

    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))

    return result

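When execution reaches the following line, a black console window flashes for less than a second and then closes once the command has run.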

result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))

Here is a picture of the console window:

[Image: Program Files (x86)_Tesseract]

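Here is what was suggested in the other question:

You're currently working in IDLE, in which case I don't think it really matters if a console window pops up. If you're planning to develop a GUI app with this library, then you'll need to modify the subprocess.Popen call in pytesser.py to hide the console. I'd first try the CREATE_NO_WINDOW process creation flag. - eryksun

I would greatly appreciate any help with modifying the subprocess.Popen call in the pytesseract.py library file to use CREATE_NO_WINDOW. I am also not sure what the difference is between the pytesseract.py and pytesser.py library files. I would leave a comment on the other question to ask for clarification, but I cannot until I have more reputation on this site.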

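I did more research and decided to learn more about subprocess.Popen:

Documentation for subprocess

I also referenced the following article:

using python subprocess.popen..can't prevent exe stopped working prompt

I changed the original line of code in pytesseract.py: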

proc = subprocess.Popen(command, stderr=subprocess.PIPE)

to the following:

proc = subprocess.Popen(command, stderr=subprocess.PIPE, creationflags = CREATE_NO_WINDOW)

I ran the code and got the following error:

Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\tkinter\__init__.py", line 1699, in __call__
    return self.func(*args)
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 403, in gather_data
    update_cash_button()
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 208, in update_cash_button
    currentCash = get_string(src_path + "cash.png")
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 150, in get_string
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 125, in image_to_string
    config=config)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 49, in run_tesseract
    proc = subprocess.Popen(command, stderr=subprocess.PIPE, creationflags = CREATE_NO_WINDOW)
NameError: name 'CREATE_NO_WINDOW' is not defined

I then defined the CREATE_NO_WINDOW variable in pytesseract.py:

#Assignment of the value of CREATE_NO_WINDOW
CREATE_NO_WINDOW = 0x08000000

I got the value 0x08000000 from the article linked above. After adding the definition I ran the application, and the console window no longer pops up.
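For readers who want to try the same fix outside the library file, here is a minimal sketch of a Popen call that passes the flag only on Windows. The helper names `no_window_kwargs` and `run_quietly` are hypothetical and are not part of pytesseract. The sketch assumes the command is a list, as in run_tesseract. From Python 3.7 onward the subprocess module exposes CREATE_NO_WINDOW itself on Windows, so the hard-coded value is only a fallback for 3.6.

```python
import subprocess
import sys

# subprocess.CREATE_NO_WINDOW exists on Windows from Python 3.7 onward.
# On Python 3.6 (or on other platforms) fall back to the raw Win32 value.
CREATE_NO_WINDOW = getattr(subprocess, "CREATE_NO_WINDOW", 0x08000000)


def no_window_kwargs():
    """Return extra Popen keyword arguments (hypothetical helper).

    creationflags is only passed on Windows; elsewhere the dict is empty
    so the same call works unchanged.
    """
    if sys.platform == "win32":
        return {"creationflags": CREATE_NO_WINDOW}
    return {}


def run_quietly(command):
    """Run command without a console window; return (exit status, stderr bytes)."""
    proc = subprocess.Popen(command, stderr=subprocess.PIPE, **no_window_kwargs())
    # communicate() reads stderr while waiting, which avoids the deadlock
    # that wait() followed by read() can hit when the pipe buffer fills.
    _, error_bytes = proc.communicate()
    return proc.returncode, error_bytes
```

In pytesseract.py the equivalent change would be adding `**no_window_kwargs()` to the existing Popen call in run_tesseract. Whether a given pytesseract release already handles this is not something the question establishes, so check the installed version before patching.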
