Improving text reading from an image

Question

I am trying to read the credits from a movie. To make an MVP, I started with a single picture:

I use this code:

print(pytesseract.image_to_string(cv2.imread('frames/frame_144889.jpg')))

I tried different psm values, but it returns garbled text:

one Swimmer
Decay
Nurse
Aer
a
ig
coy
Coy
cor
ag
Or
Rr
Sa
Ae
Red
cod
Reng
OED Ty
Ryan Stunt Double
UST
er ey a er
Pm
JESSICA NAPIER
ALEX MALONE
Ey
DAMIEN STROUTHOS
JESSE ROWLES
DARIUS WILLIAMS
beamed
Aya
GEORGE HOUVARDAS
Sih
ata ARS Vara
BES liv4
MIKE DUNCAN
Pe
OV TN Ia
Ale Tate
SUV (aa: ae
SU aa
AIDEN GILLETT
MARK DUNCAN.

I tried other pictures with a higher resolution and got better results, but I wish to be able to handle non-HD movies as well.

What could I do to improve the accuracy of the reading?

Regards, Quentin

Answer 1

I very often achieve good results just by following this guideline for improving Tesseract accuracy: Tesseract - Improving the quality of the output

The important things to do are:

Use white for the background and black for the character color.
Select the desired Tesseract psm mode. In this case, use psm mode 6 to treat the image as a single uniform block of text.
Use the tessedit_char_whitelist config to specify only the characters that you are searching for. In this case, all lowercase and uppercase characters of the English alphabet.
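
As a small illustration of the last two points, the option string can be assembled from its parts. This is only a sketch: build_config is a hypothetical helper, not part of pytesseract, and it assumes the result is passed as the config argument of pytesseract.image_to_string or pytesseract.image_to_data. As far as I know, pytesseract splits the config string on whitespace, so there should be no space between tessedit_char_whitelist= and the characters.

```python
import string


def build_config(psm=6, whitelist=string.ascii_letters):
    """Return a Tesseract option string (hypothetical helper, not a pytesseract API)."""
    # The option string is tokenized on whitespace, so a space inside the
    # whitelist (or right after '=') would detach the characters from the key.
    if any(ch.isspace() for ch in whitelist):
        raise ValueError("whitelist must not contain whitespace")
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


config = build_config()
print(config)
```

The resulting string would then be used as config=config in the OCR call.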

Here is the code:

import cv2
import pytesseract

# Path to the Tesseract executable (Windows example; adjust or remove on other systems)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'

img = cv2.imread('a.jpg')
grayImage = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Threshold and invert, assuming light text on a dark background: the result is black text on white
(_, blackWhiteImage) = cv2.threshold(grayImage, 127, 255, cv2.THRESH_BINARY_INV)
# Add a white margin around the text
blackWhiteImage = cv2.copyMakeBorder(src=blackWhiteImage, top=100, bottom=100, left=50, right=50, borderType=cv2.BORDER_CONSTANT, value=(255,255,255))
# No space after '=', and the whitelist covers every uppercase and lowercase letter
data = pytesseract.image_to_data(blackWhiteImage, config="-c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz --psm 6")
# Convert back to BGR so that colored boxes and labels can be drawn
originalImage = cv2.cvtColor(blackWhiteImage, cv2.COLOR_GRAY2BGR)

text = []
for z, a in enumerate(data.splitlines()):
    if z != 0:  # skip the header row of the TSV output
        a = a.split()
        if len(a) == 12:  # only rows that contain a recognized word have 12 fields
            x, y = int(a[6]), int(a[7])
            w, h = int(a[8]), int(a[9])
            cv2.rectangle(originalImage, (x, y), (x + w, y + h), (0, 255, 0), 1)
            cv2.putText(originalImage, a[11], (x, y - 2), cv2.FONT_HERSHEY_DUPLEX, 0.5, (0, 0, 255), 1)
            text.append(a[11])

print("Text result: \n", text)
cv2.imshow('Image result', originalImage)
cv2.waitKey(0)

And the image with the expected result:
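
The parsing loop in the code above splits each row on whitespace and relies on rows having exactly 12 fields. As a possible alternative, here is a sketch that reads the tab-separated output of image_to_data by column name, drops low-confidence words, and joins the remaining words back into lines, which may suit credits where each line is a role or a name. The function name words_to_lines, the min_conf threshold of 60, and the sample TSV are assumptions made for illustration; the sample is hand-written, not real Tesseract output.

```python
import csv
import io
from collections import defaultdict


def words_to_lines(tsv, min_conf=60.0):
    """Group words from image_to_data TSV output into text lines."""
    reader = csv.DictReader(io.StringIO(tsv), delimiter="\t", quoting=csv.QUOTE_NONE)
    lines = defaultdict(list)
    for row in reader:
        word = (row.get("text") or "").strip()
        if not word:
            continue  # structural rows (page, block, line) carry no text
        try:
            conf = float(row["conf"])
        except (TypeError, ValueError):
            continue  # skip rows with a missing or malformed confidence
        if conf < min_conf:
            continue  # drop words Tesseract itself is unsure about
        key = (int(row["block_num"]), int(row["par_num"]), int(row["line_num"]))
        lines[key].append((int(row["word_num"]), word))
    # Emit lines in reading order, with words ordered by word_num
    return [" ".join(w for _, w in sorted(words)) for _, words in sorted(lines.items())]


# Hand-written sample in the 12-column layout of image_to_data (illustrative only)
SAMPLE_TSV = "\n".join([
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\tleft\ttop\twidth\theight\tconf\ttext",
    "4\t1\t1\t1\t1\t0\t60\t110\t180\t20\t-1\t",
    "5\t1\t1\t1\t1\t1\t60\t110\t90\t20\t95\tJESSICA",
    "5\t1\t1\t1\t1\t2\t160\t110\t80\t20\t93\tNAPIER",
    "5\t1\t1\t1\t2\t1\t60\t140\t50\t20\t21\tEy",
    "5\t1\t1\t1\t3\t1\t60\t170\t60\t20\t96\tALEX",
    "5\t1\t1\t1\t3\t2\t130\t170\t90\t20\t91\tMALONE",
])

print(words_to_lines(SAMPLE_TSV))  # ['JESSICA NAPIER', 'ALEX MALONE']
```

With real data, the input would be the string returned by pytesseract.image_to_data. The threshold is a trade-off: a higher value removes more noise such as the short fragments in the question's output, but it can also discard correctly read words.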
