Improve text reading from image
I am trying to read the credits from a movie. To build an MVP, I started with a single picture:
I use this code:
print(pytesseract.image_to_string(cv2.imread('frames/frame_144889.jpg')))
I tried different psm values, but it returns garbled text:
one Swimmer
Decay
Nurse
Aer
a
ig
coy
Coy
cor
ag
Or
Rr
Sa
Ae
Red
cod
Reng
OED Ty
Ryan Stunt Double
UST
er ey a er
Pm
JESSICA NAPIER
ALEX MALONE
Ey
DAMIEN STROUTHOS
JESSE ROWLES
DARIUS WILLIAMS
beamed
Aya
GEORGE HOUVARDAS
Sih
ata ARS Vara
BES liv4
MIKE DUNCAN
Pe
OV TN Ia
Ale Tate
SUV (aa: ae
SU aa
AIDEN GILLETT
MARK DUNCAN.
I tried other pictures with a higher resolution and got better results, but I wish to be able to handle non-HD movies as well.
What could I do to improve the precision of the reading?
Regards, Quentin
I very often achieve good results just by following this guideline for improving Tesseract accuracy: Tesseract - Improving the quality of the output
Important things to do are:
Here is the code:
import cv2
import pytesseract

# Adjust this path to your own Tesseract installation.
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'

img = cv2.imread('a.jpg')
grayImage = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Binarize and invert: bright text on a dark background becomes dark text on white.
(_, blackWhiteImage) = cv2.threshold(grayImage, 127, 255, cv2.THRESH_BINARY_INV)
# Add a white border so that no text touches the image edge.
blackWhiteImage = cv2.copyMakeBorder(src=blackWhiteImage, top=100, bottom=100, left=50, right=50, borderType=cv2.BORDER_CONSTANT, value=(255,255,255))
# Restrict recognition to letters (no space after '='); psm 6 assumes a single uniform block of text.
data = pytesseract.image_to_data(blackWhiteImage, config="-c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz --psm 6")
originalImage = cv2.cvtColor(blackWhiteImage, cv2.COLOR_GRAY2BGR)
text = []
for z, a in enumerate(data.splitlines()):
    if z != 0:  # skip the TSV header row
        a = a.split()
        if len(a) == 12:  # only rows that carry a recognized word have 12 fields
            x, y = int(a[6]), int(a[7])
            w, h = int(a[8]), int(a[9])
            # Draw the word's bounding box and the recognized text above it.
            cv2.rectangle(originalImage, (x, y), (x + w, y + h), (0, 255, 0), 1)
            cv2.putText(originalImage, a[11], (x, y - 2), cv2.FONT_HERSHEY_DUPLEX, 0.5, (0, 0, 255), 1)
            text.append(a[11])
print("Text result: \n", text)
cv2.imshow('Image result', originalImage)
cv2.waitKey(0)
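The loop above splits each row of the `image_to_data` output on whitespace and keeps rows with 12 fields. As a possible alternative, here is a minimal sketch that parses the same tab-separated output by column name and drops low-confidence words, which may help filter fragments like the ones in the question. The function name `parse_tesseract_tsv`, the `min_conf` cutoff and the sample rows are assumptions made for illustration, not part of the original answer.

```python
import csv
import io


def parse_tesseract_tsv(tsv_text, min_conf=0.0):
    """Return word dicts parsed from pytesseract.image_to_data() TSV text.

    Hypothetical helper: the name and the min_conf cutoff are illustrative.
    """
    # QUOTE_NONE because a recognized word may itself contain a quote character.
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        if not text:
            continue  # page/block/line rows carry no text
        conf = float(row["conf"])  # int-like or float-like depending on version
        if conf < min_conf:
            continue  # drop words the engine itself is unsure about
        box = (int(row["left"]), int(row["top"]),
               int(row["width"]), int(row["height"]))
        words.append({"text": text, "conf": conf, "box": box})
    return words


# Made-up sample in the column layout that image_to_data() emits.
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t50\t100\t90\t20\t95.5\tJESSICA\n"
    "5\t1\t1\t1\t1\t2\t150\t100\t80\t20\t91.0\tNAPIER\n"
    "5\t1\t1\t1\t2\t1\t50\t130\t30\t20\t12.3\tEy\n"
)

words = parse_tesseract_tsv(SAMPLE_TSV, min_conf=60)
print([w["text"] for w in words])
```

With real data, the string returned by `pytesseract.image_to_data(...)` would be passed in place of `SAMPLE_TSV`. The cutoff of 60 is only a starting guess and would need tuning per movie.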
And the image with the expected result: [image]