
Text blur after thresholding using OpenCV

I am preprocessing an image so that Tesseract OCR can read its text, but after I apply a threshold the text comes out blurry. I would appreciate some help.

This is my code:

import cv2
import pytesseract as pyt
import numpy as np

pyt.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

image = cv2.imread('vacunacion.jpg')
gris = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gris, 90, 255, cv2.THRESH_BINARY_INV)[1]
imagen_detalle = pyt.image_to_string(thresh, lang='eng',config='--psm 6')
print(imagen_detalle)
cv2.imshow('thresh', thresh)
cv2.imwrite('thresh.jpg', thresh)
cv2.waitKey()

My output:

2
> PLAN NACIONAL DE VACUNACION
CONTRA COVID-19 Hooia>—
| DOSIS APLICADAS: 191.480
CORTE 4:00 P.M. MARZO - 03 - 2021
SE AMDRES ay
“a 7 women {A GUNIRA (1345)
twas jf Ae d &
paunaanma Goer anda 760 — 2) ©? maaan
CARTABENA (3.457) BOLI (2500) — OESAN 018)
SURES) a ‘3 NORTE DE SANTANDER (25131
CORDOBA (2968) wo 6 SAATANDER (5.936)
ANTIONUIA (24.8563 TS - BOVAGA (5883),
CUNDALAMARCA (12.0251 TY ‘ ARAUCA C8)
SALDAS (23151 EN, f PS yionann 2
RAMLDA C320 wey Pee casa 1227
oan oe ts
Touma 5.080) BK eae 8 WETA (2444)
ALLE DEL cqUGA (20,160) ~=% se GUAINLA 592)
“m8 Se aN +S UAE (3023
‘ante ca o eco yqurtsig2an
puna 2635) ae SO BARUETAISES!
a
rum fe ‘AMAZDAAS (10548)
Oa
Fuente: Ministerio de Salud y Protecciin Saciat - Datos procesados 03 de marzo - 2021

And this is the final image: [image]

The number of applied doses is hard to read in the image because it is so blurry. Is there any technique that can be applied here?

This is the original image: [image]

Tesseract can use the gradients around text as part of its detection, so I'd suggest avoiding thresholding where possible, since it removes those gradients (the anti-aliasing, as mentioned by fmw42) from the image.

Instead, I'd suggest inverting the image after you convert it to grayscale. If necessary, you can then reduce the brightness to make the greyer text a bit blacker, and increase the contrast to make the grey background a bit whiter. If you do need to adjust the brightness and/or contrast, I'd suggest using cv2.convertScaleAbs to do so efficiently and avoid integer overflow problems.

  • my text after applying some threshold effect is blurry

You applied simple thresholding and did not achieve the desired result. The two missing parts are:

Another approach to thresholding is to take a binary mask and apply a morphological transformation to it. You then display each detected text region, center it in the image and apply OCR. With this approach you can visualize the intermediate results and see where the problem is.

  • 1. Binary Mask

    • [image]

    • As we can see, the world and the background images were removed from the image.

  • 2. Dilation

    • [image]

    • We applied dilation and bitwise_and to detect the text more accurately.

  • 3. Detecting text regions

    • Tesseract may find the same region multiple times. The idea is to find which part of the text is misread. Here are some examples:
Region | Result
[image] | ~ PLAN NACIONAL DE VACUNACION CONTRA COVID-19 Fam>—
[image] | DOSIS APLICADAS: 191.480
[image] | CORTE 4:00 PM MARZO - 03 - 2021
[image] | YPROVIENGIA —» SANTA MARTA (9501 © — LA GUARA 11345)
[image] | BARRANGRILLA CS 997) ATLANTICD3 754) —- ° * —-— MAGDKLENACIS87)
[image] | GARTAGENA [3.457 }BOLIVAR (2.500) —, . CESAR (3.016)

I obtained the results above (only some of them are shown) using the English language model. The language in the image is foreign to me, so you will need to configure Tesseract for that language.

Code:


# Load the libraries
import cv2
import numpy as np
import pytesseract

# Load the image
img = cv2.imread("u1niV.jpg")

# Convert to HSV color-space
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Get binary mask
msk = cv2.inRange(hsv, np.array([0, 0, 181]), np.array([160, 255, 255]))

# Extract features
krn = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
dlt = cv2.dilate(msk, krn, iterations=5)
res = 255 - cv2.bitwise_and(dlt, msk)

# OCR detection
d = pytesseract.image_to_data(res, output_type=pytesseract.Output.DICT)

# Get ROI part from the detection
n_boxes = len(d['level'])

# For each detected part
for i in range(n_boxes):

    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])

    # Draw rectangle to the detected region
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 5)

    # Crop the region
    cropped = res[y:y+h, x:x+w]

    # Center the region
    cropped = cv2.copyMakeBorder(cropped, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=255)

    # OCR the region
    txt = pytesseract.image_to_string(cropped)
    print(txt)

    # Display
    cv2.imshow("cropped", cropped)
    cv2.waitKey(0)

# Display
cv2.imshow("res", res)
cv2.waitKey(0)
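
The loop above visits every entry returned by image_to_data, and those entries are nested: page, block, paragraph, line and word boxes all cover the same pixels, which is one reason the same region shows up several times. A possible way to reduce the repetition, sketched here on a hand-written dictionary instead of real Tesseract output, is to keep only the word-level entries (level 5) and group them by line. The helper name words_by_line and the sample data are assumptions made for illustration.

```python
from collections import defaultdict


def words_by_line(data):
    """Group word-level entries of an image_to_data dict into lines of text."""
    lines = defaultdict(list)
    for i, level in enumerate(data["level"]):
        text = data["text"][i].strip()
        # Level 5 is a single word; page/block/paragraph/line rows carry no text.
        if level != 5 or not text:
            continue
        key = (
            data["page_num"][i],
            data["block_num"][i],
            data["par_num"][i],
            data["line_num"][i],
        )
        lines[key].append(text)
    # Sort by (page, block, paragraph, line) so the lines come out in reading order.
    return [" ".join(words) for _, words in sorted(lines.items())]


# Hand-written sample in the shape of pytesseract's Output.DICT (not real output).
sample = {
    "level": [1, 2, 3, 4, 5, 5, 4, 5],
    "page_num": [1, 1, 1, 1, 1, 1, 1, 1],
    "block_num": [0, 1, 1, 1, 1, 1, 1, 1],
    "par_num": [0, 0, 1, 1, 1, 1, 1, 1],
    "line_num": [0, 0, 0, 1, 1, 1, 2, 2],
    "text": ["", "", "", "", "DOSIS", "APLICADAS:", "", "191.480"],
}
print(words_by_line(sample))
```

With real data you would pass the dictionary d from the code above, for example words_by_line(d).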
