Pytesseract - OCR on an image with text in different colors

Question

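Pytesseract fails to extract text when the text appears in different colors. I tried inverting the image with OpenCV, but that does not work for dark text colors.

Image: see the attached picture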

import cv2
import pytesseract

from PIL import Image


def text(image):
    image = cv2.resize(image, (0, 0), fx=7, fy=7)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cv2.imwrite("gray.png", gray)

    blur = cv2.GaussianBlur(gray, (3, 3), 0)
    cv2.imwrite("gray_blur.png", blur)

    thresh = cv2.threshold(blur, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    cv2.imwrite("thresh.png", thresh)

    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
    cv2.imwrite("opening.png", opening)

    invert = 255 - opening
    cv2.imwrite("invert.png", invert)

    data = pytesseract.image_to_string(invert, lang="eng", config="--psm 7")
    return data

Is there a way to extract the text from the given image: DEADLINE (red) and WHITE HOUSE (white)?

Answer 1

You can use ImageOps to invert the image, and then binarize it.

import numpy as np
import pytesseract
from PIL import Image, ImageOps

# Load as 8-bit grayscale, then invert so that light text becomes dark.
img = Image.open("OCR.png").convert("L")
img = ImageOps.invert(img)
# img.show()

# Binarize: pixels below the threshold become black, the rest white.
threshold = 240
pixels = np.array(img)
# Cast to uint8 so that Pillow can build an 8-bit image from the array.
binary = np.where(pixels < threshold, 0, 255).astype(np.uint8)

img = Image.fromarray(binary)  # load the image from the array
# img.show()

print(pytesseract.image_to_string(img))

Result:

The final img looks like this:
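
To see why inverting and then thresholding at 240 can handle both text colors, here is a small pure-Python sketch of what happens to single pixels. It assumes the background of the image is close to black, which the question does not state, and the helper names `luma` and `binarize_inverted` are made up for illustration. The grayscale weights are the ITU-R 601-2 ones that Pillow documents for "L" mode; the integer rounding here is only an approximation of what Pillow does.

```python
def luma(rgb):
    """Approximate 8-bit grayscale value using ITU-R 601-2 weights."""
    r, g, b = rgb
    return (r * 299 + g * 587 + b * 114) // 1000


def binarize_inverted(rgb, threshold=240):
    """Invert the grayscale value, then map it to 0 (black) or 255 (white)."""
    inverted = 255 - luma(rgb)
    return 0 if inverted < threshold else 255


# Assumed colors: pure red and pure white text on a pure black background.
samples = {
    "red text": (255, 0, 0),
    "white text": (255, 255, 255),
    "black background": (0, 0, 0),
}
results = {name: binarize_inverted(rgb) for name, rgb in samples.items()}
print(results)
```

Under these assumptions both text colors end up black and the background ends up white, which is the dark-text-on-light-background layout Tesseract generally handles best. If the real background is not near black, the threshold would likely need adjusting.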

Solution 1
2 Accepted 2020-04-10 07:07:35