How to replace a cropped rectangle in OpenCV?

Question

I've managed to crop a bounding box around text. For example, given this image:

I can extract the following box:

using this code:

import re
import shutil

from IPython.display import Image

import requests
import pytesseract, cv2

"""https://www.geeksforgeeks.org/text-detection-and-extraction-using-opencv-and-ocr/"""
# Preprocessing the image starts
# Convert the image to gray scale
img = cv2.imread('img.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Performing OTSU threshold
ret, thresh1 = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)

# Specify structure shape and kernel size.
# Kernel size increases or decreases the area
# of the rectangle to be detected.
# A smaller value like (10, 10) will detect
# each word instead of a sentence.
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (18, 18))

# Applying dilation on the threshold image
dilation = cv2.dilate(thresh1, rect_kernel, iterations = 1)

# Finding contours
contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL,
                                                 cv2.CHAIN_APPROX_NONE)

# Creating a copy of image
im2 = img.copy()


for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    # Drawing a rectangle on copied image
    rect = cv2.rectangle(im2, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # Cropping the text block for giving input to OCR
    cropped = im2[y:y + h, x:x + w]
    
cv2.imwrite('image-notxt.png', cropped)
Image(filename='image-notxt.png',  width=200)

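Part 1: How do I replace the cropped box with a blank text box and put it back into the image? For example, to get something like this:

I tried: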

    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)
        # Drawing a rectangle on copied image
        rect = cv2.rectangle(im2, (x, y), (x + w, y + h), (0, 255, 0), 2)
        # Cropping the text block for giving input to OCR
        cropped = im2[y:y + h, x:x + w]
        text = pytesseract.image_to_string(cropped).strip('\x0c').strip()
        text = re.sub(' +', ' ', text.replace('\n', ' ')).strip()
        if text:
            # White out the cropped box.
            cropped.fill(255)
            # Create the image with the translation.
            cv2.putText(img=cropped, text="foobar", org=(12, 15), fontFace=cv2.FONT_HERSHEY_TRIPLEX, fontScale=0.3, color=(0, 0, 0),thickness=1)
            cv2.imwrite('image-notxt.png', cropped)
            Image(filename='image-notxt.png',  width=200)

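This whites out the cropped box and inserts the text, like this:

Part 2: How do I create an OpenCV text rectangle the same size as the cropped box? For example, given the string foobar, how do I get a final image like this: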

Answer 1

In Python/OpenCV/NumPy, use NumPy slicing to write a color to a region, in this form:

img[y:y+h, x:x+w] = color tuple

For example:

img[40:40+45, 40:40+150] = (255,255,255)

where x,y,w,h = 40,40,150,45
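As a minimal sketch of that assignment, assuming a synthetic gray image built with NumPy in place of a real photo (the image size and gray value are illustrative, not from the question):

```python
import numpy as np

# Synthetic 3-channel (BGR) image filled with mid-gray; stands in for cv2.imread output.
img = np.full((120, 240, 3), 128, dtype=np.uint8)

x, y, w, h = 40, 40, 150, 45

# Rows (y) are indexed first, then columns (x); the tuple is broadcast to every pixel.
img[y:y + h, x:x + w] = (255, 255, 255)

inside = img[y:y + h, x:x + w]
print(inside.shape)                  # (45, 150, 3)
print(bool((inside == 255).all()))   # True
```

Note the index order: the slice is [y:y+h, x:x+w], not [x:x+w, y:y+h], because NumPy image arrays are laid out as rows by columns.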

To add text, see cv2.putText() at https://docs.opencv.org/4.1.1/d6/d6e/group__imgproc__draw.html#ga5126f47f883d730f633d74f07456c576

Answer 2

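Part 1: How do I replace the cropped box with a blank text box and put it back into the image?

After cropping the box, fill it with white: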

cropped.fill(255)

That produces an all-white box.
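One detail worth spelling out: a crop taken with basic NumPy slicing is a view, so filling it also changes the image it was cut from. The sketch below illustrates this on a synthetic array; the array size and box coordinates are assumptions made for the example.

```python
import numpy as np

# Synthetic stand-in for the copied image `im2` (all mid-gray).
im2 = np.full((100, 200, 3), 128, dtype=np.uint8)
x, y, w, h = 20, 30, 80, 40

cropped = im2[y:y + h, x:x + w]   # basic slicing returns a view, not a copy
cropped.fill(255)                 # so this also whites out the region in im2

print(np.shares_memory(cropped, im2))                # True
print(bool((im2[y:y + h, x:x + w] == 255).all()))    # True

# Use .copy() when the crop should be independent of the full image.
independent = im2[y:y + h, x:x + w].copy()
independent.fill(0)
print(bool((im2[y:y + h, x:x + w] == 255).all()))    # True (im2 is unaffected)
```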

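Part 2: How do I create an OpenCV text rectangle the same size as the cropped box?

Putting the text in is a little more nuanced. The basic step first:

Use cv2.putText() to draw the text onto the image.
But there are several things to consider:
- the length and font of the text you want to insert, and whether it fits in the box
- where in the box to place the text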

TL;DR

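import textwrap

# Assumes `translation` (a str) plus `cropped`, `im2`, x, y, w, h from the contour loop.
for i, chunk in enumerate(textwrap.wrap(translation, width=20)):
    cv2.putText(img=cropped, text=chunk, org=(12, 15 + i * 10),
                fontFace=cv2.FONT_HERSHEY_TRIPLEX, fontScale=0.3,
                color=(0, 0, 0), thickness=1)
# Write the edited crop back into the full image once all lines are drawn.
im2[y:y + h, x:x + w] = cropped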

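To handle the length of the text, I had to use Python's textwrap library to break the string into multiple substrings.

Then I iterate over the substrings and putText each one into the cropped image.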
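To see what the wrapping step produces on its own, here is a small standard-library sketch. The sample string is made up for illustration; the width of 20 and the 10-pixel line step mirror the values used in the loop above.

```python
import textwrap

# Illustrative string; in the answer this would be the translated OCR text.
translation = "the quick brown fox jumps over the lazy dog"

# Break the string into lines of at most 20 characters, splitting on whitespace.
lines = textwrap.wrap(translation, width=20)

# Same layout as the loop above: start at (12, 15) and move down 10 px per line.
origins = [(12, 15 + i * 10) for i in range(len(lines))]

for line, org in zip(lines, origins):
    print(org, repr(line))
```

Note that width counts characters, not pixels, so a width that suits one font scale may overflow the box at another.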

Finally, replace that portion of the original image with the edited crop containing the text, i.e. im2[y:y + h, x:x + w] = cropped

A working example can be found at https://www.kaggle.com/code/alvations/image-translate

Answer 1
1 2022-07-22 18:55:26

Answer 2
0 2022-07-22 22:45:06
