
Crop and convert polygons to grayscale

I have a text detector which outputs polygon coordinates of detected text:

Sample detected text

I am using the loop below to show what the detected text looks like with bounding boxes:

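import cv2
import numpy as np

# Wrapper name is illustrative; the original snippet was a function body
def draw_boxes(img, boxes, num_box):
    for i in range(num_box):
        pts = np.array(boxes[0][i], np.int32)
        pts = pts.reshape((-1, 1, 2))  # (N, 1, 2) layout used by cv2.polylines
        print(pts)
        print('\n')
        # Draw the closed polygon in green, 2 px thick
        img2 = cv2.polylines(img, [pts], True, (0, 255, 0), 2)
    return img2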

Each pts stores all coordinates of a polygon for one text box detection:

pts = 

[[[509 457]]

 [[555 457]]

 [[555 475]]

 [[509 475]]]

I would like to convert the area inside the bounding box described by pts to grayscale using:

gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

However, I am not sure how I should provide the image argument to cvtColor above, as I want to convert only the area described by pts to grayscale and not the entire image (img2). I want the rest of the image to be white.

From my understanding, you want to convert the content of the bounding box to grayscale and set the rest of the image to white (background).

Here is my solution to achieve that:

import cv2
import numpy as np

# Some input image
image = cv2.imread('path/to/your/image.png')

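# Some pts (int32, the dtype cv2.polylines expects)
pts = np.array([[60, 40], [340, 40], [340, 120], [60, 120]], np.int32)

# Get extreme x, y coordinates from box
# (assumes corners ordered top-left, top-right, bottom-right, bottom-left)
x1 = pts[0][0]
y1 = pts[0][1]
x2 = pts[1][0]
y2 = pts[2][1]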

# Build output; initialize white background
image2 = 255 * np.ones(image.shape, np.uint8)
image2[y1:y2, x1:x2] = cv2.cvtColor(cv2.cvtColor(image[y1:y2, x1:x2], cv2.COLOR_BGR2GRAY), cv2.COLOR_GRAY2BGR)

# Show bounding box in original image
cv2.polylines(image, [pts], True, (0, 255, 0), 2)

cv2.imshow('image', image)
cv2.imshow('image2', image2)
cv2.waitKey(0)
cv2.destroyAllWindows()

The main "trick" is to use OpenCV's cvtColor function twice, only on the region of interest (ROI) of the image: first converting BGR to grayscale, then grayscale back to BGR, so that the result has three channels again and can be assigned into the BGR output image. Rectangular ROIs in Python OpenCV images are accessed through NumPy array indexing and slicing. Most OpenCV functions in the Python API can operate on just such an ROI.
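The slicing above relies on the corners arriving in a known order, while the pts in the question are shaped (N, 1, 2). As a minimal sketch, assuming only NumPy, the slice bounds could be derived independently of corner order like this. The helper name bounding_rect and the blank stand-in image are illustrative and not part of the original answer:

```python
import numpy as np

def bounding_rect(pts):
    """Return (x1, y1, x2, y2) of the axis-aligned box around a polygon.

    Accepts pts shaped (N, 2) or (N, 1, 2); corner order does not matter.
    """
    flat = np.asarray(pts).reshape(-1, 2)   # collapse to one (x, y) pair per row
    x1, y1 = flat.min(axis=0)
    x2, y2 = flat.max(axis=0)
    return int(x1), int(y1), int(x2), int(y2)

# The detection printed in the question, in its (N, 1, 2) layout
pts = np.array([[[509, 457]], [[555, 457]], [[555, 475]], [[509, 475]]], np.int32)
x1, y1, x2, y2 = bounding_rect(pts)

# Stand-in image (assumption: any H x W x 3 uint8 array behaves the same way)
image = np.zeros((600, 800, 3), np.uint8)
roi = image[y1:y2, x1:x2]   # rows (y) first, then columns (x)
print((x1, y1, x2, y2), roi.shape)
```

Note that NumPy slices exclude the end index, so the row y2 and column x2 are not part of the ROI; slicing with y2 + 1 and x2 + 1 would include them.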

EDIT: If your final image is a plain grayscale image, the backward conversion can of course be omitted!

These are some outputs I generated with my "standard image":

Output 1

Output 2

Hope that helps!
