Finding a region in an image using Python and OpenCV

Question

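I want to find a region in about 1.5K images that all share a similar format. They are all scans of paintings or photographs of people. They all contain the same color card. The color card can be placed on either side of the image (see the sample images below).

The result should be an image that contains only the portrait of the person.

I can find the color card with OpenCV template matching: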

import cv2
import numpy as np

# TM_CCOEFF_NORMED scores lie in [-1, 1]; higher means a better match,
# which is what the ">= threshold" test below relies on
method = cv2.TM_CCOEFF_NORMED

# Read the images from the file
img_rgb = cv2.imread('./imgs/test_portrait.jpg')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)

# Load the template as grayscale
template = cv2.imread('./portraet_color_card.png', cv2.IMREAD_GRAYSCALE)
w, h = template.shape[::-1]

result = cv2.matchTemplate(img_gray, template, method)

# Keep every location whose score reaches the threshold
threshold = .97
loc = np.where(result >= threshold)
for pt in zip(*loc[::-1]):
    print("Found:", pt)
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 255, 255), 2)

cv2.imwrite('result.png',img_rgb)

Output:

Found: (17, 303)
Found: (18, 303)
Found: (17, 304)
Found: (18, 304)

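From the coordinates and the image dimensions I can determine whether the color card is on the left or the right side and crop the image accordingly. The result is far from perfect, because the border is still there.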
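The cropping step described above could look roughly like the sketch below. It is not the asker's actual code: the helper names `best_match` and `crop_away_card` are hypothetical, and it assumes the card spans the full strip on one side of the scan. It picks the single best score instead of every location above the threshold, which also avoids the near-duplicate hits shown in the output. For `TM_CCOEFF_NORMED` the best match is the maximum; for `TM_SQDIFF_NORMED` it would be the minimum.

```python
import numpy as np

def best_match(result):
    """Return (x, y) of the highest score in a matchTemplate result map."""
    y, x = np.unravel_index(np.argmax(result), result.shape)
    return int(x), int(y)

def crop_away_card(img, pt, card_w):
    """Drop the side of the image that holds the color card."""
    img_w = img.shape[1]
    x = pt[0]
    card_center = x + card_w / 2
    if card_center < img_w / 2:
        # Card on the left: keep everything to the right of the card
        return img[:, x + card_w:]
    # Card on the right: keep everything to the left of the card
    return img[:, :x]

# Synthetic stand-ins for the real result map and scan
result = np.zeros((5, 8))
result[3, 1] = 0.99
img = np.zeros((20, 100, 3), dtype=np.uint8)

pt = best_match(result)
cropped = crop_away_card(img, pt, card_w=10)
```

This only removes the card strip; the scan border around the portrait still needs a separate step.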

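Is there a better way to extract the portrait from the image? I would prefer to work with Python and OpenCV, but I am open to other suggestions for solving this for a larger number of images.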

Samples:

Template:

Answer 1

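First of all, let's assume that you have at least 15K images, so that it is worth spending time on automated processing (1.5K could be processed manually). I will try to define a high-level approach and provide some PoC results (sorry, no code; I use a custom CV tool/pipeline).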

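As you mentioned, the background color of the card varies, so let's play it safe: the color card contains some specific colors. I will use them as initial "keys". These colors are unique, so I can define suitable thresholds to keep my results stable: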
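Since the answer gives no code, here is a minimal NumPy sketch of what thresholding on a key color might look like. The key color value, the tolerance, and the helper names `color_mask` and `bounding_box` are all assumptions made for illustration; real values would have to be measured from the actual color card.

```python
import numpy as np

def color_mask(img_bgr, key_bgr, tol=20):
    """Boolean mask of pixels within +/- tol of key_bgr on every channel."""
    key = np.array(key_bgr, dtype=np.int16)
    diff = np.abs(img_bgr.astype(np.int16) - key)
    return np.all(diff <= tol, axis=-1)

def bounding_box(mask):
    """Inclusive (x0, y0, x1, y1) of the True pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Synthetic image: gray background with one made-up "red" cell (BGR order)
img = np.full((40, 60, 3), 200, dtype=np.uint8)
img[10:20, 5:15] = (30, 30, 220)

mask = color_mask(img, key_bgr=(30, 30, 220), tol=20)
box = bounding_box(mask)
```

On real scans the mask would likely be noisy, so a connected-component or morphological cleanup step before taking the bounding box may be needed.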

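The two segmented cells give us a very simple way to validate the detection (compare sizes, relative positions, etc.). At this point we can easily find the color card background (it is best to take several measurements near the identified color cells):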
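One possible form of such a validation rule is sketched below. The answer does not say which checks it uses, so the two rules here (similar cell sizes, centers aligned in a row or column) and both tolerances are assumptions about the card layout, and `cells_consistent` is a hypothetical name.

```python
def box_size(box):
    """Width and height of an inclusive (x0, y0, x1, y1) box."""
    x0, y0, x1, y1 = box
    return x1 - x0 + 1, y1 - y0 + 1

def cells_consistent(box_a, box_b, size_tol=0.2, align_tol=0.25):
    """Check that two detected cells plausibly belong to the same card."""
    wa, ha = box_size(box_a)
    wb, hb = box_size(box_b)
    # Rule 1: widths and heights agree within a relative tolerance
    if abs(wa - wb) > size_tol * max(wa, wb):
        return False
    if abs(ha - hb) > size_tol * max(ha, hb):
        return False
    # Rule 2: the cell centers line up in a row or in a column
    cxa, cya = box_a[0] + wa / 2, box_a[1] + ha / 2
    cxb, cyb = box_b[0] + wb / 2, box_b[1] + hb / 2
    same_row = abs(cya - cyb) <= align_tol * max(ha, hb)
    same_col = abs(cxa - cxb) <= align_tol * max(wa, wb)
    return same_row or same_col

ok = cells_consistent((5, 10, 14, 19), (5, 40, 14, 49))
```

Images that fail the check can be routed to a manual-review folder instead of being cropped automatically.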

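As you can see, some noise and lossy compression artifacts affect the result, but it is still good enough. At this point we can take additional measurements to find the colors of the image background.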
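A "multiple measurements" step could be approximated by sampling small patches at several points and taking the per-channel median, which tolerates isolated artifacts. This is only a sketch: the sample points, the patch radius, and the name `median_color` are assumptions, not part of the answer's pipeline.

```python
import numpy as np

def median_color(img, points, radius=2):
    """Per-channel median over small square patches around (x, y) points."""
    h, w = img.shape[:2]
    samples = []
    for x, y in points:
        # Clamp each patch to the image bounds
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        x0, x1 = max(0, x - radius), min(w, x + radius + 1)
        samples.append(img[y0:y1, x0:x1].reshape(-1, img.shape[2]))
    return np.median(np.concatenate(samples), axis=0)

# Synthetic background with a single black artifact pixel
img = np.full((50, 50, 3), 200, dtype=np.uint8)
img[10, 10] = (0, 0, 0)

bg = median_color(img, points=[(10, 10), (30, 10), (10, 30)])
```

The estimated color can then be fed into a tolerance-based threshold to separate background from content.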

Let's review the simple case first: the result already seems good enough, so the final crop and minor corrections can be implemented easily:
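For the simple case, the final crop might be derived from a background mask as in the sketch below. It assumes a boolean array marking background pixels is already available; the function name `content_bounds` and the `min_fraction` value are hypothetical and would need tuning. Requiring a minimum fraction of content per row and column keeps stray noise pixels from widening the crop.

```python
import numpy as np

def content_bounds(is_background, min_fraction=0.2):
    """Inclusive (x0, y0, x1, y1) of rows/columns that are mostly content."""
    content = ~is_background
    # Fraction of non-background pixels in each row and each column
    rows = np.nonzero(content.mean(axis=1) >= min_fraction)[0]
    cols = np.nonzero(content.mean(axis=0) >= min_fraction)[0]
    if rows.size == 0 or cols.size == 0:
        return None
    return int(cols[0]), int(rows[0]), int(cols[-1]), int(rows[-1])

# Synthetic mask: a 40x50 portrait block plus one noise pixel in the corner
is_background = np.ones((60, 100), dtype=bool)
is_background[10:50, 20:70] = False
is_background[0, 0] = False

bounds = content_bounds(is_background)
```

With the bounds in hand, the crop is `img[y0:y1 + 1, x0:x1 + 1]`.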

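Some cases will not be so simple:

I would suggest investing more time in validation rules and processing all the tricky cases manually, but the "common tricky cases" can also be solved with some additional time.

Anyway, here is a short summary:

Use key colors to identify the color card reliably (and do an initial validation)
Take multiple measurements to find the color card background (so that you can use smaller thresholds)
Take multiple measurements to define the image background
A validation strategy is a must, so that the few leftovers are easier to process manually

PS: White on white is fun, but Kazimir Malevich did it a long time ago, no need to repeat it :)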

Answer 2

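This solution assumes that the portrait is the largest pattern in the image.

Solution steps, in order:

Classical image processing to extract the important features from the image:

Convert to grayscale.
Gaussian blur to reduce noise and smooth the image.
Edge detection, using Canny in my case.
Morphological dilation to merge the features into two main patterns.
Largest connected component detection (credit to an old SO answer).
The rest is masking the image with the largest connected component.

Note that this solution makes some assumptions, so it may not always generalize. But I have tested it with the given images.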

#!/usr/bin/python3
# -*- coding: utf-8 -*-

import cv2
import numpy as np

class ImgProcessor:
    def __init__(self, path, imName):
        self.path = path
        self.imName = imName
        self.original = cv2.imread(self.path+self.imName)

    def imProcess(self, ksmooth=7, kdilate=3, thlow=50, thigh=100):
        # Work on a copy of the original BGR image
        img_bgr = self.original.copy()
        # Convert Image to Gray
        img_gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
        # Gaussian Filtering for Noise Removal
        gauss = cv2.GaussianBlur(img_gray, (ksmooth, ksmooth), 0)
        # Canny Edge Detection (low and high hysteresis thresholds)
        edges = cv2.Canny(gauss, thlow, thigh)
        # Morphological Dilation
        # TODO: experiment with different kernels
        kernel = np.ones((kdilate, kdilate), 'uint8')
        dil = cv2.dilate(edges, kernel)

        return dil
    
    def largestCC(self, imBW):
        # Extract Largest Connected Component
        # Source: https://stackoverflow.com/a/47057324
        image = imBW.astype('uint8')
        nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(image, connectivity=4)
        sizes = stats[:, -1]

        max_label = 1
        max_size = sizes[1]
        for i in range(2, nb_components):
            if sizes[i] > max_size:
                max_label = i
                max_size = sizes[i]

        img2 = np.zeros(output.shape)
        img2[output == max_label] = 255
        return img2
    
    def maskCorners(self, mask, outval=1):
        # Fill the bounding box of all nonzero mask pixels with outval
        rows = np.nonzero(mask.sum(axis=1))[0]
        cols = np.nonzero(mask.sum(axis=0))[0]
        y0, y1 = rows.min(), rows.max()
        x0, x1 = cols.min(), cols.max()
        output = np.zeros_like(mask)
        # +1 so that the last row and column are included in the box
        output[y0:y1 + 1, x0:x1 + 1] = outval
        return output

    def extractROI(self):
        im = self.imProcess()
        lgcc = self.largestCC(im)
        lgcc = lgcc.astype(np.uint8)
        roi = self.maskCorners(lgcc)
        # Mask the original BGR image with the ROI mask
        exroi = cv2.bitwise_and(self.original, self.original, mask = roi)
        return exroi

    def show_res(self):
        result = self.extractROI()
        cv2.namedWindow("Result", cv2.WINDOW_NORMAL)
        cv2.imshow("Result", result)
        cv2.waitKey(0)

# ==============================================
if __name__ == "__main__":
    # TODO: change the path, and image name to suit your needs
    impr_ = ImgProcessor(path="/home/", imName="img.png")
    impr_.show_res()
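The class above blacks out everything outside the region of interest but keeps the original image size. Since the question asks for an image that contains only the portrait, a cropping variant might be more useful. The sketch below is not part of the answer's code; `crop_to_mask` is a hypothetical helper that could be applied to the output of `largestCC`.

```python
import numpy as np

def crop_to_mask(img, mask):
    """Crop img to the bounding box of the nonzero pixels in mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        # Nothing detected: return the image unchanged
        return img
    return img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

# Synthetic stand-ins for the scan and the largest-component mask
img = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)
mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 3:7] = 255

cropped = crop_to_mask(img, mask)
```

The cropped array can then be written out with `cv2.imwrite`, which suits a batch job better than `cv2.imshow`.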
