
How to do image segmentation with Python and OpenCV

I have an image of an invoice. I want to split that image into pieces to get smaller images. I tried OpenCV's k-means, but the only output I get is one small black window.

This is the code that I have:

import numpy as np
import cv2

#read the image
img = cv2.imread("image1.jpg")

#reshape the image
img = img.reshape((-1,3))
img = np.float32(img)

#criteria for clustering
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER , 10, 1)

#defining number of clusters and iteration number
nubmer_of_clusters = 6
attempts = 50

#doing the clustering
ret, label, center = cv2.kmeans(img, nubmer_of_clusters, None, criteria, attempts, cv2.KMEANS_RANDOM_CENTERS)

center = np.uint8(center)

res = center[label.flatten()]
res = res.reshape((img.shape))
cv2.imshow("starting_image", res)
cv2.waitKey(2)

This is an example of the input image:

[image]

The parts of the image that I want to extract are marked in red:

[image]

I do not know whether I used the right model, or even the right approach, but I need the segments of the image that contain text.

I have also tried contours, but I get a contour for each letter, whereas I want a contour for each segment of text:

img = cv2.imread("image1.jpg")
img=cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

ret, thresh=cv2.threshold(img,127,255,cv2.THRESH_BINARY_INV)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)

for c in contours:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(img,(x,y),(x+w,y+h),(0,0,255),2)
    cv2.imshow('Bounding rect',img)

The key is to dilate (expand) the contours of the letters so that neighbouring letters merge into chunks. Here is how:

import cv2
import numpy as np

def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_canny = cv2.Canny(img_gray, 0, 0)
    return cv2.dilate(img_canny, np.ones((5, 5)), iterations=20)

def draw_segments(img):
    contours, hierarchies = cv2.findContours(process(img), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)
        if w * h > 70000:
            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 5)

img = cv2.imread("document.jpg")
draw_segments(img)
cv2.imshow("Image", img)
cv2.waitKey(0)

Output:

[image]

Explanation:

  1. Import the necessary libraries:
import cv2
import numpy as np
  2. Define a function to process the image; see the comments in the code for details:
def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # Convert to grayscale
    img_canny = cv2.Canny(img_gray, 0, 0) # Detect edges with canny edge detector
    return cv2.dilate(img_canny, np.ones((5, 5)), iterations=20) # Dilate the edges so that scattered contours close to each other merge into chunks
  3. Define a function that takes an image, processes it with the process function defined earlier, and finds the contours of the result. It then loops through the contours and, for each one whose bounding rectangle has an area greater than a threshold (for example 70000, to eliminate stray text), draws the bounding rectangle on the image:
def draw_segments(img):
    contours, hierarchies = cv2.findContours(process(img), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)
        if w * h > 70000:
            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 5)
  4. Finally, read in the image, call the draw_segments function, and display the image:
img = cv2.imread("document.jpg")
draw_segments(img)
cv2.imshow("Image", img)
cv2.waitKey(0)
