Extract groups of nonzero values in a numpy array

Question

I'm trying to extract rectangular groups of non-zero values from a numpy array. The array may look like this (but much bigger):

a = np.array([
     [0,0,0,0,0,0,0,0,0,0,0],
     [0,0,0,0,0,1,1,1,1,1,0],
     [0,0,0,0,6,1,1,1,3,1,0],
     [0,0,0,0,0,1,1,1,1,1,0],
     [0,0,0,0,2,2,2,0,1,0,0],
     [0,0,0,0,2,2,0,0,0,0,0],
     [0,0,0,0,0,0,0,0,0,0,0],
     [1,1,1,1,0,0,0,0,0,0,0],
     [1,1,1,1,0,0,0,0,7,2,0],
     [1,1,1,1,0,0,0,0,0,0,0]])

and I want to extract groups/blocks of non-zero values bigger than a given size (e.g. bigger than 3x3), i.e. the coordinates of the min and max corners of those blocks. In this example, I should get the following:

res = [[(7,0), (10,4)],
       [(1,5), (4,10)]]

so that

In [12]: xmin, ymin = res[0][0]

In [13]: xmax, ymax = res[0][1]

In [14]: a[xmin:xmax, ymin:ymax]
Out[14]:
array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]])

In [15]: xmin, ymin = res[1][0]

In [16]: xmax, ymax = res[1][1]

In [17]: a[xmin:xmax, ymin:ymax]
Out[17]:
array([[1, 1, 1, 1, 1],
       [1, 1, 1, 3, 1],
       [1, 1, 1, 1, 1]])

I've tried looking at each non-zero value of the array and growing a shape of the wanted size from this point until it contains a zero. It works, but it is quite slow: for this example array it takes around 1.17 ms, and in real applications (i.e. 600x1000 arrays) it takes around 18 seconds, which is way too slow. Is there a numpy or OpenCV function or trick to perform this in a quicker way?

Answer 1

It seems like your problem is a typical computer vision problem. You are looking for areas which are not background and which have a specific shape (rectangle) and size (min 3x3).

For this kind of problem we use blob analysis.

I don't want to write a specific example because there are a lot more features included that may also be interesting for your work. There are many examples of blob analysis. Here is one that could be a good starting point: https://www.learnopencv.com/blob-detection-using-opencv-python-c/

A short extension to my information: the example on the website is based on an older version of OpenCV. The following code is the implementation for newer versions. The newest version conda provides is OpenCV 3.4.2 at the moment:

# Standard imports
import cv2
import numpy as np

# Read image
im = cv2.imread("blob.png", cv2.IMREAD_GRAYSCALE)

# Set up the detector with default parameters.
detector = cv2.SimpleBlobDetector_create()

# Detect blobs.
keypoints = detector.detect(im)

# Draw detected blobs as red circles.
# cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS ensures the size of the circle corresponds to the size of blob
im_with_keypoints = cv2.drawKeypoints(im, keypoints, np.array([]), (0,0,255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

# Show keypoints
cv2.imshow("Keypoints", im_with_keypoints)
cv2.waitKey(0)

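The important change is how the detector is created: cv2.SimpleBlobDetector_create() replaces the cv2.SimpleBlobDetector() constructor used in the older tutorial code.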

Answer 2

I think there's a very simple solution to this using morphological transforms. An opening (an erosion followed by a dilation) will simply whittle down regions smaller than your desired size (3x3) and then restore the remaining ones. Here's a view of a after converting it to uint8:

[Figure: a rendered as a uint8 image]

Now I'll apply opening on it:

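# cv2.morphologyEx is applied to the uint8 version of a mentioned above, so cast explicitly
out = cv2.morphologyEx(a.astype(np.uint8), cv2.MORPH_OPEN, np.ones((3, 3), dtype=np.uint8))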

Visualizing out:

[Figure: out after the opening; only the two large rectangular blocks remain nonzero]

As you can see, it took just a single line of code to identify the rectangular regions. You can use this output as a bitmask to filter out the original image as well.

a_ = a.copy()
a_[np.logical_not(out.astype('bool'))] = 0
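If OpenCV is not at hand, the same opening idea can be sketched in plain NumPy. This is only an illustration, not what cv2.morphologyEx does internally: the helper name opening_mask is made up for this example, it works on a boolean nonzero mask rather than on grey values, and it only counts windows that lie fully inside the array, which may differ from OpenCV's border handling at the edges.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20

a = np.array([
    [0,0,0,0,0,0,0,0,0,0,0],
    [0,0,0,0,0,1,1,1,1,1,0],
    [0,0,0,0,6,1,1,1,3,1,0],
    [0,0,0,0,0,1,1,1,1,1,0],
    [0,0,0,0,2,2,2,0,1,0,0],
    [0,0,0,0,2,2,0,0,0,0,0],
    [0,0,0,0,0,0,0,0,0,0,0],
    [1,1,1,1,0,0,0,0,0,0,0],
    [1,1,1,1,0,0,0,0,7,2,0],
    [1,1,1,1,0,0,0,0,0,0,0]])

def opening_mask(arr, h=3, w=3):
    """Boolean mask of cells covered by at least one all-nonzero h x w window.

    Hypothetical helper for illustration; h and w are the minimum block size.
    """
    mask = np.asarray(arr) != 0
    out = np.zeros_like(mask)
    if mask.shape[0] < h or mask.shape[1] < w:
        return out  # no h x w window fits in the array at all
    # Erosion: fits[i, j] is True when mask[i:i+h, j:j+w] is entirely True.
    fits = sliding_window_view(mask, (h, w)).all(axis=(2, 3))
    # Dilation: paint every fitting window back onto the output.
    n_rows, n_cols = fits.shape
    for di in range(h):
        for dj in range(w):
            out[di:di + n_rows, dj:dj + n_cols] |= fits
    return out

opened = opening_mask(a)
print(opened.astype(int))
```

Changing h and w gives a different minimum block size without having to think about kernel anchors, since the result is simply the union of all windows that fit.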

A slightly more challenging part is figuring out the corner coordinates of the rectangles, if you need them. You could break out the big guns and apply contour detection, but I feel as though a simpler connected components analysis should work as well.

from skimage.measure import label
out_ = label(out, connectivity=1)

Now every region in the out_ array is marked with a separate number: 0 is the background, and the remaining regions are numbered from 1 upwards. The rest of the work is very simple. You could iterate through each number and do some simple numpy comparison to figure out the coordinates of each numbered region.
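A minimal sketch of that loop, assuming a label image shaped like the one this example should produce. The labels array below is built by hand as a stand-in for out_, and bounding_boxes is a name made up for this illustration. Max corners are exclusive so they can be used directly as slice bounds, as in the question.

```python
import numpy as np

# Hand-built stand-in for out_ (assumption): 0 is background, 1 and 2 are regions.
labels = np.zeros((10, 11), dtype=int)
labels[1:4, 5:10] = 1
labels[7:10, 0:4] = 2

def bounding_boxes(labels):
    """Return [((row_min, col_min), (row_max, col_max)), ...], one per label."""
    boxes = []
    for k in range(1, int(labels.max()) + 1):
        rows, cols = np.nonzero(labels == k)  # coordinates of region k
        if rows.size == 0:
            continue  # label value not present in the image
        boxes.append(((int(rows.min()), int(cols.min())),
                      (int(rows.max()) + 1, int(cols.max()) + 1)))
    return boxes

boxes = bounding_boxes(labels)
print(boxes)  # [((1, 5), (4, 10)), ((7, 0), (10, 4))]
```

Each iteration scans the whole array, so this is fine for a handful of regions but gets slow when there are many labels.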

We can get the job done even quicker by taking advantage of skimage's regionprops. We'll apply it on out_, the label image we computed earlier.

from skimage.measure import regionprops

# r.bbox is (min_row, min_col, max_row, max_col), with the max values exclusive
for r in regionprops(out_):
    print('({},{}), ({},{})'.format(*r.bbox))

Out:

(1,5), (4,10)
(7,0), (10,4)
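For comparison, the whole pipeline can also be sketched with scipy.ndimage alone. This swaps in SciPy's binary opening and labelling for the OpenCV and skimage calls above, so treat it as an alternative under that assumption rather than as the method used in this answer. It builds res in the format the question asks for and slices a the same way.

```python
import numpy as np
from scipy import ndimage

a = np.array([
    [0,0,0,0,0,0,0,0,0,0,0],
    [0,0,0,0,0,1,1,1,1,1,0],
    [0,0,0,0,6,1,1,1,3,1,0],
    [0,0,0,0,0,1,1,1,1,1,0],
    [0,0,0,0,2,2,2,0,1,0,0],
    [0,0,0,0,2,2,0,0,0,0,0],
    [0,0,0,0,0,0,0,0,0,0,0],
    [1,1,1,1,0,0,0,0,0,0,0],
    [1,1,1,1,0,0,0,0,7,2,0],
    [1,1,1,1,0,0,0,0,0,0,0]])

# Opening on the nonzero mask removes anything a 3x3 block cannot fit into.
opened = ndimage.binary_opening(a != 0, structure=np.ones((3, 3), dtype=bool))

# Label the surviving regions (the default structure is 4-connectivity).
labeled, n_regions = ndimage.label(opened)

# find_objects returns one (row_slice, col_slice) pair per label.
res = [[(rs.start, cs.start), (rs.stop, cs.stop)]
       for rs, cs in ndimage.find_objects(labeled)]
print(res)  # [[(1, 5), (4, 10)], [(7, 0), (10, 4)]]

for (xmin, ymin), (xmax, ymax) in res:
    print(a[xmin:xmax, ymin:ymax])
```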
