Deleting consecutive RGB values from a numpy array

Question

I previously extracted a subarray from the array of a greyscale image, with help from these two questions: Deleting consecutive numbers from a numpy array and Remove following duplicates in a numpy array

But now I want to do the same for a coloured image and I'm really confused. I've been working on it for days and simply cannot make sense of how I can approach it.

The problem is that the squares are different sizes, and I want each square to be represented by a single pixel of the same colour.

Coloured image: [image]

My code for greyscale image:

from PIL import Image
import numpy as np

name1 = raw_input("What is the name of the .png file you want to open? ")

filename1 = "%s.png" % name1

img = Image.open(filename1).convert('L')  # convert image to 8-bit grayscale
WIDTH, HEIGHT = img.size

a = list(img.getdata()) # convert image data to a list of integers
# convert that to 2D list (list of lists of integers)
a = np.array ([a[offset:offset+WIDTH] for offset in range(0, WIDTH*HEIGHT, WIDTH)])

print " "
print "Initial array from image:"  #print as array
print " "
print a

rows_mask = np.insert(np.diff(a[:, 0]).astype(bool), 0, True)
columns_mask = np.insert(np.diff(a[0]).astype(bool), 0, True)
b = a[np.ix_(rows_mask, columns_mask)]

print " "
print "Subarray from Image:"  #print as array
print " "
print b

#img = Image.fromarray(b, mode='L')

print " "
print "Subarray from Image (clearer format):"  #print as array
print " "
for row in b: #print as a table like format
    print(' '.join('{:3}'.format(value) for value in row))

#img.save("chocolate.png")


#print np.mean(b) #finding mean

For example for this image:

Input array example:

From a = list(img.getdata()), this is the input I get from the image.

[(115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95)]

The numpy input using a = np.array([a[offset:offset+WIDTH] for offset in range(0, WIDTH*HEIGHT, WIDTH)]):

[[[115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]]

 [[115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]]

 [[115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]]

 [[115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]]

 [[115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]]

Output desired:

[[[115  45 135] [245 245  35]]
 [[ 55 235 195] [245 245  95]]]

Answer 1

This should do it:

columns_mask = np.insert(np.any(np.any(np.diff(a, axis=0).astype(bool), axis=1), axis=1), 0, True)
rows_mask = np.insert(np.any(np.any(np.diff(a, axis=1).astype(bool), axis=0), axis=1), 0, True)
print(a[np.ix_(columns_mask, rows_mask)])

Output:

[[[115  45 135]
  [245 245  35]]

 [[ 55 235 195]
  [245 245  95]]]
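
If you want to try this without the image file, here is a minimal self-contained sketch. It rebuilds the 11x11 array from the question with np.repeat (my own shortcut; the name colours is hypothetical and not from the original post) and uses the built-in bool for the conversion. Note that columns_mask is built from differences along axis 0, so it actually selects image rows; I have kept the names as they are in the answer.

```python
import numpy as np

# The four block colours from the question, laid out as a 2x2 grid.
# "colours" is a hypothetical name used only for this sketch.
colours = np.array([[[115, 45, 135], [245, 245, 35]],
                    [[55, 235, 195], [245, 245, 95]]])

# Rebuild the image: block heights are 5 and 6 rows, block widths 5 and 6 columns.
a = np.repeat(np.repeat(colours, [5, 6], axis=0), [5, 6], axis=1)
print(a.shape)  # (11, 11, 3)

# Same masks as in the answer, with bool instead of np.bool.
columns_mask = np.insert(
    np.any(np.any(np.diff(a, axis=0).astype(bool), axis=1), axis=1), 0, True)
rows_mask = np.insert(
    np.any(np.any(np.diff(a, axis=1).astype(bool), axis=0), axis=1), 0, True)

b = a[np.ix_(columns_mask, rows_mask)]
print(b)
```

The result b should be the same 2x2 grid of colours that the image was built from.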

Explanation:
Let's take a more representative example:

a = np.array([[[1, 2, 3],
               [1, 2, 3],
               [2, 4, 7],
               [2, 4, 7],
               [2, 4, 7]], 
              [[1, 2, 3],
               [1, 2, 3],
               [2, 4, 7],
               [2, 4, 7],
               [2, 4, 7]], 
              [[1, 2, 3],
               [1, 2, 3],
               [3, 4, 7],
               [3, 4, 7],
               [3, 4, 7]],
              [[1, 2, 3],
               [1, 2, 3],
               [3, 4, 7],
               [3, 4, 7],
               [3, 4, 7]],
              [[6, 4, 3],
               [6, 4, 3],
               [0, 1, 7],
               [0, 1, 7],
               [0, 1, 7]],
              [[6, 4, 3],
               [6, 4, 3],
               [0, 1, 7],
               [0, 1, 7],
               [0, 1, 7]]])

I chose the dimensions 6x5x3 (6 rows, 5 columns, 3 colour channels) to make the steps easier to follow.

For R, G, and B we will have the following subarrays:

>>> print(a[:,:,0])  # R
[[1 1 2 2 2]
 [1 1 2 2 2]
 [1 1 3 3 3]
 [1 1 3 3 3]
 [6 6 0 0 0]
 [6 6 0 0 0]]
>>> print(a[:,:,1])  # G
[[2 2 4 4 4]
 [2 2 4 4 4]
 [2 2 4 4 4]
 [2 2 4 4 4]
 [4 4 1 1 1]
 [4 4 1 1 1]]
>>> print(a[:,:,2])  # B
[[3 3 7 7 7]
 [3 3 7 7 7]
 [3 3 7 7 7]
 [3 3 7 7 7]
 [3 3 7 7 7]
 [3 3 7 7 7]]

Note that this example has 6 blocks arranged in a 3x2 grid, and I deliberately gave some neighbouring blocks the same value in one or more channels (for example, B is 3 in every left-hand block).
The expected result is:

# R
[[1 2]
 [1 3]
 [6 0]]
# G
[[2 4]
 [2 4]
 [4 1]]
# B
[[3 7]
 [3 7]
 [3 7]]

First, we calculate np.diff along each axis to locate the borders between the colour squares.
For columns_mask (differences along axis 0, i.e. between consecutive rows):

>>> print(np.diff(a, axis=0))
[[[ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [ 0  0  0]
  [ 1  0  0]
  [ 1  0  0]
  [ 1  0  0]]

 [[ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 5  2  0]
  [ 5  2  0]
  [-3 -3  0]
  [-3 -3  0]
  [-3 -3  0]]

 [[ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]]]

and for rows_mask (differences along axis 1, i.e. between consecutive columns):

>>> print(np.diff(a, axis=1))
[[[ 0  0  0]
  [ 1  2  4]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [ 1  2  4]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [ 2  2  4]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [ 2  2  4]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [-6 -3  4]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [-6 -3  4]
  [ 0  0  0]
  [ 0  0  0]]]

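Look carefully at where these numbers come from: each entry is the difference between a pixel and its neighbour along the chosen axis, so non-zero entries mark a border between two squares.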
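
To make the origin of those numbers concrete, here is a small sketch on a single-channel toy array (x is a made-up example, not data from the question). It shows that np.diff along an axis is just "next element minus current element" written with slices.

```python
import numpy as np

# Toy single-channel array: two identical rows, then a different row.
x = np.array([[1, 1, 2],
              [1, 1, 2],
              [6, 6, 0]])

# Along axis 0, each row is subtracted from the row below it.
d0 = np.diff(x, axis=0)
print(d0.tolist())  # [[0, 0, 0], [5, 5, -2]]

# The same result written out by hand.
print(np.array_equal(d0, x[1:] - x[:-1]))  # True

# Along axis 1, each column is subtracted from the column to its right.
d1 = np.diff(x, axis=1)
print(d1.tolist())  # [[0, 1], [0, 1], [0, -6]]
```

One thing to be aware of: an array taken straight from an 8-bit image usually has dtype uint8, so the differences wrap around modulo 256 instead of going negative. As far as I can tell that does not matter here, because a wrapped difference is still non-zero exactly when the two neighbours differ.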

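Next, we use .astype(bool) to convert all non-zero elements to True. (Older code often writes np.bool here; it means the same thing, but not every NumPy release provides that alias.) We need this to create boolean masks. See the NumPy docs for more information on indexing with boolean arrays.
For columns_mask we get: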

>>> print(np.diff(a, axis=0).astype(bool))
[[[False False False]
  [False False False]
  [False False False]
  [False False False]
  [False False False]]

 [[False False False]
  [False False False]
  [ True False False]
  [ True False False]
  [ True False False]]

 [[False False False]
  [False False False]
  [False False False]
  [False False False]
  [False False False]]

 [[ True  True False]
  [ True  True False]
  [ True  True False]
  [ True  True False]
  [ True  True False]]

 [[False False False]
  [False False False]
  [False False False]
  [False False False]
  [False False False]]]

and for rows_mask:

>>> print(np.diff(a, axis=1).astype(bool))
[[[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]

 [[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]

 [[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]

 [[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]

 [[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]

 [[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]]
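
As an aside, the diff-then-convert step can also be written as a direct comparison of neighbours. This is an alternative I am suggesting, not what the answer above uses; the sketch below checks on a small made-up array that both forms give the same boolean arrays.

```python
import numpy as np

# Small made-up (3, 3, 3) array: rows 0 and 1 are identical, row 2 differs.
a = np.array([[[1, 2, 3], [1, 2, 3], [2, 4, 7]],
              [[1, 2, 3], [1, 2, 3], [2, 4, 7]],
              [[6, 4, 3], [6, 4, 3], [0, 1, 7]]])

# "Changed" arrays via diff + astype(bool), as in the answer.
changed_down = np.diff(a, axis=0).astype(bool)
changed_right = np.diff(a, axis=1).astype(bool)

# The same arrays from comparing each element with its neighbour.
changed_down_alt = a[1:] != a[:-1]
changed_right_alt = a[:, 1:] != a[:, :-1]

print(np.array_equal(changed_down, changed_down_alt))    # True
print(np.array_equal(changed_right, changed_right_alt))  # True
```

The comparison form avoids the intermediate integer array, but either version works for building the masks.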

Next, we apply a logical OR (np.any) along axis 1 for the first array and along axis 0 for the second. This is needed so that a border is not missed when neighbouring blocks share a value in part of the image, as happens in the R channel, where the two upper-left blocks both have R = 1.

>>> print(np.any(np.diff(a, axis=0).astype(bool), axis=1))
[[False False False]
 [ True False False]
 [False False False]
 [ True  True False]
 [False False False]]
>>> print(np.any(np.diff(a, axis=1).astype(bool), axis=0))
[[False False False]
 [ True  True  True]
 [False False False]
 [False False False]]

See the following question for details about np.any: How to operate logic operation of all columns of a 2D numpy array.
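
In case the axis argument of np.any is unfamiliar, here is a tiny sketch on a made-up 2D boolean array m. The axis you pass is the one that gets collapsed by the OR.

```python
import numpy as np

# Made-up 2x3 boolean array with a single True.
m = np.array([[False, False, True],
              [False, False, False]])

anything = np.any(m)             # OR over every element
per_column = np.any(m, axis=0)   # collapse axis 0: one value per column
per_row = np.any(m, axis=1)      # collapse axis 1: one value per row

print(anything)             # True
print(per_column.tolist())  # [False, False, True]
print(per_row.tolist())     # [True, False]
```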

Now we perform the same operation along the colour axis, so that a change in any of R, G, or B counts as a border:

>>> print(np.any(np.any(np.diff(a, axis=0).astype(bool), axis=1), axis=1))
[False  True False  True False]
>>> print(np.any(np.any(np.diff(a, axis=1).astype(bool), axis=0), axis=1))
[False  True False False]
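
The two nested np.any calls can, I believe, be collapsed into one call by passing a tuple of axes. The sketch below compares both spellings on a small made-up array; the variable names are my own.

```python
import numpy as np

# Small made-up (3, 3, 3) array: rows 0 and 1 match, columns 0 and 1 match.
a = np.array([[[1, 2, 3], [1, 2, 3], [2, 4, 7]],
              [[1, 2, 3], [1, 2, 3], [2, 4, 7]],
              [[6, 4, 3], [6, 4, 3], [0, 1, 7]]])

changed_down = np.diff(a, axis=0).astype(bool)    # shape (2, 3, 3)
changed_right = np.diff(a, axis=1).astype(bool)   # shape (3, 2, 3)

# Nested form, as in the answer.
nested_rows = np.any(np.any(changed_down, axis=1), axis=1)
nested_cols = np.any(np.any(changed_right, axis=0), axis=1)

# Single call: reduce over the other image axis and the colour axis at once.
single_rows = np.any(changed_down, axis=(1, 2))
single_cols = np.any(changed_right, axis=(0, 2))

print(single_rows.tolist())  # [False, True]
print(single_cols.tolist())  # [False, True]
```

Note that the axis numbers in the nested form refer to the already reduced array, which is why the inner and outer axes differ from the tuple.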

Finally, we use np.insert to add True at the beginning of each array, so that the first row and the first column are always kept:

>>> print(np.insert(np.any(np.any(np.diff(a, axis=0).astype(bool), axis=1), axis=1), 0, True))
[ True False  True False  True False]
>>> print(np.insert(np.any(np.any(np.diff(a, axis=1).astype(bool), axis=0), axis=1), 0, True))
[ True False  True False False]
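
The reason for the extra True is that N rows only have N - 1 borders between them, so the "changed" array is one element short. A small sketch with a hand-written changed array (taken from the 6-row example, but typed in by hand here):

```python
import numpy as np

# 5 borders between 6 rows; True means "this row differs from the one above".
changed = np.array([False, True, False, True, False])

# Prepend True so the very first row is always kept.
keep = np.insert(changed, 0, True)
print(keep.tolist())               # [True, False, True, False, True, False]

# Indices of the rows that survive.
print(np.flatnonzero(keep).tolist())  # [0, 2, 4]

# An equivalent spelling, if you prefer concatenation.
keep_alt = np.concatenate(([True], changed))
print(np.array_equal(keep, keep_alt))  # True
```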

Now we use these masks with np.ix_ to get the desired result:

columns_mask = np.insert(np.any(np.any(np.diff(a, axis=0).astype(bool), axis=1), axis=1), 0, True)
rows_mask = np.insert(np.any(np.any(np.diff(a, axis=1).astype(bool), axis=0), axis=1), 0, True)

>>> print(a[np.ix_(columns_mask, rows_mask)])
[[[1 2 3]
  [2 4 7]]

 [[1 2 3]
  [3 4 7]]

 [[6 4 3]
  [0 1 7]]]
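
It may not be obvious why np.ix_ is needed instead of passing the two masks directly. Here is a small sketch on a made-up 3x4 array (x, rows and cols are hypothetical names) that shows the difference.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
rows = np.array([True, False, True])
cols = np.array([True, False, True, False])

# np.ix_ turns the masks into index arrays shaped for an "outer" selection.
r, c = np.ix_(rows, cols)
print(r.shape, c.shape)  # (2, 1) (1, 2)

# Every kept row crossed with every kept column.
outer = x[np.ix_(rows, cols)]
print(outer.tolist())    # [[0, 2], [8, 10]]

# Equivalent two-step form: filter rows first, then columns.
two_step = x[rows][:, cols]
print(np.array_equal(outer, two_step))  # True

# Passing both masks directly pairs the indices up instead: (0, 0) and (2, 2).
paired = x[rows, cols]
print(paired.tolist())   # [0, 10]
```

So the direct form picks individual elements, while np.ix_ keeps the full grid of selected rows and columns, which is what we want for the blocks.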

This is it!

We can check that it is correct for each colour channel separately:

>>> print(a[np.ix_(columns_mask, rows_mask)][:, :, 0])  # R
[[1 2]
 [1 3]
 [6 0]]
>>> print(a[np.ix_(columns_mask, rows_mask)][:, :, 1])  # G
[[2 4]
 [2 4]
 [4 1]]
>>> print(a[np.ix_(columns_mask, rows_mask)][:, :, 2])  # B
[[3 7]
 [3 7]
 [3 7]]
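
If you need this in more than one place, the steps can be wrapped in a small function. The sketch below is one possible way to do it; the name collapse_blocks is hypothetical, and it assumes an (H, W, C) array whose blocks line up on a grid, as in the examples above.

```python
import numpy as np

def collapse_blocks(a):
    """Keep one pixel per block of an (H, W, C) array of grid-aligned blocks.

    Hypothetical helper; it follows the same steps as the masks built above.
    """
    a = np.asarray(a)
    # True where a row differs from the row above it in any pixel or channel.
    row_changed = np.any(a[1:] != a[:-1], axis=(1, 2))
    # True where a column differs from the column to its left.
    col_changed = np.any(a[:, 1:] != a[:, :-1], axis=(0, 2))
    # The first row and the first column always start a block.
    rows_mask = np.concatenate(([True], row_changed))
    cols_mask = np.concatenate(([True], col_changed))
    return a[np.ix_(rows_mask, cols_mask)]

# Rebuild the 6x5x3 example from its 3x2 grid of block colours.
colours = np.array([[[1, 2, 3], [2, 4, 7]],
                    [[1, 2, 3], [3, 4, 7]],
                    [[6, 4, 3], [0, 1, 7]]])
a = np.repeat(np.repeat(colours, 2, axis=0), [2, 3], axis=1)

result = collapse_blocks(a)
print(result.shape)  # (3, 2, 3)
```

For a real image you would presumably pass something like np.asarray(img.convert('RGB')). One limitation to keep in mind: a border is only detected where at least one pixel changes, so two neighbouring rows (or columns) of blocks that are identical everywhere would be merged into one.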

solution1
3 ACCEPTED 2018-06-26 17:22:22