Deleting consecutive RGB values from a numpy array

Question

I initially created a subarray from the initial array for a greyscale image, based on these questions: "Deleting consecutive numbers from a numpy array" and "Remove following duplicates in a numpy array".

But now I want to do the same for a coloured image and I'm really confused. I've been working on it for days and simply cannot work out how to approach it.

The problem is that the squares are different sizes, and I want each square to be represented by a single pixel of the same colour.

Coloured image:

[image: coloured image]

My code for the greyscale image:

from PIL import Image
import numpy as np

name1 = input("What is the name of the .png file you want to open? ")

filename1 = "%s.png" % name1

img = Image.open(filename1).convert('L')  # convert image to 8-bit grayscale
WIDTH, HEIGHT = img.size

a = list(img.getdata())  # convert image data to a flat list of integers
# convert that to a 2D array (one row of integers per image row)
a = np.array([a[offset:offset+WIDTH] for offset in range(0, WIDTH*HEIGHT, WIDTH)])

print(" ")
print("Initial array from image:")  # print as array
print(" ")
print(a)

# keep a row/column only where its first value differs from the previous one
rows_mask = np.insert(np.diff(a[:, 0]).astype(bool), 0, True)
columns_mask = np.insert(np.diff(a[0]).astype(bool), 0, True)
b = a[np.ix_(rows_mask, columns_mask)]

print(" ")
print("Subarray from Image:")  # print as array
print(" ")
print(b)

#img = Image.fromarray(b, mode='L')

print(" ")
print("Subarray from Image (clearer format):")  # print as array
print(" ")
for row in b:  # print in a table-like format
    print(' '.join('{:3}'.format(value) for value in row))

#img.save("chocolate.png")


#print(np.mean(b))  # finding mean

For example, for this image:

Input array example:

From a = list(img.getdata()), this is the input I get from the image.

[(115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (115, 45, 135), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (245, 245, 35), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (55, 235, 195), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95), (245, 245, 95)]

The numpy input using a = np.array([a[offset:offset+WIDTH] for offset in range(0, WIDTH*HEIGHT, WIDTH)]):

[[[115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]]

 [[115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]]

 [[115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]]

 [[115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]]

 [[115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [115  45 135]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]
  [245 245  35]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]

 [[ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [ 55 235 195]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]
  [245 245  95]]]
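
As a side note, the same height x width x 3 array can also be built with a single reshape instead of slicing the flat list row by row. The sketch below uses a small made-up pixel list (`pixels`, `WIDTH` and `HEIGHT` here are hypothetical values for illustration, not the real image data):

```python
import numpy as np

# Hypothetical 2x3 image as a flat list of RGB tuples, in the same
# row-major order that img.getdata() produces.
WIDTH, HEIGHT = 3, 2
pixels = [(1, 2, 3), (1, 2, 3), (9, 9, 9),
          (4, 5, 6), (4, 5, 6), (7, 7, 7)]

# Row-by-row slicing, as in the question.
sliced = np.array([pixels[offset:offset + WIDTH]
                   for offset in range(0, WIDTH * HEIGHT, WIDTH)])

# Equivalent single reshape: rows, then columns, then the 3 channels.
reshaped = np.array(pixels).reshape(HEIGHT, WIDTH, 3)

print(sliced.shape)                      # (2, 3, 3)
print(np.array_equal(sliced, reshaped))  # True
```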

Desired output:

[[[115  45 135] [245 245  35]]
 [[ 55 235 195] [245 245  95]]]

Answer 1

This should do it:

columns_mask = np.insert(np.any(np.any(np.diff(a, axis=0).astype(bool), axis=1), axis=1), 0, True)
rows_mask = np.insert(np.any(np.any(np.diff(a, axis=1).astype(bool), axis=0), axis=1), 0, True)
print(a[np.ix_(columns_mask, rows_mask)])

Output:

[[[115  45 135]
  [245 245  35]]

 [[ 55 235 195]
  [245 245  95]]]
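
For reuse, the two lines can be wrapped in a function. This is only a sketch: the name `collapse_blocks` and the `image` array rebuilt from the question's 11x11 example are my own, and it assumes the blocks form a regular grid, so that every block boundary runs across the whole image. Two neighbouring blocks with exactly the same colour would be merged into one.

```python
import numpy as np

def collapse_blocks(a):
    """Keep one pixel per uniformly coloured block of an H x W x C array.

    Assumes the blocks form a grid, i.e. every block boundary runs
    across the whole image.
    """
    a = np.asarray(a)
    # True where a row differs from the row above in any pixel or channel.
    row_changes = np.any(np.diff(a, axis=0) != 0, axis=(1, 2))
    # True where a column differs from the column to its left.
    col_changes = np.any(np.diff(a, axis=1) != 0, axis=(0, 2))
    # The first row and the first column always start a new block.
    keep_rows = np.insert(row_changes, 0, True)
    keep_cols = np.insert(col_changes, 0, True)
    return a[np.ix_(keep_rows, keep_cols)]

# Rebuild the question's example: 5 + 6 rows and 5 + 6 columns.
top = [[115, 45, 135]] * 5 + [[245, 245, 35]] * 6
bottom = [[55, 235, 195]] * 5 + [[245, 245, 95]] * 6
image = np.array([top] * 5 + [bottom] * 6)

result = collapse_blocks(image)
print(result.shape)  # (2, 2, 3)
print(result)
```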

Explanation:
Let's take a more representative example:

a = np.array([[[1, 2, 3],
               [1, 2, 3],
               [2, 4, 7],
               [2, 4, 7],
               [2, 4, 7]], 
              [[1, 2, 3],
               [1, 2, 3],
               [2, 4, 7],
               [2, 4, 7],
               [2, 4, 7]], 
              [[1, 2, 3],
               [1, 2, 3],
               [3, 4, 7],
               [3, 4, 7],
               [3, 4, 7]],
              [[1, 2, 3],
               [1, 2, 3],
               [3, 4, 7],
               [3, 4, 7],
               [3, 4, 7]],
              [[6, 4, 3],
               [6, 4, 3],
               [0, 1, 7],
               [0, 1, 7],
               [0, 1, 7]],
              [[6, 4, 3],
               [6, 4, 3],
               [0, 1, 7],
               [0, 1, 7],
               [0, 1, 7]]])

I chose the dimensions to be 6x5x3 for easier tracking.

For R, G, and B we will have the following subarrays:

>>> print(a[:,:,0])  # R
[[1 1 2 2 2]
 [1 1 2 2 2]
 [1 1 3 3 3]
 [1 1 3 3 3]
 [6 6 0 0 0]
 [6 6 0 0 0]]
>>> print(a[:,:,1])  # G
[[2 2 4 4 4]
 [2 2 4 4 4]
 [2 2 4 4 4]
 [2 2 4 4 4]
 [4 4 1 1 1]
 [4 4 1 1 1]]
>>> print(a[:,:,2])  # B
[[3 3 7 7 7]
 [3 3 7 7 7]
 [3 3 7 7 7]
 [3 3 7 7 7]
 [3 3 7 7 7]
 [3 3 7 7 7]]

Note that in this example we have 6 blocks of different colors, but for some components I chose the same values for the sake of example.
The expected result would be:

# R
[[1 2]
 [1 3]
 [6 0]]
# G
[[2 4]
 [2 4]
 [4 1]]
# B
[[3 7]
 [3 7]
 [3 7]]

First, we calculate diff in order to locate the borders between color squares.
For columns:

>>> print(np.diff(a, axis=0))
[[[ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [ 0  0  0]
  [ 1  0  0]
  [ 1  0  0]
  [ 1  0  0]]

 [[ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 5  2  0]
  [ 5  2  0]
  [-3 -3  0]
  [-3 -3  0]
  [-3 -3  0]]

 [[ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]
  [ 0  0  0]]]

and for rows:

>>> print(np.diff(a, axis=1))
[[[ 0  0  0]
  [ 1  2  4]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [ 1  2  4]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [ 2  2  4]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [ 2  2  4]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [-6 -3  4]
  [ 0  0  0]
  [ 0  0  0]]

 [[ 0  0  0]
  [-6 -3  4]
  [ 0  0  0]
  [ 0  0  0]]]
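
To make the origin of these numbers concrete: np.diff along an axis subtracts each slice from the next one along that axis, so the result is one element shorter in that dimension. A minimal sketch on a smaller, made-up array (`small` is hypothetical and not part of the example above):

```python
import numpy as np

# Hypothetical 3x2x3 array: rows 0 and 1 are identical, row 2 differs.
small = np.array([[[1, 2, 3], [4, 5, 6]],
                  [[1, 2, 3], [4, 5, 6]],
                  [[7, 2, 3], [4, 5, 9]]])

d_rows = np.diff(small, axis=0)  # same as small[1:] - small[:-1]
d_cols = np.diff(small, axis=1)  # same as small[:, 1:] - small[:, :-1]

print(d_rows.shape)  # (2, 2, 3)
print(d_cols.shape)  # (3, 1, 3)
print(np.array_equal(d_rows, small[1:] - small[:-1]))  # True
```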

Look carefully at where these numbers come from.

Next, we use .astype(bool) to convert all non-zero elements to True. We need this to create boolean masks. See the NumPy docs for more information on indexing using boolean arrays.
For columns we get:

>>> print(np.diff(a, axis=0).astype(bool))
[[[False False False]
  [False False False]
  [False False False]
  [False False False]
  [False False False]]

 [[False False False]
  [False False False]
  [ True False False]
  [ True False False]
  [ True False False]]

 [[False False False]
  [False False False]
  [False False False]
  [False False False]
  [False False False]]

 [[ True  True False]
  [ True  True False]
  [ True  True False]
  [ True  True False]
  [ True  True False]]

 [[False False False]
  [False False False]
  [False False False]
  [False False False]
  [False False False]]]

and for rows:

>>> print(np.diff(a, axis=1).astype(bool))
[[[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]

 [[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]

 [[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]

 [[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]

 [[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]

 [[False False False]
  [ True  True  True]
  [False False False]
  [False False False]]]
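
An alternative that is not part of the answer above, shown only as a sketch: comparing neighbouring slices with != produces the same boolean array directly, without the intermediate subtraction. The array `small` is made up for illustration.

```python
import numpy as np

# Hypothetical 3x2x3 array: rows 0 and 1 are identical, row 2 differs.
small = np.array([[[1, 2, 3], [4, 5, 6]],
                  [[1, 2, 3], [4, 5, 6]],
                  [[7, 2, 3], [4, 5, 9]]])

via_diff = np.diff(small, axis=0).astype(bool)
via_compare = small[1:] != small[:-1]  # element-wise "changed?" test

print(np.array_equal(via_diff, via_compare))  # True
```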

Now we apply a logical OR across the pixels of each row for the first array (axis=1), and across the pixels of each column for the second array (axis=0). We need this so that a border is not missed when only some of the pixels change. For example, in the R channel the first two columns keep the value 1 between the second and third rows, while the remaining columns change from 2 to 3.

>>> print(np.any(np.diff(a, axis=0).astype(bool), axis=1))
[[False False False]
 [ True False False]
 [False False False]
 [ True  True False]
 [False False False]]
>>> print(np.any(np.diff(a, axis=1).astype(bool), axis=0))
[[False False False]
 [ True  True  True]
 [False False False]
 [False False False]]

See the following question for details about np.any: "How to operate logic operation of all columns of a 2D numpy array".

Now we perform the same operation across the color channels:

>>> print(np.any(np.any(np.diff(a, axis=0).astype(bool), axis=1), axis=1))
[False  True False  True False]
>>> print(np.any(np.any(np.diff(a, axis=1).astype(bool), axis=0), axis=1))
[False  True False False]
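
The two nested np.any calls can also be collapsed into one, because np.any accepts a tuple of axes. A sketch, rebuilding the 6x5x3 example from its 3x2 grid of block colors (the `blocks` array and the np.repeat construction are my own shorthand for the array typed out above):

```python
import numpy as np

# The six block colors of the example, one entry per block.
blocks = np.array([[[1, 2, 3], [2, 4, 7]],
                   [[1, 2, 3], [3, 4, 7]],
                   [[6, 4, 3], [0, 1, 7]]])
# Every block is 2 rows tall; the block columns are 2 and 3 pixels wide.
a = np.repeat(np.repeat(blocks, 2, axis=0), [2, 3], axis=1)

d0 = np.diff(a, axis=0).astype(bool)
d1 = np.diff(a, axis=1).astype(bool)

# Reduce over the pixel axis and the channel axis in a single call.
row_changes = np.any(d0, axis=(1, 2))
col_changes = np.any(d1, axis=(0, 2))

print(row_changes)  # [False  True False  True False]
print(col_changes)  # [False  True False False]
```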

And finally, we use np.insert to add True at the beginning of the arrays to take the first elements into account:

>>> print(np.insert(np.any(np.any(np.diff(a, axis=0).astype(bool), axis=1), axis=1), 0, True))
[ True False  True False  True False]
>>> print(np.insert(np.any(np.any(np.diff(a, axis=1).astype(bool), axis=0), axis=1), 0, True))
[ True False  True False False]
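
One way to read these masks: each True marks the first row (or column) of a block. The short sketch below starts from the row-change flags computed above, typed in by hand as `changes`, and lists the indices that will be kept:

```python
import numpy as np

# Row-change flags from the previous step, entered by hand.
changes = np.array([False, True, False, True, False])

# Prepend True: the very first row always starts a block.
mask = np.insert(changes, 0, True)

print(mask)                  # [ True False  True False  True False]
print(np.flatnonzero(mask))  # [0 2 4]
```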

And now use these masks with np.ix_ to get the desired result:

columns_mask = np.insert(np.any(np.any(np.diff(a, axis=0).astype(bool), axis=1), axis=1), 0, True)
rows_mask = np.insert(np.any(np.any(np.diff(a, axis=1).astype(bool), axis=0), axis=1), 0, True)

>>> print(a[np.ix_(columns_mask, rows_mask)])
[[[1 2 3]
  [2 4 7]]

 [[1 2 3]
  [3 4 7]]

 [[6 4 3]
  [0 1 7]]]
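
np.ix_ is what makes the two masks select a grid (every kept row combined with every kept column) rather than being paired element by element. As a sketch, the same result can be obtained by indexing in two steps. The names `blocks`, `first_axis_mask` and `second_axis_mask` are my own, and the masks are typed in by hand from the output above:

```python
import numpy as np

# The six block colors of the example, one entry per block.
blocks = np.array([[[1, 2, 3], [2, 4, 7]],
                   [[1, 2, 3], [3, 4, 7]],
                   [[6, 4, 3], [0, 1, 7]]])
# Rebuild the 6x5x3 example: blocks 2 rows tall, 2 and 3 pixels wide.
a = np.repeat(np.repeat(blocks, 2, axis=0), [2, 3], axis=1)

first_axis_mask = np.array([True, False, True, False, True, False])
second_axis_mask = np.array([True, False, True, False, False])

via_ix = a[np.ix_(first_axis_mask, second_axis_mask)]
# Same selection in two steps: filter the first axis, then the second.
via_two_steps = a[first_axis_mask][:, second_axis_mask]

print(np.array_equal(via_ix, via_two_steps))  # True
print(np.array_equal(via_ix, blocks))         # True
```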

That's it!

We can check that it is correct for the separate colors:

>>> print(a[np.ix_(columns_mask, rows_mask)][:, :, 0])  # R
[[1 2]
 [1 3]
 [6 0]]
>>> print(a[np.ix_(columns_mask, rows_mask)][:, :, 1])  # G
[[2 4]
 [2 4]
 [4 1]]
>>> print(a[np.ix_(columns_mask, rows_mask)][:, :, 2])  # B
[[3 7]
 [3 7]
 [3 7]]
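
A further sanity check, which goes beyond the original answer and is only a sketch: if the image really is a grid of uniform blocks, expanding the reduced array back to the original block sizes should reproduce the input exactly. The names `blocks`, `keep_first`, `keep_second`, `heights` and `widths` are hypothetical helpers.

```python
import numpy as np

# The six block colors of the example, one entry per block.
blocks = np.array([[[1, 2, 3], [2, 4, 7]],
                   [[1, 2, 3], [3, 4, 7]],
                   [[6, 4, 3], [0, 1, 7]]])
# Rebuild the 6x5x3 example: blocks 2 rows tall, 2 and 3 pixels wide.
a = np.repeat(np.repeat(blocks, 2, axis=0), [2, 3], axis=1)

keep_first = np.insert(np.any(np.diff(a, axis=0).astype(bool), axis=(1, 2)), 0, True)
keep_second = np.insert(np.any(np.diff(a, axis=1).astype(bool), axis=(0, 2)), 0, True)
reduced = a[np.ix_(keep_first, keep_second)]

# Block sizes are the distances between consecutive block starts.
heights = np.diff(np.append(np.flatnonzero(keep_first), a.shape[0]))
widths = np.diff(np.append(np.flatnonzero(keep_second), a.shape[1]))

# Expand every kept pixel back to the size of its block.
rebuilt = np.repeat(np.repeat(reduced, heights, axis=0), widths, axis=1)

print(heights, widths)             # [2 2 2] [2 3]
print(np.array_equal(rebuilt, a))  # True
```

If the last line prints False, the image is not a clean grid of uniform blocks and the reduced array has lost information.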
