Python Numpy. Delete an element (or elements) in a 2D array if said element is located between a pair of specified elements

I have a 2D NumPy array exclusively filled with 1s and 0s.

a = [[0 0 0 0 1 0 0 0 1]
     [1 1 1 1 1 1 1 1 1]
     [1 1 1 1 1 1 1 1 1]
     [1 1 1 1 0 0 0 0 1]
     [1 1 1 1 1 1 1 1 1]
     [1 1 1 0 1 1 1 1 1]
     [1 1 1 1 1 1 0 0 1]
     [1 1 1 1 1 1 1 1 1]]

To get the location of the 0s I used the following code:

new_array = np.transpose(np.nonzero(a==0))

As expected, I get the following result showing the location of the 0s within the array

new_array = [[0 0]
             [0 1]
             [0 2]
             [0 3]
             [0 5]
             [0 6]
             [0 7]
             [3 4]
             [3 5]
             [3 6]
             [3 7]
             [5 3]
             [6 6]
             [6 7]]
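
For anyone who wants to reproduce this, the array above can be written as runnable code roughly as follows. This is only a sketch of my setup; the name zero_locations is introduced here for illustration, and np.argwhere is used as a shorthand that should be equivalent to np.transpose(np.nonzero(...)).

```python
import numpy as np

# The example array from above, typed out as a proper NumPy array.
a = np.array([
    [0, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
])

# One [row, column] pair per zero, in row-major order.
zero_locations = np.argwhere(a == 0)
print(zero_locations.shape)  # (14, 2)
```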

Now comes my question: Is there a way to get the locations of the 0s at the start and end of each horizontal group if said group is larger than 2?

EDIT: If a group were to finish at the end of a row and continue on the one below it, it would count as 2 separate groups.

My first thought was to implement a process that would delete 0s if they are located in between other 0s, but I was not able to figure out how to do that.

I would like the "new_array" output to be:

new_array = [[0 0]
             [0 3]
             [0 5]
             [0 7]
             [3 4]
             [3 7]
             [5 3]
             [6 6]
             [6 7]]

Thanks beforehand!!

One possible solution that is easier to follow is:

b = np.diff(a, prepend=1, append=1)  # pad each row with a 1 on both sides and
                                     # detect jumps between adjacent columns;
                                     # append=1 also closes a run of 0s that
                                     # reaches the last column
y, x = np.where(b > 0)  # find positions of the jumps 0->1 (left to right)
# shift positive jumps to the left by 1 position while filling gaps with 0:
b[y, x - 1] = 1
b[y, x] = 0
new_array = list(zip(*np.where(b)))  # (row, column) of each run's first and last 0

Another one is:

new_array = list(zip(*np.where(np.diff(a, n=2, prepend=1, append=1) > 0)))

Both solutions rely on np.diff, which by default (axis=-1) computes differences between consecutive elements along the last axis, i.e. between adjacent columns of a 2D array.
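
To see why the second-difference one-liner picks out the ends of each run, it may help to look at a single row. The sketch below uses the first row of the example array and assumes a signed integer dtype (with unsigned or boolean data np.diff behaves differently); the names row, padded, first, second and edges are only for illustration.

```python
import numpy as np

row = np.array([0, 0, 0, 0, 1, 0, 0, 0, 1])

# Pad with a 1 on both sides, which is what prepend=1 and append=1 do.
padded = np.concatenate(([1], row, [1]))

first = np.diff(padded)        # padded[i+1] - padded[i]
second = np.diff(padded, n=2)  # padded[i+2] - 2*padded[i+1] + padded[i]

print(first)   # [-1  0  0  0  1 -1  0  0  1  0]
print(second)  # [ 1  0  0  1 -2  1  0  1 -1]

# second[i] > 0 only where row[i] == 0 and at least one neighbour is 1,
# i.e. at the first or last zero of a run.
edges = np.flatnonzero(second > 0)
print(edges)   # [0 3 5 7]
```

A zero with zeros on both sides gives 0 - 0 + 0 = 0, and a 1 can never give a positive value, so only the boundary zeros survive the > 0 test.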

A flaw in the other solution is that it reports every sequence of zeroes, regardless of its length. Your expected output also contains such short groups, made up of only 1 or 2 zeroes, but given your "larger than 2" condition, in my opinion it shouldn't.

My solution is free of this flaw.

An elegant tool for processing groups of adjacent equal elements is itertools.groupby, so start with:

import itertools
import numpy as np

Then generate your intended result as:

res = []
for rowIdx, row in enumerate(a):
    colIdx = 0  # Start column index
    for k, grp in itertools.groupby(row):
        vals = list(grp)        # Values in the group
        lgth = len(vals)        # Length of the group
        colIdx2 = colIdx + lgth - 1  # End column index
        if k == 0 and lgth > 2: # Record this group
            res.append([rowIdx, colIdx])
            res.append([rowIdx, colIdx2])
        colIdx = colIdx2 + 1    # Advance column index
result = np.array(res)

The result, for your source data, is:

array([[0, 0],
       [0, 3],
       [0, 5],
       [0, 7],
       [3, 4],
       [3, 7]])

As you can see, it doesn't include the shorter sequences of zeroes in rows 5 and 6.
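
If you would rather stay within NumPy, the same length filter can probably be expressed without a Python loop. The following is only a sketch of that idea, not a tested drop-in replacement: the function name zero_run_ends and its min_len parameter are made up for this example, and it assumes a 2D array of 0s and 1s.

```python
import numpy as np

def zero_run_ends(a, min_len=3):
    """Return [row, col] pairs for the first and last zero of every
    horizontal run of zeros that has at least min_len elements."""
    a = np.asarray(a)
    is_zero = (a == 0).astype(np.int8)
    # Pad each row with a "not zero" column on both sides, so a run never
    # continues from one row into the next.
    d = np.diff(is_zero, axis=1, prepend=0, append=0)
    rows, starts = np.nonzero(d == 1)   # column where a run begins
    _, stops = np.nonzero(d == -1)      # column just past the end of a run
    keep = (stops - starts) >= min_len  # run length = stop - start
    rows, starts, ends = rows[keep], starts[keep], stops[keep] - 1
    # Interleave the positions as start, end, start, end, ...
    cols = np.column_stack((starts, ends)).ravel()
    return np.column_stack((np.repeat(rows, 2), cols))

a = np.array([
    [0, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 1, 0, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0, 1],
])
print(zero_run_ends(a).tolist())  # [[0, 0], [0, 3], [0, 5], [0, 7]]
```

Every run has exactly one start and one stop within its row, and np.nonzero returns both in row-major order, so the k-th start pairs with the k-th stop.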
