Numpy: Insert arbitrary number of zeros into matrix rows at various indices

Problem

I have a 2D array that contains a series of 0's and 1's which represent values that have been bit-packed. I need to insert an arbitrary number of 0's at arbitrary points in every row in order to pad the bit-packed values to a multiple of 8 bits.

I have 3 vectors.

  1. A vector containing indices that I want to insert zeros at
  2. A vector containing the number of zeros that I want to insert at each point from vector 1.
  3. A vector that contains the size of each bit-string I am padding. (Probably don't need this to solve but it could be fun!)

Example

I have a vector that contains indices to insert before: [0 6 14]

and a vector that contains the number of zeroes that I want to insert: [2 0 4]

and a vector that has the size of each bitstring I am padding: [6, 8, 4]
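
As a side note on vector 3: in this example the other two vectors appear to be derivable from it, if each bit-string is left-padded to the next multiple of 8 bits. A minimal sketch under that assumption (the variable names are only illustrative):

```python
import numpy as np

sizes = np.array([6, 8, 4])  # bit-string widths from the example

# Each bit-string starts where the previous one ends.
indices = np.concatenate(([0], np.cumsum(sizes)[:-1]))

# Zeros needed to round each width up to the next multiple of 8.
num_zeros = (-sizes) % 8

print(indices)    # [ 0  6 14]
print(num_zeros)  # [2 0 4]
```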

The aim is to insert the zeroes into each row of the array, like so:

[[0 0 0 0 0 1  0 0 0 0 0 0 0 1  0 0 0 1]
 [0 0 0 0 0 1  0 0 0 0 0 0 1 0  0 0 0 1]
 [0 0 0 0 1 0  0 0 0 0 0 0 1 0  0 0 1 0]
 [0 0 0 0 1 1  0 0 0 0 0 1 0 0  0 0 1 1]
 [0 0 0 1 0 0  0 0 0 0 0 1 0 0  0 1 0 0]
 [0 0 0 1 0 1  0 0 0 0 0 1 1 0  0 1 0 1]
 [0 0 0 1 1 0  0 0 0 0 0 1 1 0  0 1 1 0]
 [0 0 0 1 1 1  0 0 0 0 1 0 0 0  0 1 1 1]
 [0 0 1 0 0 0  0 0 0 0 1 0 0 0  1 0 0 0]
 [1 1 0 0 1 0  1 1 1 1 1 1 1 1  1 0 0 1]]

*Spaces added between columns to highlight insertion points.

Becomes:

  | |                               | | | |
  v v                               v v v v
[[0 0 0 0 0 0 0 1  0 0 0 0 0 0 0 1  0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 0 1  0 0 0 0 0 0 1 0  0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0  0 0 0 0 0 0 1 0  0 0 0 0 0 0 1 0]
 [0 0 0 0 0 0 1 1  0 0 0 0 0 1 0 0  0 0 0 0 0 0 1 1]
 [0 0 0 0 0 1 0 0  0 0 0 0 0 1 0 0  0 0 0 0 0 1 0 0]
 [0 0 0 0 0 1 0 1  0 0 0 0 0 1 1 0  0 0 0 0 0 1 0 1]
 [0 0 0 0 0 1 1 0  0 0 0 0 0 1 1 0  0 0 0 0 0 1 1 0]
 [0 0 0 0 0 1 1 1  0 0 0 0 1 0 0 0  0 0 0 0 0 1 1 1]
 [0 0 0 0 1 0 0 0  0 0 0 0 1 0 0 0  0 0 0 0 1 0 0 0]
 [0 0 1 1 0 0 1 0  1 1 1 1 1 1 1 1  0 0 0 0 1 0 0 1]]

*Arrows denote inserted 0's

I am looking for the most performant way of doing this. All of the vectors/arrays are numpy arrays. I've looked into using numpy.insert, but that doesn't seem to have the ability to insert multiple values at a given index. I've also thought about using numpy.hstack and then flattening, but was unable to get the result I wanted.

Any help is greatly appreciated!

Formatted the matrix for you (although it might be easier to work with a contrived example):

import numpy as np

matrix = np.array([[0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
                   [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1],
                   [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0],
                   [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1],
                   [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
                   [0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1],
                   [0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0],
                   [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1],
                   [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
                   [1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1]])
indices = np.array([0, 6, 14])
num_zeros = np.array([2, 0, 4])
pad = np.array([6, 8, 4])

You need to allocate a new array to do this operation. Creating zero-filled arrays in numpy is very cheap, so let's start by allocating a zero-filled array with our desired output shape:

out_shape = np.array(matrix.shape)
out_shape[1] += num_zeros.sum()
zeros = np.zeros(out_shape, dtype=matrix.dtype)

Now, write matrix to contiguous blocks of columns in zeros by using slices:

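keep = num_zeros != 0  # throw away 0 slices
# interleave [gap to next insertion point, number of zeros], then accumulate
meta = np.stack([np.diff(indices[keep], prepend=0), num_zeros[keep]])
# column boundaries in zeros; each (even, odd) pair delimits one data block
slices = np.concatenate(([0], meta.T.ravel().cumsum(), [zeros.shape[1]]))

prev = 0  # next unread column of matrix
for start, end in zip(slices[0::2], slices[1::2]):
    zeros[:, start:end] = matrix[:, prev:prev + end - start]
    prev += end - start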

Output in zeros:

[[0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0]
 [0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1]
 [0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0]
 [0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 1]
 [0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0]
 [0 0 0 0 0 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1]
 [0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0]
 [0 0 1 1 0 0 1 0 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 1]]
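
The same copy can also be written without a Python loop by computing, for every source column, how far it is shifted to the right. This is a sketch of an alternative rather than the code above; `csum`, `shift` and `dest` are illustrative names, `indices` is assumed to be sorted in ascending order, and a 2-row array stands in for the full matrix:

```python
import numpy as np

matrix = np.array([[0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
                   [1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1]])
indices = np.array([0, 6, 14])    # assumed sorted ascending
num_zeros = np.array([2, 0, 4])

cols = np.arange(matrix.shape[1])
# Zeros inserted at or before each source column = that column's right shift.
csum = np.concatenate(([0], np.cumsum(num_zeros)))
shift = csum[np.searchsorted(indices, cols, side="right")]
dest = cols + shift

out = np.zeros((matrix.shape[0], matrix.shape[1] + num_zeros.sum()),
               dtype=matrix.dtype)
out[:, dest] = matrix   # scatter every source column to its new position

print(out[1])  # [0 0 1 1 0 0 1 0 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 1]
```

Fancy indexing goes through an index array, so with only a few wide blocks the slice-based loop may well be just as fast; it is worth timing both on real data.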

np.insert does support inserting multiple values at the same index, you just have to provide that index multiple times. So you can obtain your desired result as follows:

indices = np.array([0, 6, 14])
n_zeros = np.array([2, 0, 4])

result = np.insert(matrix,
                   np.repeat(indices, n_zeros),
                   0,
                   axis=1)
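
To see why this works, it may help to look at a tiny array first. The values below are made up for illustration:

```python
import numpy as np

row = np.array([[1, 2, 3, 4]])
indices = np.array([0, 2])
n_zeros = np.array([2, 1])

# Each index is repeated once per zero to insert before it.
positions = np.repeat(indices, n_zeros)
print(positions)  # [0 0 2]

result = np.insert(row, positions, 0, axis=1)
print(result)     # [[0 0 1 2 0 3 4]]
```

An index whose count is 0 simply drops out of the np.repeat result, which is how the [2 0 4] case from the question is handled.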

My approach would be to create a zero array up front and copy the columns into the correct locations. The indexing is a little hairy with respect to clarity, so there is probably room for improvement there.


import numpy as np

data = np.array(
  [[0, 0, 0, 0, 0, 1,  0, 0, 0, 0, 0, 0, 0, 1,  0, 0, 0, 1],
   [0, 0, 0, 0, 0, 1,  0, 0, 0, 0, 0, 0, 1, 0,  0, 0, 0, 1],
   [0, 0, 0, 0, 1, 0,  0, 0, 0, 0, 0, 0, 1, 0,  0, 0, 1, 0],
   [0, 0, 0, 0, 1, 1,  0, 0, 0, 0, 0, 1, 0, 0,  0, 0, 1, 1],
   [0, 0, 0, 1, 0, 0,  0, 0, 0, 0, 0, 1, 0, 0,  0, 1, 0, 0],
   [0, 0, 0, 1, 0, 1,  0, 0, 0, 0, 0, 1, 1, 0,  0, 1, 0, 1],
   [0, 0, 0, 1, 1, 0,  0, 0, 0, 0, 0, 1, 1, 0,  0, 1, 1, 0],
   [0, 0, 0, 1, 1, 1,  0, 0, 0, 0, 1, 0, 0, 0,  0, 1, 1, 1],
   [0, 0, 1, 0, 0, 0,  0, 0, 0, 0, 1, 0, 0, 0,  1, 0, 0, 0],
   [1, 1, 0, 0, 1, 0,  1, 1, 1, 1, 1, 1, 1, 1,  1, 0, 0, 1]])
insert_before = [0, 6, 14]
zero_pads = [2, 0, 4]

# one 8-bit slot per bit-string, pre-filled with zeros
res = np.zeros((len(data), 8*len(zero_pads)), dtype=int)

for i in range(len(zero_pads)):
    # copy the 8 - zero_pads[i] data bits into the tail of slot i
    res[:, i*8+zero_pads[i]:(i+1)*8] = data[:, insert_before[i]:insert_before[i]+8-zero_pads[i]]

>>> print(res)
[[0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0]
 [0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1]
 [0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0]
 [0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 1]
 [0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0]
 [0 0 0 0 0 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1]
 [0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0]
 [0 0 1 1 0 0 1 0 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 1]]
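
Since the question asks about performance, it is probably worth timing the approaches on data shaped like the real workload. A rough harness, assuming a random bit matrix with 10,000 rows (the size, the seed and the function names are arbitrary choices for illustration); it prints timings and makes no claim about which approach wins:

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(10_000, 18))
insert_before = np.array([0, 6, 14])
zero_pads = np.array([2, 0, 4])

def with_insert():
    return np.insert(data, np.repeat(insert_before, zero_pads), 0, axis=1)

def with_prealloc():
    # Byte-aligned copy: each bit-string lands in the tail of its own 8-bit slot.
    res = np.zeros((len(data), 8 * len(zero_pads)), dtype=data.dtype)
    for i, (start, pad) in enumerate(zip(insert_before, zero_pads)):
        res[:, i * 8 + pad:(i + 1) * 8] = data[:, start:start + 8 - pad]
    return res

# Both approaches should agree before their timings are compared.
assert np.array_equal(with_insert(), with_prealloc())
for fn in (with_insert, with_prealloc):
    print(fn.__name__, timeit.timeit(fn, number=20))
```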
