
Numpy, sort rows of a matrix putting zeros first and not modifying the rest of the row

I have a matrix in numpy, an NxM ndarray, that looks like the following one:

[
  [ 0, 5, 11, 22, 0, 0, 11, 22], 
  [ 1, 4, 11, 20, 0, 4, 11, 20], 
  [ 1, 6, 11, 22, 0, 1, 11, 22], 
  [ 4, 7, 12, 21, 0, 4, 12, 21], 
  [ 5, 7, 12, 22, 0, 7, 12, 22], 
  [ 5, 7, 12, 22, 0, 5, 12, 22]
]

I would like to sort it by rows putting the zeros in each row first without changing the order of the other elements along the row.

My desired output is the following:

[
  [ 0, 0, 0, 5, 11, 22, 11, 22], 
  [ 0, 1, 4, 11, 20, 4, 11, 20], 
  [ 0, 1, 6, 11, 22, 1, 11, 22], 
  [ 0, 4, 7, 12, 21, 4, 12, 21], 
  [ 0, 5, 7, 12, 22, 7, 12, 22], 
  [ 0, 5, 7, 12, 22, 5, 12, 22]
]

For efficiency reasons I am required to do it using numpy (so switching to Python's regular nested lists and doing calculations on them is discouraged). The faster the code, the better.

How could I do that?

Best, Andrea

Is a loop over rows allowed?

>>> a
array([[ 0,  5, 11, 22,  0,  0, 11, 22],
       [ 1,  4, 11, 20,  0,  4, 11, 20],
       [ 1,  6, 11, 22,  0,  1, 11, 22],
       [ 4,  7, 12, 21,  0,  4, 12, 21],
       [ 5,  7, 12, 22,  0,  7, 12, 22],
       [ 5,  7, 12, 22,  0,  5, 12, 22]])
>>> for row in a:
...     row[:] = np.r_[row[row == 0], row[row != 0]]
...     
>>> a
array([[ 0,  0,  0,  5, 11, 22, 11, 22],
       [ 0,  1,  4, 11, 20,  4, 11, 20],
       [ 0,  1,  6, 11, 22,  1, 11, 22],
       [ 0,  4,  7, 12, 21,  4, 12, 21],
       [ 0,  5,  7, 12, 22,  7, 12, 22],
       [ 0,  5,  7, 12, 22,  5, 12, 22]])

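This approach builds an array marking where your array is zero and where it is nonzero, gets the sort index for that along each row, then applies that index to the original array. The sort has to be stable, otherwise the nonzero elements are not guaranteed to keep their original order.

You'll need an index array as big as your to-be-sorted array, but since it's all numpy operations it might be faster than looping.

# 0 where the element is zero, 1 otherwise (a != 0 also handles negative values)
ind = (a != 0).astype(int)
# a stable sort keeps the nonzero elements in their original order
ind = ind.argsort(axis=1, kind='stable')
a = a[np.arange(ind.shape[0])[:,None], ind]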

output:

>>> a
array([[ 0,  0,  0,  5, 11, 22, 11, 22],
       [ 0,  1,  4, 11, 20,  4, 11, 20],
       [ 0,  1,  6, 11, 22,  1, 11, 22],
       [ 0,  4,  7, 12, 21,  4, 12, 21],
       [ 0,  5,  7, 12, 22,  7, 12, 22],
       [ 0,  5,  7, 12, 22,  5, 12, 22]])
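
A variation on the argsort idea, as a minimal sketch rather than a definitive implementation: the default sort kind of argsort is not guaranteed to be stable, so the version below requests kind="stable" explicitly and uses np.take_along_axis to apply the index. The function name zeros_first_argsort is made up for illustration, and the test array is just the first two rows of the question's matrix.

```python
import numpy as np

def zeros_first_argsort(a):
    # Hypothetical helper name, not part of NumPy.
    # False (zero) sorts before True (nonzero); the stable sort keeps
    # equal keys in their original left-to-right order within each row.
    order = np.argsort(a != 0, axis=1, kind="stable")
    # Apply the per-row ordering to the original values.
    return np.take_along_axis(a, order, axis=1)

a = np.array([[0, 5, 11, 22, 0, 0, 11, 22],
              [1, 4, 11, 20, 0, 4, 11, 20]])
result = zeros_first_argsort(a)
print(result)
```

Comparing with a != 0 rather than a > 0 means negative entries are treated as ordinary values and stay in place relative to the other nonzeros.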

This may not be the most efficient solution since it loops over the rows, but it may be a good starting point:

import numpy as np

a = np.array([[ 0,  5, 11, 22,  0,  0, 11, 22],
             [ 1,  4, 11, 20,  0,  4, 11, 20],
             [ 1,  6, 11, 22,  0,  1, 11, 22],
             [ 4,  7, 12, 21,  0,  4, 12, 21],
             [ 5,  7, 12, 22,  0,  7, 12, 22],
             [ 5,  7, 12, 22,  0,  5, 12, 22]])

size = a.shape[1]

for line in a:
    # nonzero entries of this row, in their original order
    nz = line[np.nonzero(line)]
    # zeros needed to pad the row back to its full width
    z = np.zeros(size - nz.shape[0], dtype=a.dtype)
    line[:] = np.concatenate((z, nz))

For each line in a, you pick out the nonzero elements and prepend enough zeros to restore the original row length.

It is possible to get rid of all the Python looping by building a boolean mask with the help of np.tile and np.repeat, although you will have to time it on a larger example to see whether it is worth the extra complexity:

rows, cols = a.shape
mask = a != 0
nonzeros_per_row = mask.sum(axis=1)
repeats = np.column_stack((cols-nonzeros_per_row, nonzeros_per_row)).ravel()
new_mask = np.repeat(np.tile([False, True], rows), repeats).reshape(rows, cols)
out = np.zeros_like(a)
out[new_mask] = a[mask]

>>> a
array([[ 0,  5, 11, 22,  0,  0, 11, 22],
       [ 1,  4, 11, 20,  0,  4, 11, 20],
       [ 1,  6, 11, 22,  0,  1, 11, 22],
       [ 4,  7, 12, 21,  0,  4, 12, 21],
       [ 5,  7, 12, 22,  0,  7, 12, 22],
       [ 5,  7, 12, 22,  0,  5, 12, 22]])
>>> out
array([[ 0,  0,  0,  5, 11, 22, 11, 22],
       [ 0,  1,  4, 11, 20,  4, 11, 20],
       [ 0,  1,  6, 11, 22,  1, 11, 22],
       [ 0,  4,  7, 12, 21,  4, 12, 21],
       [ 0,  5,  7, 12, 22,  7, 12, 22],
       [ 0,  5,  7, 12, 22,  5, 12, 22]])
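
To decide whether the mask approach is worth it, it helps to first check that it agrees with the row loop on a larger input. The sketch below does that under a few assumptions: the function names zeros_first_mask and zeros_first_loop are made up, the random test data is arbitrary, and the target mask is built with a broadcast comparison against the per-row zero count instead of np.tile and np.repeat (a swapped-in but equivalent way to mark the trailing positions of each row).

```python
import numpy as np

def zeros_first_mask(a):
    # Hypothetical helper: loop-free variant using boolean masks.
    mask = a != 0
    n_zeros = a.shape[1] - mask.sum(axis=1)
    # In row i, columns j >= n_zeros[i] receive the nonzero values.
    new_mask = np.arange(a.shape[1]) >= n_zeros[:, None]
    out = np.zeros_like(a)
    # Both masks select the same number of cells per row, in row-major order.
    out[new_mask] = a[mask]
    return out

def zeros_first_loop(a):
    # Hypothetical helper: reference implementation looping over rows.
    out = a.copy()
    for row in out:
        row[:] = np.r_[row[row == 0], row[row != 0]]
    return out

# Arbitrary test data: values 0..3, so zeros are frequent.
rng = np.random.default_rng(0)
big = rng.integers(0, 4, size=(1000, 50))
same = np.array_equal(zeros_first_mask(big), zeros_first_loop(big))
print(same)
```

Once the two agree, timing them with timeit on arrays of your real size will tell you which one to keep.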
