Removing rows and columns of a sparse matrix

Question

I have a large sparse matrix (F) and want to filter it to reduce its size. I want to remove all rows and columns that meet the following criteria:

Delete rows whose sum is < 50
Delete columns whose sum is < 150

I tried the code below, which I thought would work with boolean indexing, but I get a dimensionality error:

F = F[F.sum(axis=1)>=50][F.sum(axis=0)>=150]

IndexError                                Traceback (most recent call last)
<ipython-input-9-e13bff9f8066> in <module>()
----> 1 F = F[F.sum(axis=1)>=50][F.sum(axis=0)>=150]
/usr/local/lib/python3.7/dist-packages/numpy/matrixlib/defmatrix.py in __getitem__(self, index)
    191 
    192         try:
--> 193             out = N.ndarray.__getitem__(self, index)
    194         finally:
    195             self._getitem = False

IndexError: too many indices for array: array is 2-dimensional, but 3 were indexed

I am still learning Python, so I'd appreciate any help with an explanation!

Answer 1

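For a sparse matrix, .sum(axis=...) returns an np.matrix instead of an ndarray, and an np.matrix is always 2-D. The comparison therefore produces a 2-D boolean mask of shape (n, 1) or (1, m) rather than the 1-D mask you were probably expecting, and it does not act as a plain row or column selector when used as an index.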
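To see the shape difference for yourself, here is a small sketch. The matrix values are made up, and it assumes F is a scipy.sparse csr_matrix, which the question does not state. It uses np.asarray(...).ravel() as one way to get a flat 1-D array.

```python
import numpy as np
from scipy import sparse

# Tiny made-up matrix, only for illustration (assumed CSR format).
F = sparse.csr_matrix(np.array([[1, 2, 3],
                                [4, 5, 6]]))

# Summing along an axis of a sparse matrix gives a 2-D result.
row_sums = F.sum(axis=1)
print(type(row_sums).__name__, row_sums.shape)

# Converting to a plain ndarray and flattening gives the 1-D vector
# that boolean row selection expects.
flat_row_sums = np.asarray(row_sums).ravel()
print(flat_row_sums.shape)
```

The 2-D shape is what carries over into the mask once you compare the sums against a threshold.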

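Turn each mask into a flat 1-D array, then index the rows and the columns in two separate steps. Putting both masks inside a single pair of brackets pairs the selected row and column indices element by element, as in NumPy, instead of keeping every selected row and every selected column:

F = F[(F.sum(axis=1)>=50).A.flatten()][:, (F.sum(axis=0)>=150).A.flatten()]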
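For a self-contained check, here is a minimal sketch of the same filtering on a tiny made-up matrix. The helper name filter_rows_cols and the sample values are hypothetical, and the conversion to CSR is an assumption, since the question does not say which sparse format F uses. It swaps in np.asarray(...).ravel() for .A.flatten() and applies the row mask and the column mask in two separate indexing steps.

```python
import numpy as np
from scipy import sparse


def filter_rows_cols(F, min_row_sum, min_col_sum):
    """Keep rows with sum >= min_row_sum and columns with sum >= min_col_sum.

    Both sums are computed on the original matrix, before anything is removed.
    (Hypothetical helper, not part of SciPy.)
    """
    F = sparse.csr_matrix(F)  # CSR supports row and column indexing
    # Flatten the 2-D sums into 1-D boolean masks.
    row_mask = np.asarray(F.sum(axis=1)).ravel() >= min_row_sum
    col_mask = np.asarray(F.sum(axis=0)).ravel() >= min_col_sum
    # Select rows first, then columns.
    return F[row_mask][:, col_mask]


# Made-up data: row sums are 160, 15, 170, 2 and column sums are 150, 7, 0, 190.
dense = np.array([[60, 0, 0, 100],
                  [0, 5, 0, 10],
                  [90, 0, 0, 80],
                  [0, 2, 0, 0]])

result = filter_rows_cols(sparse.csr_matrix(dense), 50, 150)
print(result.toarray())
```

With thresholds 50 and 150, rows 0 and 2 and columns 0 and 3 survive. If you would rather recompute the column sums after the rows have been dropped, compute col_mask from the row-filtered matrix instead.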

solution1
0 2021-05-03 13:43:05
