
Removing rows and columns of a sparse matrix

I have a large sparse matrix (F) and want to filter it to reduce its size a little more. I want to remove all rows and columns that meet the following criteria:

  1. Delete rows whose sum is < 50
  2. Delete columns whose sum is < 150

I tried the code below, which I thought would work with indexing, but I get a dimensionality error:

F = F[F.sum(axis=1)>=50][F.sum(axis=0)>=150]
IndexError                                Traceback (most recent call last)
<ipython-input-9-e13bff9f8066> in <module>()
----> 1 F = F[F.sum(axis=1)>=50][F.sum(axis=0)>=150]
/usr/local/lib/python3.7/dist-packages/numpy/matrixlib/defmatrix.py in __getitem__(self, index)
    191 
    192         try:
--> 193             out = N.ndarray.__getitem__(self, index)
    194         finally:
    195             self._getitem = False

IndexError: too many indices for array: array is 2-dimensional, but 3 were indexed

I am still learning Python, so I would appreciate any help with an explanation!

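A sparse matrix's .sum() returns an np.matrix instead of an ndarray, and an np.matrix behaves unexpectedly when used as an index: it is always 2-D (shape (n, 1) for axis=1 and (1, m) for axis=0) instead of the 1-D array you were probably expecting. That extra dimension is what triggers the "too many indices" error.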
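To see the shape problem directly, here is a small sketch. It assumes F is a scipy.sparse.csr_matrix (the question does not say which sparse format is used), and the 3x3 values are made up purely for illustration.

```python
import numpy as np
from scipy import sparse

# Made-up 3x3 matrix, used only for illustration (assumption: CSR format).
F = sparse.csr_matrix(np.array([[60, 0, 100],
                                [0,  5,  10],
                                [40, 0,  90]]))

row_sums = F.sum(axis=1)        # np.matrix of shape (3, 1), not (3,)
row_mask_2d = row_sums >= 50    # comparison result is still a 2-D np.matrix

# Convert to a plain 1-D boolean ndarray before using it as an index.
row_mask_1d = np.asarray(row_mask_2d).ravel()

print(type(row_sums), row_sums.shape)
print(row_mask_1d.shape)
```

np.asarray(...).ravel() is used here instead of .A.flatten() only because it also works when the sum is already an ndarray; both give a 1-D mask for an np.matrix.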

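Turn each mask into a 1-D array and index with that. Note that passing both masks in a single index, F[rows, cols], pairs them element by element (as in NumPy), which fails unless both selections have the same length, so apply them one axis at a time:

rows = (F.sum(axis=1)>=50).A.flatten()
cols = (F.sum(axis=0)>=150).A.flatten()
F = F[rows][:, cols]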
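Putting the pieces together, the filtering could be sketched as below. The helper name filter_sparse and its parameters are hypothetical, the input is assumed to be convertible to a scipy.sparse.csr_matrix, and the demo values are invented. Both sums are computed on the original matrix, and the masks are applied one axis at a time so that rows and columns are selected independently.

```python
import numpy as np
from scipy import sparse

def filter_sparse(F, min_row_sum=50, min_col_sum=150):
    """Keep rows with sum >= min_row_sum and columns with sum >= min_col_sum.

    Hypothetical helper: both sums are taken on the original matrix,
    before anything is removed.
    """
    F = sparse.csr_matrix(F)  # assumption: CSR is an acceptable format
    # .sum() returns a 2-D np.matrix; flatten each mask to a 1-D ndarray.
    row_mask = np.asarray(F.sum(axis=1)).ravel() >= min_row_sum
    col_mask = np.asarray(F.sum(axis=0)).ravel() >= min_col_sum
    # Select rows first, then columns, so the masks are not paired elementwise.
    return F[row_mask][:, col_mask]

# Made-up demo data: row sums are 160, 15, 130; column sums are 100, 5, 200.
demo = sparse.csr_matrix(np.array([[60, 0, 100],
                                   [0,  5,  10],
                                   [40, 0,  90]]))
filtered = filter_sparse(demo)
print(filtered.toarray())
```

If the column sums should instead be recomputed after the low-sum rows are dropped, the second mask would need to be built from F[row_mask] rather than from F; the question does not specify which is intended.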
