Fast NMF in R on sparse matrices

I'm looking for a fast NMF implementation for sparse matrices in R.

The R NMF package consists of a number of algorithms, none of which impress in terms of computational time.

NNLM::nnmf() seems to be the state of the art in R at the moment, specifically with method = "scd" and loss = "mse", which implements alternating least squares solved by sequential coordinate descent. However, this method is quite slow on very large, very sparse matrices.
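To make the approach concrete, here is a minimal base-R sketch of alternating least squares where each non-negative least squares subproblem is solved by sequential coordinate descent. This is not NNLM's code; the names scd_update and nmf_scd, the fixed number of sweeps, and the random initialization are my own assumptions for illustration, and A is treated as a small dense matrix.

```r
# Coordinate descent for min ||A - W H||_F^2 over H >= 0 with W fixed.
# G = t(W) %*% W and B = t(W) %*% A; each row of H is updated in turn,
# vectorized across the (independent) columns of H.
scd_update <- function(G, B, H, sweeps = 10) {
  for (s in seq_len(sweeps)) {
    for (i in seq_len(nrow(H))) {
      if (G[i, i] > 0) {
        resid <- B[i, ] - drop(G[i, , drop = FALSE] %*% H)
        H[i, ] <- pmax(0, H[i, ] + resid / G[i, i])
      }
    }
  }
  H
}

# Hypothetical driver: alternate between updating H and W.
nmf_scd <- function(A, k, maxit = 50, seed = 1) {
  set.seed(seed)
  W <- matrix(runif(nrow(A) * k), nrow(A), k)
  H <- matrix(runif(k * ncol(A)), k, ncol(A))
  for (it in seq_len(maxit)) {
    # Update H given W.
    H <- scd_update(crossprod(W), crossprod(W, A), H)
    # Update W given H by solving the transposed problem.
    W <- t(scd_update(tcrossprod(H), tcrossprod(H, A), t(W)))
  }
  list(W = W, H = H)
}

# Toy example: an exactly rank-3 non-negative matrix.
set.seed(42)
A <- matrix(runif(30 * 3), 30, 3) %*% matrix(runif(3 * 20), 3, 20)
fit <- nmf_scd(A, k = 3)
rel_err <- norm(A - fit$W %*% fit$H, "F") / norm(A, "F")
```

Note that A only enters through the two cross products, which is where a sparse representation would pay off; the coordinate descent loop itself works on small k-by-k and k-by-n matrices.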

The rsparse::WRMF function is extremely fast, but only because it uses just the positive values in A for the row-wise computation of W and H.

Is there any reasonable implementation for solving NMF on a sparse matrix?

Is there an equivalent to scikit-learn in R? See this question

There are various worker functions in R, such as fnnls and tsnnls, but none of them surpass nnls::nnls (written in Fortran). I have been unable to build any of these functions into a faster NMF framework.

Forgot I even posted this question, but one year later...

I wrote a very fast implementation of NMF in RcppEigen, see the RcppML R package on CRAN.

install.packages("RcppML")

# for the development version
devtools::install_github("zdebruine/RcppML")

?RcppML::nmf

It's at least an order of magnitude faster than NNLM::nnmf. For comparison, the runtime of RcppML::nmf rivals that of the irlba::irlba SVD (although that is an altogether different algorithm).

I've successfully applied my implementation to a 96% sparse matrix of 1.3 million single cells by 26,000 genes, computing a rank-100 factorization in 1 minute. I think that's very reasonable.
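For anyone who wants to try this on a small scale, here is a rough sketch that builds a 96% sparse, non-negative matrix with the Matrix package and factorizes it. The dimensions, the rank of 10, and the random values are arbitrary assumptions, far smaller than the data set above, and the return value of RcppML::nmf differs between package versions, so check ?RcppML::nmf before reading components out of it.

```r
library(Matrix)

set.seed(1)
# Toy stand-in for a genes-by-cells matrix: 4% of entries are non-zero,
# with strictly positive values so the matrix is valid input for NMF.
A <- rsparsematrix(200, 50, density = 0.04,
                   rand.x = function(n) runif(n, 0.1, 1))
sparsity <- 1 - nnzero(A) / prod(dim(A))

# Only run the factorization if RcppML is installed.
if (requireNamespace("RcppML", quietly = TRUE)) {
  model <- RcppML::nmf(A, k = 10)
}
```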
