Using R and Rcpp, how to multiply two matrices that are in sparse Matrix::csr/csc format?
The following code works as expected:
matrix.cpp
// [[Rcpp::depends(RcppEigen)]]
#include <RcppEigen.h>
// [[Rcpp::export]]
SEXP eigenMatTrans(Eigen::MatrixXd A){
Eigen::MatrixXd C = A.transpose();
return Rcpp::wrap(C);
}
// [[Rcpp::export]]
SEXP eigenMatMult(Eigen::MatrixXd A, Eigen::MatrixXd B){
Eigen::MatrixXd C = A * B;
return Rcpp::wrap(C);
}
// [[Rcpp::export]]
SEXP eigenMapMatMult(const Eigen::Map<Eigen::MatrixXd> A, Eigen::Map<Eigen::MatrixXd> B){
Eigen::MatrixXd C = A * B;
return Rcpp::wrap(C);
}
This uses the C++ Eigen class for matrices; see https://eigen.tuxfamily.org/dox
In R, I can access those functions.
library(Rcpp);
Rcpp::sourceCpp('matrix.cpp');
A <- matrix(rnorm(10000), 100, 100);
B <- matrix(rnorm(10000), 100, 100);
library(microbenchmark);
microbenchmark(eigenMatTrans(A), t(A), A%*%B, eigenMatMult(A, B), eigenMapMatMult(A, B))
This shows that R performs pretty well on re-sorting (transpose). Multiplying has some advantages with Eigen.
Using the Matrix library, I can convert a normal matrix to a sparse matrix.
Example from https://cmdlinetips.com/2019/05/introduction-to-sparse-matrices-in-r/
library(Matrix);
data<- rnorm(1e6)
zero_index <- sample(1e6)[1:9e5]
data[zero_index] <- 0
A = matrix(data, ncol=1000)
A.csr = as(A, "dgRMatrix");
B.csr = t(A.csr);
A.csc = as(A, "dgCMatrix");
B.csc = t(A.csc);
So if I wanted to multiply A.csr times B.csr using Eigen, how would I do that in C++? I do not want to convert types if I don't have to; it is a memory size issue.
The A.csr %*% B.csr product is not yet implemented. The A.csc %*% B.csc product works.
I would like to microbenchmark the different options and see which one is most efficient as the matrix size grows. In the end, I will have a matrix that is about 1% sparse and has 5 million rows and columns...
There's a reason that dgRMatrix crossproduct functions are not yet implemented; in fact, they should not be implemented, because they would otherwise enable bad practice.
There are a few performance considerations when working with sparse matrices:
There may be applications where row-major ordering shines (e.g., see the work by Dmitry Selivanov on CSR matrices and irlba svd), but this is absolutely not one of them; so much so, in fact, that you are better off doing an in-place conversion to get a CSC matrix.
tl;dr: column-wise cross-product in row-major matrices is the ultimate in inefficiency.
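To illustrate why the conversion can be cheap, here is a minimal sketch in plain C++ (standard library only, no Eigen or Rcpp). It relies on two facts: the CSR arrays of a matrix A are exactly the CSC arrays of its transpose, and (A·B)ᵀ = Bᵀ·Aᵀ. So a column-oriented kernel can multiply two CSR inputs just by swapping the operands, without copying or re-sorting anything. The `Compressed` struct and the names `multiplyCsc`, `multiplyCsr`, and `entry` are hypothetical and exist only for this example; they are not part of Eigen, RcppEigen, or the Matrix package, and the kernel is a simple Gustavson-style product that leaves the indices within each slice unsorted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compressed sparse storage (hypothetical type for this sketch).
// Read as CSC: `outer` = number of columns, `idx` = row indices.
// Read as CSR: `outer` = number of rows,    `idx` = column indices.
// The arrays are identical either way; only the interpretation changes.
struct Compressed {
    int outer;                // number of compressed slices
    int inner;                // length of each slice
    std::vector<int> ptr;     // size outer + 1, start of each slice
    std::vector<int> idx;     // inner index of each stored value
    std::vector<double> val;  // stored values
};

// C = A * B with all three matrices interpreted as CSC.
// Column j of C is a linear combination of the columns of A selected by
// the nonzeros in column j of B, so every access is a contiguous slice.
Compressed multiplyCsc(const Compressed& A, const Compressed& B) {
    assert(A.outer == B.inner);  // ncol(A) must equal nrow(B)
    Compressed C{B.outer, A.inner, {0}, {}, {}};
    std::vector<double> acc(A.inner, 0.0);  // dense accumulator for one column
    std::vector<int> mark(A.inner, -1);     // last column that touched each row
    for (int j = 0; j < B.outer; ++j) {
        const std::size_t start = C.idx.size();
        for (int q = B.ptr[j]; q < B.ptr[j + 1]; ++q) {
            const int p = B.idx[q];
            const double b = B.val[q];
            for (int r = A.ptr[p]; r < A.ptr[p + 1]; ++r) {
                const int i = A.idx[r];
                if (mark[i] != j) {  // first contribution to row i in column j
                    mark[i] = j;
                    acc[i] = 0.0;
                    C.idx.push_back(i);
                }
                acc[i] += A.val[r] * b;
            }
        }
        // Copy the accumulated values out in the order the rows were found.
        for (std::size_t t = start; t < C.idx.size(); ++t) {
            C.val.push_back(acc[C.idx[t]]);
        }
        C.ptr.push_back(static_cast<int>(C.idx.size()));
    }
    return C;
}

// C = A * B with all three matrices interpreted as CSR.
// CSR(A) is CSC(A^T), and (A*B)^T = B^T * A^T, so the CSC kernel applied
// to (B, A) returns CSC((A*B)^T), which is CSR(A*B).
Compressed multiplyCsr(const Compressed& A, const Compressed& B) {
    return multiplyCsc(B, A);
}

// Look up one entry by (outer, inner) index; returns 0 if it is not stored.
double entry(const Compressed& M, int o, int i) {
    double s = 0.0;
    for (int q = M.ptr[o]; q < M.ptr[o + 1]; ++q) {
        if (M.idx[q] == i) s += M.val[q];
    }
    return s;
}
```

For a CSR result, `entry(C, row, col)` reads element (row, col); for a CSC result the arguments are (col, row). Whether this trick beats a conversion to dgCMatrix followed by A.csc %*% B.csc on matrices of the size mentioned in the question would have to be measured.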