Matrix multiplication optimization

Question

I am performing a series of matrix multiplications with fairly large matrices. Running through all of these operations takes a long time, and my program has to do this inside a large loop. Does anyone have ideas for speeding this up? I have only just started using Eigen, so my knowledge of it is very limited.

I was using ROOT-cern's built-in TMatrix class, but its speed for matrix operations is very poor. I set up some diagonal matrices using Eigen in the hope that it would handle the multiplication more efficiently. It may, but I cannot really see a performance difference.

// setup matrices
int size = 8000;

Eigen::MatrixXf a(size*2,size);

// fill matrix a....

Eigen::MatrixXf r(2*size,2*size); // diagonal matrix of row sums of a

// fill matrix r

Eigen::MatrixXf c(size,size); // diagonal matrix of col sums of a

// fill matrix c

// transpose a in place
a.transposeInPlace();

Eigen::MatrixXf c_dia;
c_dia = c.diagonal().asDiagonal();

Eigen::MatrixXf r_dia;
r_dia = r.diagonal().asDiagonal();

// calc car
Eigen::MatrixXf car;
car = c_dia*a*r_dia;

Answer 1

You are doing far too much work here. If you have diagonal matrices, store only the diagonal (and use that directly in products). Once you store a diagonal matrix in a dense square matrix, the information about its structure is lost to Eigen, which then treats it like any other dense matrix.

Also, you don't need to store the transposed variant of a; just use a.transpose() inside the product (that is only a minor issue here ...).
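To see why the structure matters, it may help to write the product out entrywise. The sketch below assumes C = diag(c) and R = diag(r) with n = size (C, R and n are notation introduced here, not names from the code), and the operation counts are rough estimates that ignore how Eigen actually schedules the work:

```latex
(C A^{\mathsf T} R)_{ij} = c_i \, a_{ji} \, r_j,
\qquad i = 1,\dots,n, \quad j = 1,\dots,2n
% diagonal form: 2n^2 entries, 2 multiplications each -> about 4n^2 multiplications
% dense form: (n x n)(n x 2n), then (n x 2n)(2n x 2n) -> about 2n^3 + 4n^3 = 6n^3 multiply-adds
% n = 8000: roughly 2.6e8 versus 3.1e12 operations
```

In other words, each entry of the transposed matrix is simply scaled by one column sum and one row sum, so no dense matrix-matrix product is needed at all.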

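#include <Eigen/Dense>

int main() {
    // set up matrices
    const int size = 8000;

    Eigen::MatrixXf a(size * 2, size);

    // fill matrix a (random values stand in for the real data)
    a.setRandom();

    Eigen::VectorXf r = a.rowwise().sum(); // row sums of a: the diagonal of r
    Eigen::VectorXf c = a.colwise().sum(); // column sums of a: the diagonal of c

    // diagonal * dense * diagonal: no dense diagonal matrices are ever formed
    Eigen::MatrixXf car = c.asDiagonal() * a.transpose() * r.asDiagonal();
}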

Finally, make sure to compile with optimization enabled, and enable vectorization if available (with gcc or clang, compile with -O2 -march=native).
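To check whether the flags and the diagonal-only product actually make a difference on your machine, you can time the operation yourself. Below is a minimal sketch using only the standard library; average_ms is a hypothetical helper, not part of Eigen, and it measures wall-clock time only.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `work` `repeats` times and returns the average
// wall-clock time per run in milliseconds.
template <typename Work>
double average_ms(Work&& work, int repeats) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

For example, you could pass a lambda that assigns c.asDiagonal() * a.transpose() * r.asDiagonal() to an existing matrix, then build once with and once without -O2 -march=native and compare the numbers. Use the result afterwards (for instance, print one entry) so the compiler cannot discard the work.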

Answer 1: 6 · accepted · 2019-05-15 13:31:33
