Sparse Matrix Vs Dense Matrix Multiplication C++ TensorFlow
I would like to implement sparse matrix-dense vector (SpMV) multiplication, y = Ax, in C++ TensorFlow.
The sparse matrix A is stored in CSR format. The sparsity of A is usually between 50% and 90%. The goal is to match or beat the run time of dense matrix-dense vector (DMv) multiplication.
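For reference, a minimal sketch of the CSR product y = Ax in plain C++ is shown here. It does not use any TensorFlow API; the `CsrMatrix` struct, its field names, and the `spmv_csr` function are hypothetical names chosen for illustration, and the value type is assumed to be `double`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for a CSR matrix; the names are illustrative only.
struct CsrMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<std::size_t> row_ptr;  // size rows + 1; row i spans [row_ptr[i], row_ptr[i + 1])
    std::vector<std::size_t> col_idx;  // column index of each stored entry
    std::vector<double> values;        // value of each stored entry
};

// Computes y = A * x for a CSR matrix A and a dense vector x.
std::vector<double> spmv_csr(const CsrMatrix& a, const std::vector<double>& x) {
    assert(x.size() == a.cols);
    assert(a.row_ptr.size() == a.rows + 1);
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i) {
        double sum = 0.0;
        // Only the stored (nonzero) entries of row i are visited.
        for (std::size_t k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[i] = sum;
    }
    return y;
}
```

The inner loop reads `x` through `col_idx`, so its memory accesses are indirect and irregular, whereas a dense kernel streams through contiguous memory. This is one reason SpMV at moderate sparsity may fail to outperform a well-tuned dense routine.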
Please note that I have already viewed the following posts: Q1, Q2, Q3. However, I am still wondering about the following:
This question is related to my other question here: (CSCC: Convolution Split Compression Calculation Algorithm for Deep Neural Network)
To answer the edited question:
Beyond the matrix format itself, even the ordering of entries in your matrix can have a massive impact on performance, which is why the Cuthill-McKee algorithm is often used to reduce matrix bandwidth (and thereby improve cache performance).
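To make the reordering idea concrete, here is a rough sketch of a basic Cuthill-McKee ordering together with a bandwidth check. It assumes the sparsity pattern is symmetric and is supplied as an adjacency list; the `Graph` alias and the function names `cuthill_mckee` and `bandwidth` are hypothetical, and a production code would more likely rely on an existing library implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <queue>
#include <vector>

// Adjacency list of a symmetric sparsity pattern (assumed input format).
using Graph = std::vector<std::vector<std::size_t>>;

// Returns a Cuthill-McKee ordering: order[k] is the old index placed at new position k.
std::vector<std::size_t> cuthill_mckee(const Graph& g) {
    const std::size_t n = g.size();
    auto by_degree = [&g](std::size_t a, std::size_t b) {
        return g[a].size() < g[b].size();
    };

    // Candidate start vertices, lowest degree first.
    std::vector<std::size_t> starts(n);
    for (std::size_t i = 0; i < n; ++i) starts[i] = i;
    std::stable_sort(starts.begin(), starts.end(), by_degree);

    std::vector<bool> visited(n, false);
    std::vector<std::size_t> order;
    order.reserve(n);

    for (std::size_t start : starts) {
        if (visited[start]) continue;  // already reached from an earlier component
        std::queue<std::size_t> q;
        visited[start] = true;
        q.push(start);
        while (!q.empty()) {
            const std::size_t v = q.front();
            q.pop();
            order.push_back(v);
            // Enqueue unvisited neighbours in order of increasing degree.
            std::vector<std::size_t> next;
            for (std::size_t w : g[v]) {
                if (!visited[w]) {
                    visited[w] = true;
                    next.push_back(w);
                }
            }
            std::stable_sort(next.begin(), next.end(), by_degree);
            for (std::size_t w : next) q.push(w);
        }
    }
    return order;
}

// Bandwidth under an ordering: the largest distance between the new
// positions of any two connected vertices.
std::size_t bandwidth(const Graph& g, const std::vector<std::size_t>& order) {
    std::vector<std::size_t> pos(g.size());
    for (std::size_t k = 0; k < order.size(); ++k) pos[order[k]] = k;
    std::size_t bw = 0;
    for (std::size_t v = 0; v < g.size(); ++v) {
        for (std::size_t w : g[v]) {
            const std::size_t d = pos[v] > pos[w] ? pos[v] - pos[w] : pos[w] - pos[v];
            bw = std::max(bw, d);
        }
    }
    return bw;
}
```

Reversing the returned ordering gives the Reverse Cuthill-McKee variant. With a smaller bandwidth, the column indices touched by each row stay close to the diagonal, so consecutive rows tend to reuse nearby entries of x.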