
Matrix multiplication in C#

I'm having problems multiplying two matrices of dimensions 5000x1024. I tried to do it with loops in the conventional manner, but it takes forever. Is there a good library with implemented and optimized matrix operations, or an algorithm that does it without 3 loops?

Have you looked into using OpenCL? One of the examples in the Cloo (C# OpenCL library) distribution is a large 2D matrix multiplication.

Unlike CUDA, OpenCL kernels will run on your GPU (if available and supported) or the CPU. On the GPU you'll see really dramatic speed increases, on the order of 10x-100x depending on how efficient your kernel is and how many cores your GPU has. (A Fermi-based NVidia card will have between 384 and 512 cores, and the new 600-series cards have something like 1500.)

If you're not interested in going that route (though anyone who's doing numerically intensive, easily parallelizable operations like this should be using the GPU), make sure you're at least using C#'s built-in parallelization:

// Each row of the result is independent, so the outer loop can run in parallel.
Parallel.For(0, 5000, i =>
{
    for (var j = 0; j < 1024; j++)
    {
        result[i, j] = .....
    }
});
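To make the snippet above concrete, here is a minimal sketch of a complete parallel multiply. The class name `ParallelMatrix` and the use of `double[,]` are my own assumptions for illustration, not something from the question; only `Parallel.For` comes from the framework.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration; the name and signature are assumptions.
static class ParallelMatrix
{
    // Computes a * b for an (n x k) matrix a and a (k x m) matrix b.
    public static double[,] Multiply(double[,] a, double[,] b)
    {
        int n = a.GetLength(0), k = a.GetLength(1), m = b.GetLength(1);
        if (b.GetLength(0) != k)
            throw new ArgumentException("Inner dimensions must match.");

        var result = new double[n, m];

        // Each iteration writes only to row i of result, so no locking is needed.
        Parallel.For(0, n, i =>
        {
            // i-p-j order reads b row by row, which tends to be friendlier
            // to the cache than walking down a column in the inner loop.
            for (int p = 0; p < k; p++)
            {
                double aip = a[i, p];
                for (int j = 0; j < m; j++)
                    result[i, j] += aip * b[p, j];
            }
        });

        return result;
    }
}
```

Usage would be `var c = ParallelMatrix.Multiply(a, b);`. Note that two 5000x1024 matrices cannot be multiplied as they stand, so presumably one of them has to be transposed first so that the inner dimensions agree.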

Also, check out GPU.NET and Brahma. Brahma lets you build OpenCL kernels in C# with LINQ, which definitely flattens the learning curve.

Take a look at the Strassen algorithm, which has a runtime of approx. O(n^2.8) instead of the O(n^3) of the naive method of multiplying matrices. One problem is that it is not always numerically stable, but it works fine for really high dimensions. Furthermore, it is really complex to implement, so I would suggest you rethink your design and maybe decrease the size of the matrices or split your problem into smaller pieces.
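For a feel of what Strassen involves, here is a rough sketch for square matrices of equal size. It is not a tuned implementation: the class name, the `cutoff` parameter, and its default of 64 are assumptions of mine, and odd sizes simply fall back to the plain triple loop instead of being padded. The point is that each level does seven recursive products instead of eight, which is where the roughly O(n^2.8) comes from.

```csharp
using System;

// Illustrative sketch only; names and the cutoff value are assumptions.
static class StrassenSketch
{
    // Multiplies two n x n matrices. Blocks of size <= cutoff, or of odd
    // size, are handled by the naive method.
    public static double[,] Multiply(double[,] a, double[,] b, int cutoff = 64)
    {
        int n = a.GetLength(0);
        if (n <= cutoff || n % 2 != 0) return Naive(a, b);
        int h = n / 2;

        // Split both operands into four h x h quadrants.
        double[,] a11 = Block(a, 0, 0, h), a12 = Block(a, 0, h, h);
        double[,] a21 = Block(a, h, 0, h), a22 = Block(a, h, h, h);
        double[,] b11 = Block(b, 0, 0, h), b12 = Block(b, 0, h, h);
        double[,] b21 = Block(b, h, 0, h), b22 = Block(b, h, h, h);

        // Seven recursive products instead of the usual eight.
        var m1 = Multiply(Add(a11, a22, 1), Add(b11, b22, 1), cutoff);
        var m2 = Multiply(Add(a21, a22, 1), b11, cutoff);
        var m3 = Multiply(a11, Add(b12, b22, -1), cutoff);
        var m4 = Multiply(a22, Add(b21, b11, -1), cutoff);
        var m5 = Multiply(Add(a11, a12, 1), b22, cutoff);
        var m6 = Multiply(Add(a21, a11, -1), Add(b11, b12, 1), cutoff);
        var m7 = Multiply(Add(a12, a22, -1), Add(b21, b22, 1), cutoff);

        // Recombine the products into the four quadrants of the result.
        var c = new double[n, n];
        for (int i = 0; i < h; i++)
            for (int j = 0; j < h; j++)
            {
                c[i, j] = m1[i, j] + m4[i, j] - m5[i, j] + m7[i, j];
                c[i, j + h] = m3[i, j] + m5[i, j];
                c[i + h, j] = m2[i, j] + m4[i, j];
                c[i + h, j + h] = m1[i, j] - m2[i, j] + m3[i, j] + m6[i, j];
            }
        return c;
    }

    // Plain triple-loop multiply for square matrices.
    public static double[,] Naive(double[,] a, double[,] b)
    {
        int n = a.GetLength(0);
        var c = new double[n, n];
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++)
                for (int j = 0; j < n; j++)
                    c[i, j] += a[i, k] * b[k, j];
        return c;
    }

    // Copies the size x size block of src whose top-left corner is (row, col).
    static double[,] Block(double[,] src, int row, int col, int size)
    {
        var dst = new double[size, size];
        for (int i = 0; i < size; i++)
            for (int j = 0; j < size; j++)
                dst[i, j] = src[row + i, col + j];
        return dst;
    }

    // Returns x + sign * y, element by element.
    static double[,] Add(double[,] x, double[,] y, int sign)
    {
        int n = x.GetLength(0);
        var r = new double[n, n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                r[i, j] = x[i, j] + sign * y[i, j];
        return r;
    }
}
```

All the block copies and temporary matrices are part of why this is rarely worth hand-rolling: the extra allocations eat into the saved multiplication, so the recursion only pays off above some fairly large size.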

But keep in mind that a matrix multiplication with no special properties (like Aidan mentioned) is very hard to speed up algorithmically. Here is an example: the Coppersmith-Winograd algorithm takes roughly O(n^2.376), and it is one of the asymptotically fastest known algorithms for matrix multiplication! The best option here would be either to use OpenCL and the GPU (mentioned by David) or to look at another environment with optimized linear algebra, such as Python with the NumPy package.

Good luck either way!


 