
Does cuSolverDN or another CUDA library have a batched version of QR decomposition for dense matrices to solve A*x = b?

Original post: 2017-07-14 00:10:45. Tags: cuda, cusolver, qr-decomposition

I'm trying to solve A*x = b, where A is a dense matrix with complex values.

I used the cusolverDnCgeqrf() function from the cuSolverDN library to do the QR decomposition for a single system of linear equations. However, I want to do this for several systems at once to speed up the processing.

Is there a "batched" version of this function? Or is there another CUDA library I can use?
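
To make the goal concrete, here is a minimal CPU sketch in NumPy of what a batched QR solve computes: each A[i] is factored as Q[i] R[i], and then R[i] x[i] = Q[i]^H b[i] is solved. This is not cuSolverDN or any GPU API. The function name `batched_qr_solve` and the problem sizes are made up for illustration, and the sketch assumes NumPy 1.22 or newer, where `np.linalg.qr` accepts a stack of matrices. It may still be useful as a reference when checking the output of a GPU batched routine.

```python
import numpy as np


def batched_qr_solve(A, b):
    """Solve A[i] @ x[i] = b[i] for every i via one QR factorization per matrix.

    A: complex array of shape (batch, n, n)
    b: complex array of shape (batch, n)
    Returns x with shape (batch, n).
    """
    # Stacked QR: factors every matrix in the batch (assumes NumPy >= 1.22).
    Q, R = np.linalg.qr(A)
    # Q is unitary, so applying Q^H to b takes the place of inverting Q.
    y = np.einsum("bji,bj->bi", Q.conj(), b)
    # R is upper triangular. A general solver is used here for brevity;
    # a dedicated triangular solve would be the more efficient choice.
    x = np.linalg.solve(R, y[..., None])[..., 0]
    return x


# Hypothetical test problem: 4 independent 3x3 complex systems.
rng = np.random.default_rng(0)
batch, n = 4, 3
A = rng.standard_normal((batch, n, n)) + 1j * rng.standard_normal((batch, n, n))
b = rng.standard_normal((batch, n)) + 1j * rng.standard_normal((batch, n))

x = batched_qr_solve(A, b)

# Largest entry of |A x - b| over the whole batch; should be near machine precision.
residual = np.abs(np.einsum("bij,bj->bi", A, x) - b).max()
```

Note that cusolverDnCgeqrf() works in single-precision complex, so a comparison against this double-precision reference should use a correspondingly looser tolerance.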

1 Answer

You can use Magma batched QR: http://icl.cs.utk.edu/projectsfiles/magma/doxygen/group__group__qr__batched.html#details

Or Nvidia's batched library: https://devblogs.nvidia.com/parallelforall/parallel-direct-solvers-with-cusolver-batched-qr/

I am not sure whether Python wrappers exist for them yet. I want to add that batched versions of many solvers are currently available, through either Magma or Nvidia.

There is no single standard yet, but one is underway and has been discussed at the Batched BLAS workshops, here:

http://www.netlib.org/utk/people/JackDongarra/WEB-PAGES/Batched-BLAS-2017/

and here:

http://www.netlib.org/utk/people/JackDongarra/WEB-PAGES/Batched-BLAS-2016/

The draft is ready, and I hope there will be a standard Batched BLAS soon.