
Summation of double using the processor's floating-point instruction set

On a Windows machine with an Intel Core i5, I want to write a C# program which sums an array of double at the highest possible speed, which in practice means using the instruction set of the built-in FPU.

double[] arr = new double[] { 1.123, 2.234, 3.1234, .... };

The processor has a built-in instruction which can sum up a whole memory array ("vector") in a single operation. Is there a way in C# to execute the summation with this built-in machine instruction? (I mean, besides writing unmanaged assembly code.)

EDIT: Or is there a library call which will do this?

No. You can't directly use SSE/AVX/... instructions in C#. You could write some C++ code and PInvoke it, but the PInvoke cost would probably remove all the benefits of using these instructions.

Technically you can do bad things and call these instructions from C# (see https://stackoverflow.com/a/29646856/613130 ), but those tricks are slow, so you probably wouldn't gain anything speedwise.

Yes, there are several ways to accomplish this:

double sum = arr.Sum();

which uses Linq to sum the array. This is the simplest way, but not the fastest. You asked about a library call that can do this: HPCsharp is such a library, available as a NuGet package on nuget.org. The fastest implementation there is

double sum = arr.SumSsePar();

which uses SIMD/SSE instructions to get the most performance out of each core, and uses multiple cores to get the highest performance out of the processor.
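To give a rough idea of what a single-core SIMD summation can look like in managed code, here is a minimal sketch built on `System.Numerics.Vector<double>` from the .NET base library. It is not HPCsharp's implementation, and the names `SimdSum` and `Sum` are hypothetical, chosen only for this illustration. Whether the JIT actually emits SSE/AVX instructions for it depends on the runtime and hardware, so treat any speedup as something to measure rather than assume.

```csharp
using System;
using System.Numerics;

// Hypothetical helper for illustration; not part of HPCsharp or the answers above.
public static class SimdSum
{
    public static double Sum(double[] arr)
    {
        // Number of doubles per vector; decided by the runtime for the current hardware.
        int width = Vector<double>.Count;
        var acc = Vector<double>.Zero;
        int i = 0;

        // Add one full vector of elements to the accumulator per iteration.
        for (; i <= arr.Length - width; i += width)
            acc += new Vector<double>(arr, i);

        // Collapse the accumulator lanes into a single scalar.
        double sum = 0.0;
        for (int lane = 0; lane < width; lane++)
            sum += acc[lane];

        // Add the leftover elements that did not fill a whole vector.
        for (; i < arr.Length; i++)
            sum += arr[i];

        return sum;
    }
}
```

It would be called as `double sum = SimdSum.Sum(arr);`. Because the elements are added in a different order than in a plain loop, the result can differ from `arr.Sum()` in the last few bits. `Vector.IsHardwareAccelerated` reports whether the runtime is using SIMD registers for these types.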
