Using SIMD in a CLR C++ library

Original post: 2014-01-31 19:18:31 · Tags: c#, c++, .net, vectorization, auto-vectorization

C#, Visual Basic, and the .NET CLR are excellent development environments for user interfaces, line-of-business applications, and the like.

However, I've been writing a lot of code whose running time is roughly O(n^3) with n > 1000, and in a couple of places worse than that. Basically, these loops read from one large array, do a little math, make five or six tests, and write the result to a second array of identical size.

Most of it is code that was ported from Intel Fortran programs in order to bring them into a 64-bit world. I've noted that without any auto-vectorization of that code, execution times are much slower. .NET has no support for the SIMD operations found on every Intel processor sold today.

Since the functions are already written as tight algorithms that a skilled programmer can port, I thought that asking that programmer to port the code to a C++ CLR library might be an approach.

Is it possible to get a C++ library that is auto-vectorized and also presents a CLR interface for a C#/VB program to call?
If not, do workarounds exist? Is a COM interface one such workaround?
If so, what form would it have to take?

1 answer

Sure, no problem. A C++/CLI class library project lets you write a managed wrapper, a ref class, that can directly call native C++ code. Such a class is directly usable from any managed code.

VS2012 or higher is required to get auto-vectorization and auto-parallelization in native C++ code. It can be important to design the interop layer so that the number of transitions from managed to unmanaged code and back is minimized. In other words, don't copy a single double value at a time.
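To illustrate the "whole array per call" advice, here is a minimal sketch of what the native side might look like. The names `transform_block` and `transform_all`, the arithmetic, and the `threshold` parameter are all hypothetical stand-ins for the ported Fortran loops. The idea is that a C++/CLI ref class would pin the managed `array<double>^` (for example with `pin_ptr<double>`) and hand the raw pointers to the kernel once, so there is a single managed-to-native transition per array rather than one per element. Whether the loop actually gets vectorized depends on the compiler and its settings; with MSVC, `/Qvec-report:1` can be used to check.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical native kernel (not from the original post): reads one array,
// does a little math and one test per element, and writes the result to a
// second array of the same size. The loop has a simple counted form with no
// cross-iteration dependency, which is the shape auto-vectorizers usually
// handle well.
void transform_block(const double* in, double* out, std::size_t n, double threshold)
{
    assert(n == 0 || (in != nullptr && out != nullptr));
    for (std::size_t i = 0; i < n; ++i) {
        const double v = in[i] * in[i] + 1.0;
        // A per-element test written as a select rather than a branch with
        // side effects, so it can typically be turned into a SIMD min/blend.
        out[i] = (v > threshold) ? threshold : v;
    }
}

// Convenience overload for native callers: one call processes the whole
// buffer. A managed wrapper would do the equivalent with pinned arrays.
std::vector<double> transform_all(const std::vector<double>& in, double threshold)
{
    std::vector<double> out(in.size());
    transform_block(in.data(), out.data(), in.size(), threshold);
    return out;
}
```

The kernel takes plain pointers and a length so that it has no dependency on managed types; this keeps it in code that the native optimizer compiles, with the wrapper class acting only as a thin pass-through.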