How do I implement matrix multiplication using OpenCL in C#

Can someone please guide me on how to perform matrix multiplication in C# on the GPU using OpenCL?

I have looked at the OpenCL example here: https://www.codeproject.com/Articles/1116907/How-to-Use-Your-GPU-in-NET

But I am not sure how to proceed for matrix multiplication.

Yes, as doqtor says, you need to flatten the matrices into 1D arrays. Here is an example that passes more arguments to the kernel:

using System;
using Cloo; // OpenCL binding that provides the Compute* types used below

class Program
{
    static string CalculateKernel
    {
        get
        {
            return @"
            kernel void Calc(global int* m1, global int* m2, int size) 
            {
                for(int i = 0; i < size; i++)
                {
                    printf("" %d / %d\n"",m1[i],m2[i] );
                }
            }";
        }
    }

    static void Main(string[] args)
    {

        int[] r1 = new int[]
            {1, 2, 3, 4};

        int[] r2 = new int[]
            {4, 3, 2, 1};

        int rowSize = r1.Length;

        // pick first platform
        ComputePlatform platform = ComputePlatform.Platforms[0];
        // create context with all gpu devices
        ComputeContext context = new ComputeContext(ComputeDeviceTypes.Gpu,
            new ComputeContextPropertyList(platform), null, IntPtr.Zero);

        // create a command queue with first gpu found
        ComputeCommandQueue queue = new ComputeCommandQueue(context,
            context.Devices[0], ComputeCommandQueueFlags.None);

        // load opencl source and
        // create program with opencl source
        ComputeProgram program = new ComputeProgram(context, CalculateKernel);

        // compile opencl source
        program.Build(null, null, null, IntPtr.Zero);

        // load chosen kernel from program
        ComputeKernel kernel = program.CreateKernel("Calc");

        // allocate a read-only device buffer for the first int array;
        // CopyHostPointer copies the data at creation, so the managed
        // array does not have to stay pinned afterwards
        ComputeBuffer<int> row1Buffer = new ComputeBuffer<int>(context,
            ComputeMemoryFlags.ReadOnly | ComputeMemoryFlags.CopyHostPointer, r1);

        // allocate a read-only device buffer for the second int array
        ComputeBuffer<int> row2Buffer = new ComputeBuffer<int>(context,
            ComputeMemoryFlags.ReadOnly | ComputeMemoryFlags.CopyHostPointer, r2);


        kernel.SetMemoryArgument(0, row1Buffer); // m1: first integer array
        kernel.SetMemoryArgument(1, row2Buffer); // m2: second integer array
        kernel.SetValueArgument(2, rowSize);     // size: number of elements

        // execute the kernel as a single task (one work-item)
        queue.ExecuteTask(kernel, null);

        // wait for completion
        queue.Finish();

        Console.WriteLine("Finished");
        Console.ReadKey();
    }
}

Another sample, which also reads the result back from the GPU buffer:

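using System;
using System.Runtime.InteropServices; // GCHandle
using Cloo; // OpenCL binding that provides the Compute* types used below

class Program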
{
    static string CalculateKernel
    {
        get
        {
            // you could put your matrix algorithm here and store the result in array m3
            return @"
            kernel void Calc(global int* m1, global int* m2, int size, global int* m3) 
            {
                for(int i = 0; i < size; i++)
                {
                    int val = m2[i];
                    printf("" %d / %d\n"",m1[i],m2[i] );
                    m3[i] = val * 4;
                }
            }";
        }
    }

    static void Main(string[] args)
    {

        int[] r1 = new int[]
            {8, 2, 3, 4};

        int[] r2 = new int[]
            {4, 3, 2, 5};

        int[] r3 = new int[4];
        int rowSize = r1.Length;

        // pick first platform
        ComputePlatform platform = ComputePlatform.Platforms[0];
        // create context with all gpu devices
        ComputeContext context = new ComputeContext(ComputeDeviceTypes.Gpu,
            new ComputeContextPropertyList(platform), null, IntPtr.Zero);

        // create a command queue with first gpu found
        ComputeCommandQueue queue = new ComputeCommandQueue(context,
            context.Devices[0], ComputeCommandQueueFlags.None);

        // load opencl source and
        // create program with opencl source
        ComputeProgram program = new ComputeProgram(context, CalculateKernel);

        // compile opencl source
        program.Build(null, null, null, IntPtr.Zero);

        // load chosen kernel from program
        ComputeKernel kernel = program.CreateKernel("Calc");

        // allocate a read-only device buffer for the first int array;
        // CopyHostPointer copies the data at creation, so the managed
        // array does not have to stay pinned afterwards
        ComputeBuffer<int> row1Buffer = new ComputeBuffer<int>(context,
            ComputeMemoryFlags.ReadOnly | ComputeMemoryFlags.CopyHostPointer, r1);

        // allocate a read-only device buffer for the second int array
        ComputeBuffer<int> row2Buffer = new ComputeBuffer<int>(context,
            ComputeMemoryFlags.ReadOnly | ComputeMemoryFlags.CopyHostPointer, r2);

        // allocate the output buffer; the kernel writes to it, so it must not be ReadOnly
        ComputeBuffer<int> resultBuffer = new ComputeBuffer<int>(context,
            ComputeMemoryFlags.WriteOnly, r3.Length);


        kernel.SetMemoryArgument(0, row1Buffer);   // m1: first integer array
        kernel.SetMemoryArgument(1, row2Buffer);   // m2: second integer array
        kernel.SetValueArgument(2, rowSize);       // size: number of elements
        kernel.SetMemoryArgument(3, resultBuffer); // m3: output array

        // execute the kernel as a single task (one work-item)
        queue.ExecuteTask(kernel, null);

        // wait for completion
        queue.Finish();

        // pin r3 so the blocking read can copy the device buffer into it
        GCHandle arrCHandle = GCHandle.Alloc(r3, GCHandleType.Pinned);
        queue.Read<int>(resultBuffer, true, 0, r3.Length, arrCHandle.AddrOfPinnedObject(), null);

        Console.WriteLine("display result from gpu buffer:");
        for (int i = 0; i < r3.Length; i++)
            Console.WriteLine(r3[i]);

        // release the pinned handle and all OpenCL resources
        arrCHandle.Free();
        row1Buffer.Dispose();
        row2Buffer.Dispose();
        resultBuffer.Dispose();
        kernel.Dispose();
        program.Dispose();
        queue.Dispose();
        context.Dispose();

        Console.WriteLine("Finished");
        Console.ReadKey();
    }
}

You just need to adapt the kernel program to compute the product of two matrices.

Output of the second program:

 8 / 4
 2 / 3
 3 / 2
 4 / 5
display result from gpu buffer:
16
12
8
20
Finished
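
As a rough sketch of what "adapting the kernel" to a real matrix product could look like: the kernel string below computes one element of C = A * B per work-item, with all three matrices flattened row-major. The names MatMulSketch, MatMul and MultiplyFlat are made up for this illustration and are not part of the answer above. The C# method is a plain CPU reference doing the same index arithmetic, handy for checking what the GPU returns.

```csharp
using System;

static class MatMulSketch
{
    // OpenCL C source (illustrative): A is m x k, B is k x n, C is m x n,
    // all stored as flat row-major arrays. Each work-item handles one
    // (row, col) element of C.
    public const string KernelSource = @"
    kernel void MatMul(global const int* a, global const int* b,
                       global int* c, int k, int n)
    {
        int row = get_global_id(0);
        int col = get_global_id(1);
        int sum = 0;
        for (int i = 0; i < k; i++)
            sum += a[row * k + i] * b[i * n + col];
        c[row * n + col] = sum;
    }";

    // CPU reference with the same indexing as the kernel above.
    public static int[] MultiplyFlat(int[] a, int[] b, int m, int k, int n)
    {
        if (a.Length != m * k || b.Length != k * n)
            throw new ArgumentException("array sizes do not match the given dimensions");

        var c = new int[m * n];
        for (int row = 0; row < m; row++)
        {
            for (int col = 0; col < n; col++)
            {
                int sum = 0;
                for (int i = 0; i < k; i++)
                    sum += a[row * k + i] * b[i * n + col];
                c[row * n + col] = sum;
            }
        }
        return c;
    }

    static void Main()
    {
        int[] a = { 1, 2, 3, 4, 5, 6 };    // 2 x 3
        int[] b = { 7, 8, 9, 10, 11, 12 }; // 3 x 2
        int[] c = MultiplyFlat(a, b, 2, 3, 2);
        Console.WriteLine(string.Join(", ", c)); // prints "58, 64, 139, 154"
    }
}
```

On the host side, the setup would stay the same as in the second program, with k and n passed through SetValueArgument. Because this kernel relies on get_global_id, it would have to be launched over an m x n range rather than with ExecuteTask; with Cloo that should be something like queue.Execute(kernel, null, new long[] { m, n }, null, null), but treat that call as an assumption and check it against the version of the library you use.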

Flattening a 2D array to 1D is really easy. Take this sample:

        int[,] twoD = { { 1, 2,3 }, { 3, 4,5 } };
        int[] oneD = twoD.Cast<int>().ToArray();

And see this link for the 1D -> 2D conversion.
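
For completeness, here is a small sketch of the round trip in both directions. FlattenSketch, Flatten and Unflatten are hypothetical helper names. The sketch assumes row-major order, which is how C# lays out an int[,], so element [r, c] ends up at index r * cols + c in the flat array.

```csharp
using System;
using System.Linq;

static class FlattenSketch
{
    // 2D -> 1D: enumerating an int[,] visits elements in row-major order.
    public static int[] Flatten(int[,] matrix) => matrix.Cast<int>().ToArray();

    // 1D -> 2D: an int[,] is stored contiguously in row-major order,
    // so the flat data can be block-copied straight into it.
    public static int[,] Unflatten(int[] flat, int rows, int cols)
    {
        if (flat.Length != rows * cols)
            throw new ArgumentException("flat.Length must equal rows * cols");

        var matrix = new int[rows, cols];
        Buffer.BlockCopy(flat, 0, matrix, 0, flat.Length * sizeof(int));
        return matrix;
    }

    static void Main()
    {
        int[,] twoD = { { 1, 2, 3 }, { 3, 4, 5 } };
        int[] oneD = Flatten(twoD);
        int[,] back = Unflatten(oneD, 2, 3);

        Console.WriteLine(string.Join(", ", oneD));       // prints "1, 2, 3, 3, 4, 5"
        Console.WriteLine(back[1, 2] == oneD[1 * 3 + 2]); // prints "True"
    }
}
```

The same r * cols + c formula is what a kernel has to use to address a matrix element inside the flat buffer.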

I found a very good reference for using OpenCL with .NET.

The site is well structured and very useful. It also includes a matrix multiplication case study.

OpenCL Tutorial
