
Memory Object Allocation failure using C# and OpenCL

I am writing an image processing program whose express purpose is to alter large images; the one I'm working with is 8165 by 4915 pixels. I was told to implement GPU processing, so after some research I decided to go with OpenCL. I started using the OpenCL C# wrapper OpenCLTemplate.

My code takes in a bitmap and uses LockBits to lock its memory location. I then copy each byte into an array and run the array through the OpenCL kernel, which inverts each colour byte in the array. I then write the inverted bytes back into the memory location of the image. I split this process into ten chunks so that I can increment a progress bar.
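For reference, the per-pixel operation described above can be sketched on the CPU in plain C. This is only a minimal baseline for checking kernel output on a small image, not the code from the question; the function name `invert_bgra` is hypothetical, and it assumes 4 bytes per pixel with the alpha byte last, as in the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Inverts the three colour bytes of every 4-byte pixel in place and
   leaves the fourth (alpha) byte unchanged.
   pixel_count is the number of pixels, not the number of bytes. */
void invert_bgra(uint8_t *pixels, size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count; i++) {
        uint8_t *p = pixels + 4 * i;
        p[0] = (uint8_t)(255 - p[0]);
        p[1] = (uint8_t)(255 - p[1]);
        p[2] = (uint8_t)(255 - p[2]);
        /* p[3] is the alpha component and is left as it is. */
    }
}
```

Working on bytes directly avoids the intermediate float buffer, which is four times the size of the byte data it represents.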

My code works perfectly with smaller images, but when I try to run it with my big image I keep getting a MemObjectAllocationFailure when trying to execute the kernel. I don't know why it's doing this, and I would appreciate any help in figuring out why or how to fix it.

    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.Windows.Forms;
    using OpenCLTemplate;

    public static void Invert(Bitmap image, ToolStripProgressBar progressBar)
    {
        string openCLInvert = @"
        __kernel void Filter(__global uchar *  Img0,
                             __global float *  ImgF)

        {
            // Gets information about work-item
            int x = get_global_id(0);
            int y = get_global_id(1);

            // Gets information about work size
            int width = get_global_size(0);
            int height = get_global_size(1);

            int ind = 4 * (x + width * y );

            // Inverts image colors
            ImgF[ind]= 255.0f - (float)Img0[ind];
            ImgF[1 + ind]= 255.0f - (float)Img0[1 + ind];
            ImgF[2 + ind]= 255.0f - (float)Img0[2 + ind];

            // Leave alpha component equal
            ImgF[ind + 3] = (float)Img0[ind + 3];
        }";

        //Lock the image in memory and get image lock data
        var imageData = image.LockBits(new Rectangle(0, 0, image.Width, image.Height), ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);

        CLCalc.InitCL();

        for (int i = 0; i < 10; i++)
        {
            unsafe
            {
                int adjustedHeight = (((i + 1) * imageData.Height) / 10) - ((i * imageData.Height) / 10);
                int count = 0;

                byte[] Data = new byte[(4 * imageData.Stride * adjustedHeight)];
                var startPointer = (byte*)imageData.Scan0;

                for (int y = ((i * imageData.Height) / 10); y < (((i + 1) * imageData.Height) / 10); y++)
                {
                    for (int x = 0; x < imageData.Width; x++)
                    {
                        byte* Byte = (byte*)(startPointer + (y * imageData.Stride) + (x * 4));

                        Data[count] = *Byte;
                        Data[count + 1] = *(Byte + 1);
                        Data[count + 2] = *(Byte + 2);
                        Data[count + 3] = *(Byte + 3);
                        count += 4;
                    }
                }

                CLCalc.Program.Compile(openCLInvert);
                CLCalc.Program.Kernel kernel = new CLCalc.Program.Kernel("Filter");
                CLCalc.Program.Variable CLData = new CLCalc.Program.Variable(Data);

                float[] imgProcessed = new float[Data.Length];

                CLCalc.Program.Variable CLFiltered = new CLCalc.Program.Variable(imgProcessed);
                CLCalc.Program.Variable[] args = new CLCalc.Program.Variable[] { CLData, CLFiltered };

                kernel.Execute(args, new int[] { imageData.Width, adjustedHeight });
                CLCalc.Program.Sync();

                CLFiltered.ReadFromDeviceTo(imgProcessed);

                count = 0;

                for (int y = ((i * imageData.Height) / 10); y < (((i + 1) * imageData.Height) / 10); y++)
                {
                    for (int x = 0; x < imageData.Width; x++)
                    {
                        byte* Byte = (byte*)(startPointer + (y * imageData.Stride) + (x * 4));

                        *Byte = (byte)imgProcessed[count];
                        *(Byte + 1) = (byte)imgProcessed[count + 1];
                        *(Byte + 2) = (byte)imgProcessed[count + 2];
                        *(Byte + 3) = (byte)imgProcessed[count + 3];
                        count += 4;
                    }
                }
            }
            progressBar.Owner.Invoke((Action)progressBar.PerformStep);
        }

        //Unlock image
        image.UnlockBits(imageData);
    }

You may have reached a memory allocation limit of your OpenCL driver or device. Check the values returned by clGetDeviceInfo: there is a limit on the size of a single memory object (CL_DEVICE_MAX_MEM_ALLOC_SIZE). The OpenCL driver may allow the total size of all allocated memory objects to exceed the memory size on your device, and will copy them to and from host memory when needed.
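To compare against the device limits, it helps to know how large the buffers in the question actually are. The sketch below reproduces the size arithmetic from the question's code in plain C; the function names are hypothetical, and it assumes that `Stride` equals 4 * `Width` for `Format32bppArgb`, which gives 32,660 bytes for an 8165-pixel-wide image. Whether the resulting sizes exceed the limit depends on the device.

```c
#include <stdint.h>

/* Rows in chunk i when `height` rows are split into `chunks` parts,
   mirroring the adjustedHeight expression in the question. */
uint64_t chunk_rows(uint64_t i, uint64_t height, uint64_t chunks)
{
    return ((i + 1) * height) / chunks - (i * height) / chunks;
}

/* Bytes in the input array as the question sizes it: 4 * Stride * rows. */
uint64_t question_input_bytes(uint64_t stride, uint64_t rows)
{
    return 4 * stride * rows;
}

/* Bytes in the float output buffer: one 4-byte float per input element. */
uint64_t question_output_bytes(uint64_t stride, uint64_t rows)
{
    return 4 * question_input_bytes(stride, rows);
}

/* Bytes the pixel data of the chunk actually occupies: Stride * rows. */
uint64_t packed_bytes(uint64_t stride, uint64_t rows)
{
    return stride * rows;
}
```

For the first chunk (491 rows) this gives an input array of 64,144,240 bytes and a float output buffer of 256,576,960 bytes (about 245 MiB), while the pixel data itself is only 16,036,060 bytes. `Stride` is already measured in bytes per row, so the extra factor of 4 makes the input array four times larger than needed, and the float buffer is sixteen times the size of the pixel data.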

To process large images, you may have to split them into smaller pieces and process each piece separately.
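One way to choose the piece size is to derive it from the single-object limit instead of fixing the number of chunks. The following is a minimal sketch in plain C with hypothetical function names; `max_alloc_bytes` stands for whatever value the device reports for CL_DEVICE_MAX_MEM_ALLOC_SIZE, and `bytes_per_row` must be greater than zero.

```c
#include <stdint.h>

/* Largest number of whole image rows whose buffer fits in max_alloc_bytes.
   bytes_per_row must be greater than zero. */
uint64_t max_rows_per_buffer(uint64_t max_alloc_bytes, uint64_t bytes_per_row)
{
    return max_alloc_bytes / bytes_per_row;
}

/* Number of strips needed to cover `height` rows, rounding up.
   rows_per_strip must be greater than zero. */
uint64_t strip_count(uint64_t height, uint64_t rows_per_strip)
{
    return (height + rows_per_strip - 1) / rows_per_strip;
}
```

As an illustration only, assume a limit of 128 MiB (134,217,728 bytes). A float output buffer for an 8165-pixel-wide image needs 8165 * 4 * 4 = 130,640 bytes per row, so at most 1027 rows fit in one buffer and the 4915-row image needs 5 strips. With a byte output buffer (32,660 bytes per row), 4109 rows fit and 2 strips are enough.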
