Zero copy buffer allocation on Arm Mali Midgard GPUs?

I want zero-copy behaviour for OpenCL buffers on Arm Mali Midgard GPUs and Arm CPUs, such that a vector's data pointer and an OpenCL buffer point to the same location for their whole lifetime.

Some of the things I have tried: I wrote a custom allocator (64-byte alignment) for a vector and then tried to pass the vector's pointer to the import function of the cl_arm_import_memory extension. The issue is that when I query the device extensions, I see only the cl_arm_import_memory string and not the cl_arm_import_memory_host string.
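For reference, here is a minimal sketch of what such a 64-byte-aligned allocator could look like, assuming C++17 (aligned `operator new`). The names `AlignedAllocator`, `isAligned64` and `makeAlignedVector` are hypothetical and not taken from the question. Alignment alone does not make the memory importable; as described above, the host import path still appears to depend on the driver reporting cl_arm_import_memory_host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical allocator: gives std::vector storage aligned to Alignment bytes.
template <typename T, std::size_t Alignment = 64>
struct AlignedAllocator {
    using value_type = T;
    static_assert((Alignment & (Alignment - 1)) == 0, "Alignment must be a power of two");
    static_assert(Alignment >= alignof(T), "Alignment must not be weaker than alignof(T)");

    AlignedAllocator() noexcept = default;
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

    // Explicit rebind is required because Alignment is a non-type template parameter.
    template <typename U>
    struct rebind { using other = AlignedAllocator<U, Alignment>; };

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        return static_cast<T*>(::operator new(n * sizeof(T), std::align_val_t(Alignment)));
    }
    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(p, std::align_val_t(Alignment));
    }
};

// All instances are interchangeable because the allocator holds no state.
template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept { return true; }
template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept { return false; }

// Returns true when p sits on a 64-byte boundary.
inline bool isAligned64(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 64 == 0;
}

// Usage sketch: build a vector of n floats and confirm its storage is aligned.
inline std::vector<float, AlignedAllocator<float>> makeAlignedVector(std::size_t n) {
    std::vector<float, AlignedAllocator<float>> v(n, 0.0f);
    assert(isAligned64(v.data()));
    return v;
}
```

The data() pointer is only stable while the vector does not reallocate, so the vector should be sized or reserved up front before its pointer is shared with anything else.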

I have also tried allocating a GPU-side buffer first and then forcing a vector to point to the buffer's location. But according to the Mali guide, a GPU-side buffer's location might change, so it may appear at different addresses across multiple mappings.

So my question is: what is the best way to achieve zero-copy behaviour between a std::vector and an OpenCL buffer?

I think you're mixing two unrelated concepts: zero copy and shared virtual memory. Zero copy does not guarantee that a piece of physical memory is visible at the same address on both the CPU and the GPU; it can be mapped differently in the CPU's and the GPU's virtual address spaces. If you want the physical memory to have the same virtual address on the GPU and the CPU, you need shared virtual memory (SVM). This requires OpenCL 2.x and allocating buffers through clSVMAlloc(). If your vendor provides only OpenCL 1.x and not 2.x, you're out of luck: you can have zero-copy buffers, but not SVM.
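The distinction between "same physical memory" and "same virtual address" can be illustrated with a CPU-only analogy. The sketch below is not OpenCL code: it assumes a POSIX system and maps one temporary file twice into the same process, with the two mappings standing in for the CPU and GPU views. `DoubleMapping`, `mapTwice`, `writeIsVisibleThroughBoth` and `unmapBoth` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <sys/mman.h>
#include <unistd.h>

// Two views of the same underlying pages; both stay null if setup fails.
struct DoubleMapping {
    int* first = nullptr;
    int* second = nullptr;
};

// Maps one temporary file twice with MAP_SHARED, so both views share storage.
inline DoubleMapping mapTwice(std::size_t bytes) {
    DoubleMapping result;
    std::FILE* file = std::tmpfile();
    if (file == nullptr) return result;
    const int fd = fileno(file);
    if (ftruncate(fd, static_cast<off_t>(bytes)) == 0) {
        void* a = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        void* b = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (a != MAP_FAILED && b != MAP_FAILED) {
            result.first = static_cast<int*>(a);
            result.second = static_cast<int*>(b);
        } else {
            if (a != MAP_FAILED) munmap(a, bytes);
            if (b != MAP_FAILED) munmap(b, bytes);
        }
    }
    std::fclose(file);  // existing mappings remain valid after the file is closed
    return result;
}

// Writes through one view and reads the value back through the other.
inline bool writeIsVisibleThroughBoth(const DoubleMapping& m) {
    assert(m.first != nullptr && m.second != nullptr);
    m.first[0] = 42;
    return m.second[0] == 42;
}

// Releases both views created by mapTwice.
inline void unmapBoth(DoubleMapping& m, std::size_t bytes) {
    if (m.first != nullptr) munmap(m.first, bytes);
    if (m.second != nullptr) munmap(m.second, bytes);
    m = DoubleMapping{};
}
```

The two pointers differ, yet a write through one is visible through the other without any copy. That is the zero-copy situation; SVM adds the further guarantee that both sides use the same address.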

Try this:

  1. Create the buffer with CL_MEM_ALLOC_HOST_PTR.
  2. Call clEnqueueMapBuffer to get a host-side pointer.

Sample code:

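// Let the driver allocate memory that both the CPU and the GPU can access.
deviceBuffer = clCreateBuffer(cl->context,
                              CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR,
                              sizeof(T) * dataLength,
                              nullptr,
                              &error);
checkError(error);

// Blocking map: returns a host-side pointer to the buffer's storage.
hostPtr = static_cast<T *>(clEnqueueMapBuffer(cl->memCmdQueue,
                                              deviceBuffer,
                                              CL_TRUE,
                                              CL_MAP_WRITE,
                                              0,
                                              sizeof(T) * dataLength,
                                              0, nullptr, nullptr, &error));
checkError(error);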
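A std::vector cannot adopt memory it did not allocate, so one option is to wrap the mapped hostPtr in a small non-owning view that offers vector-like access. The sketch below is an assumption about how that could be done, not part of the answer: `MappedView` and `fillWithIndices` are hypothetical names, and a plain array can stand in for the mapped pointer when trying it out. With C++20, std::span would serve the same purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>

// Hypothetical non-owning view over memory owned elsewhere (e.g. a mapped buffer).
template <typename T>
class MappedView {
public:
    MappedView(T* data, std::size_t size) : data_(data), size_(size) {
        assert(data != nullptr || size == 0);
    }
    T* data() const { return data_; }
    std::size_t size() const { return size_; }
    T* begin() const { return data_; }
    T* end() const { return data_ + size_; }
    T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    T* data_;           // not owned: the view never frees this memory
    std::size_t size_;  // number of elements, not bytes
};

// Stand-in for host-side work on the buffer: writes 0, 1, 2, ... into the view.
inline void fillWithIndices(MappedView<float> view) {
    std::iota(view.begin(), view.end(), 0.0f);
}
```

In the OpenCL case the view would be built as MappedView<T>(hostPtr, dataLength) after the map call and discarded before the buffer is unmapped with clEnqueueUnmapMemObject. Because the mapped address may differ between mappings, the view should be recreated after every map rather than cached.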
