
GPGPU processing vs Graphic GPU

Original 2022-02-20 22:01:16 9 1 c++/ gpu/ game-engine

I am developing a graphics API. I have already figured out how to create a window and draw pixels; all I'm missing right now is the GPU part. Can I use SYCL or OpenCL in its place? I want to use SYCL or OpenCL to access the GPU and do the computation for my graphics. Is that an efficient way to use the GPU?

1 Answer

Yes, you can use OpenCL to do graphics rendering. It is lightning fast when done right, and for special applications faster than anything else, but you will have to implement everything from scratch: the camera transformation from 3D to 2D screen coordinates, the Bresenham algorithm for line/triangle rasterization, and a z-buffer.

Think of it this way: an image is just an integer array. You can use OpenCL to write color values into the array at the corresponding pixel coordinates. Then pass it to your graphics API as the framebuffer and draw it on the screen.
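
To make the "image is an integer array" idea concrete, here is a minimal sketch in plain C++ rather than OpenCL C. The names `Frame`, `pack_rgb` and `set_pixel`, the 0x00RRGGBB packing and the row-major layout are assumptions for illustration; your windowing code may expect a different pixel format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack 8-bit red, green and blue channels into one integer as 0x00RRGGBB.
// (Assumed layout; check what your window/graphics API expects.)
inline std::int32_t pack_rgb(int r, int g, int b) {
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    return (r << 16) | (g << 8) | b;
}

// A frame is a flat integer array; pixel (x, y) lives at index x + y * width.
struct Frame {
    int width, height;
    std::vector<std::int32_t> pixels;
    Frame(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, 0) {}
};

// Write a color at pixel coordinates, ignoring anything outside the frame.
inline void set_pixel(Frame& frame, int x, int y, std::int32_t color) {
    if (x < 0 || x >= frame.width || y < 0 || y >= frame.height) return;
    frame.pixels[static_cast<std::size_t>(y) * frame.width + x] = color;
}
```

In an OpenCL kernel the same index arithmetic would apply to a global `int` buffer; `frame.pixels.data()` is the kind of pointer you would eventually hand to the drawing code.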

The critical parts on the GPU are parallelization and memory access:

Parallelization:
- For raster graphics, you can parallelize over an array of lines / an array of triangles. Each GPU thread rasterizes a single line/triangle on the frame.
- For raytracing graphics, you parallelize over the pixels on the frame. Each GPU thread computes the color for a single pixel.

Memory access: The arrays of lines/triangles must be in GPU memory at all times. PCIe data transfer of large arrays from/to the CPU in between frames will make it extremely slow.
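
As a rough sketch of the raster case, the function below is what one work-item could do: take its index, pick one line, and rasterize it with Bresenham's algorithm against a z-buffer. It is plain C++ standing in for a kernel body, and `Line`, `Target`, `rasterize_line` and `rasterize_all` are hypothetical names, as is the convention that a smaller z is closer to the camera.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Hypothetical line record: endpoints already in screen coordinates, with depth.
struct Line {
    int x0, y0; float z0;
    int x1, y1; float z1;
    std::int32_t color;
};

struct Target {
    int width, height;
    std::vector<std::int32_t> frame;  // color per pixel
    std::vector<float> zbuffer;       // nearest depth seen so far per pixel
    Target(int w, int h)
        : width(w), height(h),
          frame(static_cast<std::size_t>(w) * h, 0),
          zbuffer(static_cast<std::size_t>(w) * h,
                  std::numeric_limits<float>::infinity()) {}
};

// Body of one work-item: rasterize line number `id` with Bresenham's algorithm.
void rasterize_line(std::size_t id, const std::vector<Line>& lines, Target& t) {
    assert(id < lines.size());
    const Line& l = lines[id];
    int x = l.x0, y = l.y0;
    const int dx = std::abs(l.x1 - l.x0), sx = l.x0 < l.x1 ? 1 : -1;
    const int dy = -std::abs(l.y1 - l.y0), sy = l.y0 < l.y1 ? 1 : -1;
    const int steps = std::max(dx, -dy);  // pixel steps along the major axis
    int err = dx + dy;
    for (int i = 0; i <= steps; ++i) {
        if (x >= 0 && x < t.width && y >= 0 && y < t.height) {
            const float s = steps > 0 ? static_cast<float>(i) / steps : 0.0f;
            const float z = l.z0 + s * (l.z1 - l.z0);  // interpolated depth
            const std::size_t n = static_cast<std::size_t>(y) * t.width + x;
            // Depth test. On a real GPU this read-modify-write can race between
            // work-items and needs an atomic or another synchronization scheme.
            if (z < t.zbuffer[n]) { t.zbuffer[n] = z; t.frame[n] = l.color; }
        }
        const int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x += sx; }
        if (e2 <= dx) { err += dx; y += sy; }
    }
}

// CPU stand-in for the kernel launch: one call per line.
void rasterize_all(const std::vector<Line>& lines, Target& t) {
    for (std::size_t id = 0; id < lines.size(); ++id) rasterize_line(id, lines, t);
}
```

In OpenCL the loop in `rasterize_all` disappears: the index would come from `get_global_id(0)` and the line array, frame and z-buffer would be global buffers that stay on the device. The raytracing case has the same shape with the pixel index in place of the line index.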

The strategy for interactive graphics with OpenCL is as follows:

1. Initialize an array of lines / array of triangles in GPU memory. With a separate physics thread on the CPU, you can enqueue OpenCL kernels that compute some physics and update the line/triangle coordinates, all in GPU memory.
2. Get the camera parameters (rotation matrix from mouse movement, camera position, field of view, etc.) on the CPU, and copy them to GPU memory.
3. Call the rasterization kernels on the GPU that rasterize all lines/triangles in parallel on the frame/z-buffer.
4. Copy the finished frame to CPU memory and hand it over to the graphics API.
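
The per-frame part of this strategy (camera parameters in, finished frame out) can be sketched as follows, again in plain C++ as a stand-in for the OpenCL host and kernel code. Everything named here is an assumption for illustration: `Camera`, `DeviceState`, `project` and `render_frame` are hypothetical, the geometry is reduced to points to keep it short, and the camera is assumed to look along +z with a horizontal field of view in radians.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical camera parameters, refreshed on the CPU every frame.
struct Camera {
    float rotation[9];  // row-major 3x3 world-to-camera rotation
    Vec3 position;      // camera position in world space
    float fov;          // horizontal field of view in radians
};

// Stand-in for buffers that would stay resident in GPU memory between frames.
struct DeviceState {
    int width, height;
    std::vector<Vec3> points;         // scene geometry, reduced to points here
    std::vector<std::int32_t> frame;  // color per pixel
};

// Project a world-space point to pixel coordinates.
// Returns false if the point is behind the camera.
bool project(const Camera& c, int width, int height, Vec3 p, int& px, int& py) {
    const Vec3 d{p.x - c.position.x, p.y - c.position.y, p.z - c.position.z};
    const float* r = c.rotation;
    const float cx = r[0] * d.x + r[1] * d.y + r[2] * d.z;
    const float cy = r[3] * d.x + r[4] * d.y + r[5] * d.z;
    const float cz = r[6] * d.x + r[7] * d.y + r[8] * d.z;
    if (cz <= 0.0f) return false;
    const float focal = 0.5f * width / std::tan(0.5f * c.fov);  // in pixels
    px = static_cast<int>(std::floor(0.5f * width + focal * cx / cz));
    py = static_cast<int>(std::floor(0.5f * height - focal * cy / cz));  // y up
    return true;
}

// Clear the frame, rasterize every point, and return the frame for display.
const std::vector<std::int32_t>& render_frame(DeviceState& s, const Camera& c) {
    assert(s.frame.size() == static_cast<std::size_t>(s.width) * s.height);
    std::fill(s.frame.begin(), s.frame.end(), 0);
    for (const Vec3& p : s.points) {  // one work-item per point on a GPU
        int px = 0, py = 0;
        if (!project(c, s.width, s.height, p, px, py)) continue;
        if (px < 0 || px >= s.width || py < 0 || py >= s.height) continue;
        s.frame[static_cast<std::size_t>(py) * s.width + px] = 0xFFFFFF;
    }
    return s.frame;
}
```

With OpenCL, only the small `Camera` struct would cross PCIe on the way in and only the finished frame on the way out; `points` would be a device buffer that the physics kernels update in place.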

Disclaimer: I have written my own raster/raytracing graphics engine in OpenCL from scratch. See here for a real-time demo.