Transferring an OpenGL image on the GPU from C++ to Python

Question

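I built a simulator in C++ with a pybind11 interface so that I can run deep learning in Python with PyTorch. At each time step, I draw something from the simulator's scene using the SFML library (a wrapper around OpenGL). I draw it onto a texture and then read the pixels back from that texture like this: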

glBindTexture(GL_TEXTURE_2D, imageTexture.getTexture().getNativeHandle());
glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_UNSIGNED_BYTE, img.data());

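I then move the pixel vector img from C++ to Python through the pybind11 interface. The problem is that this GPU-to-CPU operation is very slow. Since the vector is transferred back to the GPU in Python afterwards for fast deep learning (CNN) operations, I would like to know how to avoid that step.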

My best guess so far is that at each step I should bind the texture in C++ (as in the code above) and then, right after that, get the bound texture in Python using CUDA while keeping it on the GPU. But I don't know how to do that, and I don't have a good understanding of how GPUs and CUDA/OpenGL work. Any pointers in the right direction would be greatly appreciated!

Answer 1

You should use a PBO (Pixel Buffer Object) for this operation.

Data transfers through a PBO are very fast, because the copy out of the framebuffer can run asynchronously:

https://www.khronos.org/opengl/wiki/Pixel_Buffer_Object
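
The speed of this approach comes from double buffering: each frame is written into one buffer while the buffer filled on the previous frame is read, so the reader never waits on the copy that was just issued. The following is a minimal CPU-only sketch of that index scheme. It makes no OpenGL calls, and `PingPongReadback` and `submit` are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// CPU-only model of two-buffer (ping-pong) readback. No OpenGL is involved:
// the two vectors merely stand in for the two PBOs.
struct PingPongReadback {
    std::array<std::vector<unsigned char>, 2> buffers;
    int readIndex = 0;
    int writeIndex = 1;

    explicit PingPongReadback(std::size_t bytes) {
        buffers[0].assign(bytes, 0);
        buffers[1].assign(bytes, 0);
    }

    // Stores the current frame in one buffer and returns the other buffer,
    // which holds whatever was submitted on the previous call.
    const std::vector<unsigned char>& submit(const std::vector<unsigned char>& frame) {
        assert(frame.size() == buffers[0].size());
        // Swap the roles of the two buffers.
        writeIndex = (writeIndex + 1) % 2;
        readIndex = (readIndex + 1) % 2;
        buffers[writeIndex] = frame;  // models glReadPixels into the bound PBO
        return buffers[readIndex];    // models mapping the other PBO
    }
};
```

In this model the first call to `submit` returns the zero-initialized buffer, and every later call returns the frame from one call earlier. That one-frame delay is the price of not waiting for the transfer.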

GLuint w_pbo[2];

// Create the PBO objects, and then

// do your drawing.

// Declare these once, outside the per-frame code, so the swap below persists across frames.
int w_readIndex = 0;
int w_writeIndex = 1;
glReadBuffer(GL_COLOR_ATTACHMENT0);
w_writeIndex = (w_writeIndex + 1) % 2;
w_readIndex = (w_readIndex + 1) % 2;
glBindBuffer(GL_PIXEL_PACK_BUFFER, w_pbo[w_writeIndex]);
// copy from framebuffer to PBO asynchronously. it will be ready in the NEXT frame
glReadPixels(0, 0, SCR_WIDTH, SCR_HEIGHT, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
// now read other PBO which should be already in CPU memory
glBindBuffer(GL_PIXEL_PACK_BUFFER, w_pbo[w_readIndex]);
unsigned char* downsampleData = (unsigned char*)glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);

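You can now use unsigned char* downsampleData to build the texture memory. Call glUnmapBuffer(GL_PIXEL_PACK_BUFFER) once you are done with the pointer.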
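
One detail to watch when consuming the mapped bytes: glReadPixels returns rows starting from the bottom of the framebuffer, so code that expects a top-down image sees it upside down. A minimal sketch of a row flip follows, assuming tightly packed GL_RGBA / GL_UNSIGNED_BYTE data (4 bytes per pixel, which also satisfies the default pack alignment of 4). `flipRowsRGBA` is a hypothetical helper, not part of OpenGL or SFML.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Copies a tightly packed RGBA8 image while reversing the order of its rows.
// src must point to width * height * 4 readable bytes.
std::vector<unsigned char> flipRowsRGBA(const unsigned char* src,
                                        std::size_t width,
                                        std::size_t height) {
    assert(src != nullptr);
    const std::size_t rowBytes = width * 4;  // 4 bytes per RGBA pixel
    std::vector<unsigned char> dst(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y) {
        // Destination row y comes from source row (height - 1 - y).
        std::copy_n(src + (height - 1 - y) * rowBytes, rowBytes,
                    dst.begin() + y * rowBytes);
    }
    return dst;
}
```

With the names from the answer, a call would look like `flipRowsRGBA(downsampleData, SCR_WIDTH, SCR_HEIGHT)`, made before the buffer is unmapped. The returned vector owns its own copy, so it stays valid after the unmap.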
