
"Interleaved rendering" in fragment shader

PS: Yes, I posted this question on Computer Graphics Stack Exchange, but I'm also posting here in the hope that more people will see it.

Intro

I'm trying to render multi-channel images (more than 4 channels, for the purpose of feeding them to a neural network). Since OpenGL doesn't support this natively, I have multiple 4-channel render buffers, into each of which I render the corresponding portion of the channels.

For example, if I need a multi-channel image of size 512 x 512 x 16, in OpenGL I have 4 render buffers of size 512 x 512 x 4. Now the problem is that the neural network expects the data with strides 512 x 512 x 16, i.e. the 16 channel values of one pixel are followed by the 16 channel values of the next pixel. Currently, however, I can only efficiently read my 4 render buffers via 4 calls to glReadPixels, which yields data with strides 4 x 512 x 512 x 4. Manually reordering the data on the client side is not an option for me, as it's too slow.
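To make the layout problem concrete, here is a small NumPy sketch (names and values are illustrative, not from the original post) of what the 4 readbacks produce and of the client-side reordering step that is too slow to do per frame:

```python
import numpy as np

H, W, C_PER_BUF, N_BUFS = 512, 512, 4, 4

# Simulate the 4 glReadPixels calls: each one yields an (H, W, 4)
# array, so together the data has strides 4 x 512 x 512 x 4.
buffers = [np.random.rand(H, W, C_PER_BUF).astype(np.float32)
           for _ in range(N_BUFS)]

# Client-side reordering into the (512, 512, 16) layout the network
# expects -- this copy is the step that is too slow.
interleaved = np.concatenate(buffers, axis=-1)  # (512, 512, 16)

# Channel c of pixel (y, x) comes from buffer c // 4, channel c % 4.
assert interleaved.shape == (H, W, C_PER_BUF * N_BUFS)
assert interleaved[3, 7, 9] == buffers[9 // 4][3, 7, 9 % 4]
```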

Main question

I've got an idea: render to a single 4-channel render buffer of size 512*4 x 512 x 4, because stride-wise it's equivalent to 512 x 512 x 16; we just treat each group of 4 horizontally adjacent pixels as a single pixel of the 16-channel output image. Let's call this "interleaved rendering".
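A quick NumPy sketch (illustrative, not part of the original post) of why the wide buffer has the right layout: because the 4-channel axis is contiguous in memory, reinterpreting a 512 x (4*512) x 4 readback as 512 x 512 x 16 is a zero-copy reshape, so no reordering is needed at all:

```python
import numpy as np

H, W, C = 512, 512, 4

# One wide render buffer: 4*W pixels per row, 4 channels per pixel,
# standing in for what a single glReadPixels call would return.
wide = np.random.rand(H, 4 * W, C).astype(np.float32)

# Row-major, the wide buffer is byte-identical to an (H, W, 16)
# image in which each run of 4 horizontal pixels supplies one
# 16-channel pixel -- so this reshape is a zero-copy view.
multi = wide.reshape(H, W, 4 * C)

assert multi.base is wide                  # no data was moved
assert multi[0, 0, 5] == wide[0, 1, 1]     # channel 5 = pixel 1, ch 1
```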

But this requires me to magically adjust my fragment shader so that every group of 4 consecutive fragments gets exactly the same interpolation of the vertex attributes. Is there any way to do that?

This bad illustration, with a single render buffer holding a 1024 x 512 4-channel image, is an example of how it should be rendered. With that I could extract data with stride 512 x 512 x 8 in a single glReadPixels call.

[image]

EDIT: better pictures. What I have now (4 render buffers):

[image]

What I want to do natively in OpenGL (this image was produced offline in Python):

[image]

Answer:

"But this requires me to magically adjust my fragment shader so that every group of 4 consecutive fragments gets exactly the same interpolation of the vertex attributes."

No, it would require a bit more than that. You would have to fundamentally change how rasterization works.

Rendering at 4x the width is rendering at 4x the width. That means stretching the resulting primitives relative to a square area. But that's not the effect you want. You need the rasterizer to rasterize at the original resolution and then replicate the rasterization products.

That's not possible.

From the comments:

"It just occurred to me that I could try to get a 512 x 512 x 2 image of texture coordinates from the vertex + fragment shaders, then stitch it with itself to make it 4 times wider (so we get the same interpolation in each group of 4), and form the final image from that."

This is a good idea. You'll need to render whatever interpolated values you need to the original-size texture, similar to how deferred rendering works, so it may be more than just 2 values. You could just store the gl_FragCoord.xy values and then use them to compute whatever you need, but it's probably easier to store the interpolated values directly.

I would suggest doing a texelFetch when reading the texture, since you can specify exact integer texel coordinates. The integer coordinates you need can be computed from gl_FragCoord as follows:

ivec2 texCoords = ivec2(int(gl_FragCoord.x * 0.25f), int(gl_FragCoord.y));
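As a sanity check of that coordinate math, here is a Python emulation (not shader code; function names are mine): GLSL samples gl_FragCoord.x at pixel centers (x + 0.5), and int((x + 0.5) * 0.25) still equals x / 4 for integer x, so every run of 4 consecutive fragments fetches the same source texel, while the fragment's position within the run can select which 4-channel slice of the 16-channel pixel it writes:

```python
def texel_x(frag_x: int) -> int:
    # Emulates int(gl_FragCoord.x * 0.25), with gl_FragCoord.x at
    # the pixel center frag_x + 0.5; equals frag_x // 4.
    return int((frag_x + 0.5) * 0.25)

def channel_group(frag_x: int) -> int:
    # Which 4-channel slice of the 16-channel output pixel this
    # fragment writes (0..3); hypothetical helper, not in the post.
    return frag_x % 4

# Each group of 4 consecutive fragments maps to one source texel:
assert [texel_x(x) for x in range(8)] == [0, 0, 0, 0, 1, 1, 1, 1]
assert [channel_group(x) for x in range(8)] == [0, 1, 2, 3, 0, 1, 2, 3]
```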
