
Vulkan - 1 uniform buffer, N meshes - dynamic VkDeviceMemory

Original post: 2018-11-01 16:09:24. Tags: buffer, vulkan, memory-reallocation

Suppose I'm rendering simple cubes at random positions.

Starting with 3 cubes, the application acquires a VkBuffer handle and binds it to a VkDeviceMemory in order to store the model matrices of all cubes consecutively in it; the shader later accesses them via the descriptor set. The VkDeviceMemory has just enough memory for those 3 cubes.

What I want to do is, every time the user presses a key, a new cube should pop up somewhere. My question is, how should I go about resizing that memory? Could you provide an overview of the steps I should go through?

I realize I could use a separate VkBuffer / VkDeviceMemory for each cube, but I do not want to do that. Everything I have read says that is something of an anti-pattern.

Should I just discard the VkDeviceMemory, allocate a new one with the right size, and call it a day? What about descriptor sets, do they need any special handling?

In some places I have read that you could allocate a very big chunk of memory up front, so that you are on the safe side as the number of cubes grows. At some point, I suppose, you would have to stop letting more of them pop up because the limit has been reached. Is there a way around this self-imposed limit?

EDIT: I also realize allocating one small chunk at a time is a bad idea. What I'm interested in is the reallocation itself, and what it entails.

3 answers

To answer the question "how do I reallocate and start using the new memory", ignoring questions about allocation strategy: reallocation is no different from allocating a new thing, populating it with the data you want, and then starting to use it. So you need essentially all of the same steps as for your initial allocation.

The thing to be aware of is that most objects that get referenced in a command buffer can't safely be modified until that command buffer finishes executing. Typically, you'll be recording commands for frame N+1 while the commands for frame N are still executing. So you want to avoid updating mutable objects (like descriptor sets) to start using the new allocation; instead, you want a new descriptor set.

So here's the list of things you need:

The buffer itself: a VkBuffer and a VkDeviceMemory. If you allocated extra space in your current VkDeviceMemory so it's big enough for both the old and new VkBuffers, then you don't need a new VkDeviceMemory object. Either way, create a new VkBuffer of the desired size, and bind it to an unused portion of a VkDeviceMemory object.
A way to bind the buffer to the pipeline: a VkDescriptorSet. You'll use the same descriptor set layout as before; that doesn't change. So allocate a new descriptor set from your descriptor pool, and use vkUpdateDescriptorSets to set the buffer descriptor to point to your new buffer (you can also copy other descriptors from your previous descriptor set if they don't need to change).
Finally, when building the command buffer for the frame where you want to use the new buffer, pass the new descriptor set to vkCmdBindDescriptorSets instead of the old one.
Eventually, after all the command buffers that used the old buffer and descriptor set have finished, you can free the buffer and descriptor set. For the descriptor set, you might just return it to the pool, or keep it around and reuse it the next time you need to reallocate the buffer. The device memory used by the old buffer can then be deallocated, or you can keep it around for reuse later.
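The "free it once the old command buffers have finished" step can be sketched as a small retire queue. This is only a CPU-side model, not code from the answer: `RetireQueue`, `retire`, and `collect` are hypothetical names, frame indices stand in for whatever fence or timeline mechanism the renderer uses, and the callbacks are where calls such as vkDestroyBuffer, vkFreeMemory, or vkFreeDescriptorSets would go.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper: holds cleanup callbacks until the GPU has finished
// the last frame that could still reference the old buffer/descriptor set.
// Assumes retire() is called with non-decreasing frame indices.
class RetireQueue {
public:
    // lastUsedFrame: index of the last submitted frame that used the resource.
    // destroy: in a real renderer this would call vkDestroyBuffer,
    // vkFreeMemory, vkFreeDescriptorSets, or return the objects to a pool.
    void retire(std::uint64_t lastUsedFrame, std::function<void()> destroy) {
        pending_.push_back({lastUsedFrame, std::move(destroy)});
    }

    // completedFrame: highest frame index known to have finished on the GPU
    // (for example, because its fence has signaled).
    // Returns the number of resources released by this call.
    std::size_t collect(std::uint64_t completedFrame) {
        std::size_t released = 0;
        while (!pending_.empty() &&
               pending_.front().lastUsedFrame <= completedFrame) {
            pending_.front().destroy();
            pending_.pop_front();
            ++released;
        }
        return released;
    }

    std::size_t size() const { return pending_.size(); }

private:
    struct Entry {
        std::uint64_t lastUsedFrame;
        std::function<void()> destroy;
    };
    std::deque<Entry> pending_;
};
```

A renderer would call `retire` right after switching to the new buffer and descriptor set, and `collect` once per frame after checking which frames have completed.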

Agree with what Jherico said, but there's an additional option, which is to not constrain yourself to a single VkBuffer.

Typically you want to think about VkDeviceMemory in multiples of memory pages (4 KiB), and some devices even like multiples of 64 KiB. Even if you allocate something smaller than that, you'll very likely be using up that much memory, since the OS kernel can't give you things in smaller chunks.
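As a rough illustration of the rounding involved, here is a minimal sketch. The helper name `roundUpToMultiple` is made up for this example, and the 4 KiB and 64 KiB figures are the ones quoted above; in a real application the size and alignment to use should come from vkGetBufferMemoryRequirements rather than from constants.

```cpp
#include <cassert>
#include <cstdint>

// Rounds size up to the next multiple of granularity.
// Precondition: granularity > 0.
constexpr std::uint64_t roundUpToMultiple(std::uint64_t size,
                                          std::uint64_t granularity) {
    return (size + granularity - 1) / granularity * granularity;
}
```

With these numbers, the 192 bytes needed for three 64-byte transforms still occupy a full 4096-byte page.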

So if each transform needs 64 B, then you might just plan to allocate in chunks of 1k transforms. Allocate one 64 KiB VkBuffer / VkDeviceMemory pair, and when it fills up allocate a second pair, when that fills up allocate a third pair, etc.
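The bookkeeping for such chunks is plain integer arithmetic. The sketch below assumes the figures above (64 B per transform, 1024 transforms per chunk); `TransformLocation`, `locate`, and `chunksNeeded` are illustrative names, not part of any API.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kTransformSize = 64;         // bytes per transform (assumed)
constexpr std::size_t kTransformsPerChunk = 1024;  // 1024 * 64 B = 64 KiB

struct TransformLocation {
    std::size_t chunk;       // which VkBuffer / VkDeviceMemory pair
    std::size_t byteOffset;  // offset of the transform inside that buffer
};

// Maps a cube index to the chunk that stores its transform.
constexpr TransformLocation locate(std::size_t cubeIndex) {
    return {cubeIndex / kTransformsPerChunk,
            (cubeIndex % kTransformsPerChunk) * kTransformSize};
}

// Number of chunks that must exist to hold cubeCount transforms.
constexpr std::size_t chunksNeeded(std::size_t cubeCount) {
    return (cubeCount + kTransformsPerChunk - 1) / kTransformsPerChunk;
}
```

When the user adds cube number 1025, `chunksNeeded` goes from 1 to 2, which is the moment to allocate the second pair.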

When you go to draw, you'll need a separate draw call for each chunk, with buffer rebinding in between. If you find that in practice you end up drawing huge numbers of cubes and the number of draw calls and state changes is limiting performance, use a larger chunk size: you're going to use the memory anyway, so allocating it in small increments isn't helping anything.

If you do this, then every time you allocate a new chunk, you need a new descriptor set for it. Create that at the same time, and then between draws just bind the descriptor set for the buffer you're about to use.
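One way to picture the per-chunk drawing is to split the cube count into batches, one per chunk. This is a sketch under assumptions: `DrawBatch` and `buildDrawBatches` are hypothetical, and it presumes one instanced draw per chunk, which the answer does not spell out. Each batch would correspond to a vkCmdBindDescriptorSets call with that chunk's descriptor set followed by a draw.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct DrawBatch {
    std::size_t chunk;          // selects the descriptor set to bind
    std::size_t instanceCount;  // number of cubes stored in that chunk
};

// Splits cubeCount cubes into one batch per chunk of chunkCapacity transforms.
// Precondition: chunkCapacity > 0.
std::vector<DrawBatch> buildDrawBatches(std::size_t cubeCount,
                                        std::size_t chunkCapacity) {
    std::vector<DrawBatch> batches;
    for (std::size_t first = 0; first < cubeCount; first += chunkCapacity) {
        batches.push_back({first / chunkCapacity,
                           std::min(chunkCapacity, cubeCount - first)});
    }
    return batches;
}
```

For 2500 cubes and a chunk capacity of 1024 this yields three batches, the last one only partly filled.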

If instead you reallocate buffers, then you either need to wait for previous rendering to finish and update the descriptor set you have before drawing with the reallocated buffer, or you can create a new descriptor set and draw immediately, and then later recycle the old descriptor set when you know the drawing that used it is complete.

"The VkDeviceMemory has just enough memory for those 3 cubes."

Why? If you want to support an arbitrary number of cubes, then you should be managing your memory such that you can deal with changes in the number of things like transforms with a minimum number of reallocations.

"Should I just discard the VkDeviceMemory, allocate a new one with the right size, and call it a day?"

For structures where the number is variable, you should allocate for your current needs and for likely future needs as well. At the same time, you don't want to over-allocate wildly. For things like transforms, which are very small in relation to the amount of memory typically available on modern GPUs, it's not unreasonable to start off by allocating for, say, 1024 unique transforms. If it's a simple mat4 transform (16 32-bit floats), it only consumes 64 bytes, so 1k of them is only going to be 64 KiB of memory. This is trivial in comparison to typical memory loads for textures or even complex meshes.

If you actually end up consuming them all, you can reallocate for more. Depending on your likely usage pattern you can either reallocate with a fixed block size like 1024, or do exponentially increasing allocation, like always allocating twice your current size. You can google vector reallocation for more information on strategies for dealing with contiguous memory that might grow beyond its current bounds. Here's an article on the topic
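The two growth policies mentioned above can be compared with a minimal sketch. The function names `growFixed` and `growDoubling` are made up for illustration, and capacities are counted in transforms rather than bytes.

```cpp
#include <cassert>
#include <cstddef>

// Fixed-block growth: round the required count up to a multiple of blockSize.
// Precondition: blockSize > 0.
std::size_t growFixed(std::size_t required, std::size_t blockSize) {
    return (required + blockSize - 1) / blockSize * blockSize;
}

// Exponential growth: keep doubling the current capacity until it fits.
std::size_t growDoubling(std::size_t current, std::size_t required) {
    std::size_t capacity = current == 0 ? 1 : current;
    while (capacity < required) {
        capacity *= 2;
    }
    return capacity;
}
```

Starting from 1024 transforms, a request for 5000 gives 5120 with fixed blocks of 1024 and 8192 with doubling. Doubling wastes more space at any given moment but needs fewer reallocations as the count keeps growing.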