Vulkan - 1 uniform buffer, N meshes - dynamic VkDeviceMemory

Suppose I'm rendering simple cubes at random positions.

Starting with 3 cubes, the application creates a VkBuffer and binds it to a VkDeviceMemory allocation that stores the model matrices of all cubes consecutively; the shader later accesses them via the descriptor set. The VkDeviceMemory has just enough memory for those 3 cubes.

What I want to do is, every time the user presses a key, a new cube should pop up somewhere. My question is, how should I go about resizing that memory? Could you provide an overview of the steps I should go through?

I realize I could use a separate VkBuffer / VkDeviceMemory for each cube, but I do not want to do that. Everything I have read says that is something of an anti-pattern.

Should I just discard the VkDeviceMemory, allocate a new one with the right size, and call it a day? What about descriptor sets, do they need any special handling?

In some places I have read that you could allocate a very big chunk of memory up front, so you are on the safe side as more and more cubes appear, up to the point where, I suppose, you would stop allowing new ones because a limit has been reached. Is there a way around this self-imposed limit?

EDIT: I also realize allocating one small chunk at a time is a bad idea. What I'm interested in is the reallocation itself, and what it entails.

To answer the question "how do I reallocate and start using the new memory", ignoring questions about allocation strategy: reallocation is no different than allocating a new thing, populating it with the data you want, and then starting to use it. So you need essentially all of the same steps as for your initial allocation.

The thing to be aware of is that most objects that get referenced in a command buffer can't safely be modified until that command buffer finishes executing. Typically, you'll be recording commands for frame N+1 while the commands for frame N are still executing. So you want to avoid updating mutable objects (like descriptor sets) to start using the new allocation; instead, you want a new descriptor set.

So here's the list of things you need:

  1. The buffer itself: a VkBuffer and a VkDeviceMemory. If you allocated extra space in your current VkDeviceMemory so it's big enough for both the old and new VkBuffers, then you don't need a new VkDeviceMemory object. Either way, create a new VkBuffer of the desired size, and bind it to an unused portion of a VkDeviceMemory object.

  2. A way to bind the buffer to the pipeline: a VkDescriptorSet. You'll use the same descriptor set layout as before; that doesn't change. So allocate a new descriptor set from your descriptor pool, and use vkUpdateDescriptorSets to set the buffer descriptor to point to your new buffer (you can also copy other descriptors from your previous descriptor set if they don't need to change).

  3. Finally, when building the command buffer for the frame where you want to use the new buffer, pass the new descriptor set to vkCmdBindDescriptorSets instead of the old one.

  4. Eventually, after all the command buffers that used the old buffer and descriptor set have finished, you can free the buffer and descriptor set. For the descriptor set, you might just return it to the pool, or keep it around and reuse it the next time you need to reallocate the buffer. The device memory used by the old buffer can then be deallocated, or you can keep it around for reuse later.
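Step 4 above can be sketched as a small "retire queue". This is only a model under stated assumptions: plain integer ids stand in for the VkBuffer and VkDescriptorSet handles so that it compiles without the Vulkan headers, and the names UniformResources, RetireQueue, retire and collect are hypothetical, not part of any Vulkan API. It also assumes the application tracks a monotonically increasing frame index and learns (for example from a fence) which frames have finished on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for a (VkBuffer, VkDescriptorSet) pair. Integer ids
// are used so the sketch does not depend on the Vulkan headers.
struct UniformResources {
    std::uint32_t bufferId;
    std::uint32_t descriptorSetId;
};

// Holds replaced resources until the GPU has finished the last frame that
// could still reference them.
class RetireQueue {
public:
    // Call right after switching to the new buffer. lastFrameUsed is the index
    // of the last frame whose command buffer referenced the old resources.
    void retire(UniformResources old, std::uint64_t lastFrameUsed) {
        // Frames are retired in submission order, so the queue stays sorted.
        assert(pending_.empty() || pending_.back().lastFrameUsed <= lastFrameUsed);
        pending_.push_back({old, lastFrameUsed});
    }

    // Call once every frame up to and including completedFrame is known to
    // have finished executing. Returns the resources that are now safe to
    // destroy or recycle.
    std::vector<UniformResources> collect(std::uint64_t completedFrame) {
        std::vector<UniformResources> safe;
        while (!pending_.empty() && pending_.front().lastFrameUsed <= completedFrame) {
            safe.push_back(pending_.front().resources);
            pending_.pop_front();
        }
        return safe;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    struct Entry {
        UniformResources resources;
        std::uint64_t lastFrameUsed;
    };
    std::deque<Entry> pending_;
};
```

In a real renderer, the entries returned by collect would be the place to destroy the old buffer and free or recycle the old descriptor set; until then the old objects simply stay alive alongside the new ones.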

Agree with what Jherico said, but there's an additional option, which is to not constrain yourself to a single VkBuffer.

Typically you want to think about VkDeviceMemory in multiples of memory pages (4 KiB), and some devices even like multiples of 64 KiB. Even if you allocate something smaller than that, you'll very likely be using up that much memory, since the OS kernel can't give you things in smaller chunks.

So if each transform needs 64 B, then you might just plan to allocate in chunks of 1k transforms. Allocate one 64 KiB VkBuffer / VkDeviceMemory pair, and when it fills up allocate a second pair, when that fills up allocate a third pair, etc.

When you go to draw, you'll need a separate draw call for each chunk, with buffer rebinding in between. If you find that in practice you end up drawing huge numbers of cubes and the number of draw calls and state changes is limiting performance, use a larger chunk size -- you're going to use the memory anyway, so allocating it in small increments isn't helping anything.

If you do this, then every time you allocate a new chunk, you need a new descriptor set for it. Create that at the same time, and then between draws just bind to the descriptor set for the buffer you're about to use.
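The bookkeeping for the chunked approach might look roughly like the following. It is a sketch that assumes the figures used above (64 B per transform, 1024 transforms per 64 KiB chunk); the names ChunkDraw, Slot, slotForCube and planDraws are made up for illustration, and no Vulkan calls are shown. planDraws only decides how many cubes each chunk contributes; binding the chunk's descriptor set and issuing the draw is left to the renderer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kTransformBytes = 64;        // one 4x4 float matrix
constexpr std::uint32_t kTransformsPerChunk = 1024;  // transforms per buffer
constexpr std::uint32_t kChunkBytes = kTransformBytes * kTransformsPerChunk;  // 64 KiB

// Where a given cube's transform lives.
struct Slot {
    std::uint32_t chunkIndex;  // which buffer / descriptor set holds it
    std::uint32_t byteOffset;  // offset of the transform inside that buffer
};

// One draw call per chunk: bind the chunk's descriptor set, draw cubeCount cubes.
struct ChunkDraw {
    std::uint32_t chunkIndex;
    std::uint32_t cubeCount;
};

Slot slotForCube(std::uint32_t cubeIndex) {
    Slot s{cubeIndex / kTransformsPerChunk,
           (cubeIndex % kTransformsPerChunk) * kTransformBytes};
    assert(s.byteOffset + kTransformBytes <= kChunkBytes);  // stays inside the chunk
    return s;
}

// Splits cubeCount cubes into per-chunk draws; only the last chunk may be partial.
std::vector<ChunkDraw> planDraws(std::uint32_t cubeCount) {
    std::vector<ChunkDraw> draws;
    std::uint32_t chunk = 0;
    std::uint32_t remaining = cubeCount;
    while (remaining > 0) {
        std::uint32_t n = std::min(remaining, kTransformsPerChunk);
        draws.push_back({chunk, n});
        remaining -= n;
        ++chunk;
    }
    return draws;
}
```

With these constants, 2500 cubes would need three chunks: two full ones and a third holding the remaining 452 transforms. A new chunk (and its descriptor set) is only needed when a cube's slot lands in a chunk index that has not been allocated yet.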

If instead you reallocate buffers, then you either need to wait for previous rendering to finish and update the descriptor set you have before drawing with the reallocated buffer, or you can create a new descriptor set and draw immediately, and then later recycle the old descriptor set when you know the drawing that used it is complete.

"The VkDeviceMemory has just enough memory for those 3 cubes."

Why? If you want to support an arbitrary number of cubes, then you should be managing your memory such that you can deal with changes in numbers of things like transforms with a minimum number of re-allocations.

"Should I just discard the VkDeviceMemory, allocate a new one with the right size, and call it a day?"

For structures where the number is variable, you should allocate for your current needs and for likely future needs as well. At the same time, you don't want to over-allocate wildly. For things like transforms, which are very small in relation to the amount of memory typically available on modern GPUs, it's not unreasonable to start off by allocating for, say, 1024 unique transforms. If it's a simple mat4 transform, each one consumes only 64 bytes (16 floats of 4 bytes each), so 1024 of them take just 64 KiB of memory. This is trivial in comparison to typical memory loads for textures or even complex meshes.

If you actually end up consuming them all, you can reallocate for more. Depending on your likely usage pattern, you can either reallocate with a fixed block size like 1024, or do exponentially increasing allocation, like always allocating twice your current size. You can google vector reallocation for more information on strategies for dealing with contiguous memory that might grow beyond its current bounds. Here's an article on the topic
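The two growth strategies can be compared with a few lines of arithmetic. This is a minimal sketch, assuming a transform is a 4x4 float matrix of 64 bytes and that capacity is counted in transforms; the names growFixed, growDoubling and bufferBytes are hypothetical helpers, not Vulkan functions.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kBytesPerTransform = 64;   // 16 floats * 4 bytes
constexpr std::size_t kBlockTransforms = 1024;   // fixed growth step

// Fixed-block growth: round the required count up to a multiple of the block.
std::size_t growFixed(std::size_t needed) {
    return ((needed + kBlockTransforms - 1) / kBlockTransforms) * kBlockTransforms;
}

// Doubling growth: keep doubling the current capacity until it is large enough.
std::size_t growDoubling(std::size_t capacity, std::size_t needed) {
    assert(capacity > 0);  // doubling zero would never terminate
    while (capacity < needed) {
        capacity *= 2;
    }
    return capacity;
}

// Size in bytes of a buffer that holds `capacity` transforms.
std::size_t bufferBytes(std::size_t capacity) {
    return capacity * kBytesPerTransform;
}
```

Starting from 1024 transforms and needing 5000, fixed-block growth ends at 5120 transforms after several reallocations along the way, while doubling jumps to 8192 in fewer steps at the cost of more unused space. Either way, the result of bufferBytes is the size to request for the new buffer.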
