Is there a better data structure to store components and their associated entities?

Question

I'm writing a small Entity Component System (ECS) in JavaScript (TypeScript, specifically). It currently works, but I was wondering whether it could be more efficient under the hood. In an ECS, entities are basically just bags of components. So a player entity might have a HealthComponent, PositionComponent, SpriteComponent, etc. You can then create a RenderingSystem that queries all entities with a PositionComponent and a SpriteComponent and renders them. Like this:

for (let entity of scene.query(CT.Position, CT.Sprite)) {
  // draw entity
}

To make querying efficient, we avoid iterating through every entity in the scene on each call to check whether it has a Position component and a Sprite component. Instead, we cache the result after the first query call and keep it updated, so every later query call can simply return the stored list of entities.

So, as an example, the cache might look like this:

{ "6,1,20" => Map(1) }
{ "2,3,1,6" => Map(1) }
{ "2,3" => Map(31) }
{ "9" => Map(5) }
{ "2,8" => Map(5) }
{ "29,24,2" => Map(5) }

// etc..

The numbers are the values of enum members like CT.Position, CT.Sprite, etc. In this case, CT.Position is 2 and CT.Sprite is 3, and there are 31 entities that have those two components. So when querying all entities that have those two components, we can just return that list of entities rather than computing it each time.

This all works, but it's not very efficient, because adding an entity to the scene (and removing one!) is an O(n) operation and also involves a lot of string splitting and concatenation. You need to iterate through every item in the cache to see whether that entry's component list is included in the entity's list of components.
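To make the cost concrete, here is a minimal sketch of the approach described above. It is not the code from the Playground link: the names NaiveScene, Entity, add and remove are hypothetical, and component types are assumed to be plain integers.

```typescript
// Hypothetical sketch of the current approach; names are illustrative only.
type Entity = { id: number; components: Set<number> };

class NaiveScene {
  private entities = new Set<Entity>();
  // Cache key is the queried component ids joined by commas, e.g. "2,3".
  private cache = new Map<string, Map<number, Entity>>();

  query(...types: number[]): Map<number, Entity> {
    const key = types.join(",");
    let hit = this.cache.get(key);
    if (!hit) {
      // First call for this key: scan every entity once, then keep the result.
      hit = new Map();
      for (const e of this.entities) {
        if (types.every((t) => e.components.has(t))) hit.set(e.id, e);
      }
      this.cache.set(key, hit);
    }
    return hit;
  }

  add(e: Entity): void {
    this.entities.add(e);
    // The costly part: visit every cache entry and re-parse its string key.
    for (const [key, hit] of this.cache) {
      const types = key.split(",").map(Number);
      if (types.every((t) => e.components.has(t))) hit.set(e.id, e);
    }
  }

  remove(e: Entity): void {
    this.entities.delete(e);
    // Removal also touches every cache entry.
    for (const hit of this.cache.values()) hit.delete(e.id);
  }
}
```

Queries after the first are a single map lookup, while add and remove each walk the whole cache.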

Is there any way to improve this to be more like O(log n), or preferably O(1)? Let me know if this is all clear, or if there are any details that need to be clarified.

Here's a link to a reproduction example on the TypeScript Playground.

Answer 1

I expect that the number of queries in the cache will be pretty small, since each query is tied to a piece of code that processes its results. So iterating over the query list and performing some operation for each one won't be that expensive. But if you have problems when adding or removing a whole bunch of entities, then that can certainly be addressed.

First, the string representation you use for a subset of component types is indeed pretty inefficient. There are lots of alternatives. Maybe try something like this:

First, assign an integer to each component type (you did this already)
Sort the component types in the subset by their integer
Build a string using each integer as a character

This representation is not too fancy, but it lets you quickly get at the component types in a subset using charCodeAt(). You can use that to test for subsets by walking through both strings simultaneously, or by walking through one while doing a binary search in the other.
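A minimal sketch of that encoding and of the simultaneous walk, assuming every component type is an integer between 0 and 65535 so that it fits in one UTF-16 code unit. The helper names encodeSubset and isSubset are made up for illustration.

```typescript
// Hypothetical helpers; assumes component type ids are integers in 0..65535.

/** Encode a set of component types as a string: one code unit per type, sorted ascending. */
function encodeSubset(types: number[]): string {
  const sorted = [...new Set(types)].sort((a, b) => a - b);
  return String.fromCharCode(...sorted);
}

/** True if every type in `query` also appears in `entity` (both produced by encodeSubset). */
function isSubset(query: string, entity: string): boolean {
  let j = 0;
  for (let i = 0; i < query.length; i++) {
    const want = query.charCodeAt(i);
    // Both strings are sorted, so the entity cursor never moves backwards.
    while (j < entity.length && entity.charCodeAt(j) < want) j++;
    if (j === entity.length || entity.charCodeAt(j) !== want) return false;
    j++;
  }
  return true;
}

const entityKey = encodeSubset([1, 2, 3, 6, 25]);
console.log(isSubset(encodeSubset([3, 2]), entityKey)); // true
console.log(isSubset(encodeSubset([2, 8]), entityKey)); // false
```

Because both strings are sorted, the test takes time linear in their combined length and needs no splitting or parsing.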

The real improvements, however, would come from grouping entities by the subset of component types that they have. There are lots of ways to do that. I think something like this would work for you:

For each entity, precalculate its component-type-subset string
For each subset in use, maintain the list of cached queries that match that subset. This list only needs to be modified when you introduce a new query or a new subset.
When an entity is added or removed, get the queries for its subset and add it directly to, or remove it directly from, their results.
When you get a new query, make a set of the subsets it matches, add it to the query list for those subsets, and check each entity to see if its subset is contained in the match set.
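The steps above could be sketched roughly as follows. This is one possible reading, not a definitive implementation: SubsetIndex, keyOf and containsAll are hypothetical names, subset keys are comma-joined sorted integers for readability, and entities are represented by numeric ids. Entity ids are also stored per subset, so a new query copies whole groups rather than checking entities one by one.

```typescript
// Hypothetical sketch; all names here are illustrative.
type EntityId = number;

// Canonical key for a set of component types: sorted integers joined by commas.
const keyOf = (types: number[]): string =>
  [...new Set(types)].sort((a, b) => a - b).join(",");

const containsAll = (have: number[], want: number[]): boolean =>
  want.every((t) => have.includes(t));

class SubsetIndex {
  private subsets = new Map<string, number[]>(); // subset key -> its types
  private members = new Map<string, Set<EntityId>>(); // subset key -> entities
  private matching = new Map<string, Set<EntityId>[]>(); // subset key -> query results
  private queries = new Map<string, { types: number[]; result: Set<EntityId> }>();

  add(id: EntityId, types: number[]): void {
    const key = this.subsetKey(types);
    this.members.get(key)!.add(id);
    // Only the queries known to match this subset are touched.
    for (const result of this.matching.get(key)!) result.add(id);
  }

  remove(id: EntityId, types: number[]): void {
    const key = this.subsetKey(types);
    this.members.get(key)!.delete(id);
    for (const result of this.matching.get(key)!) result.delete(id);
  }

  query(...types: number[]): Set<EntityId> {
    const key = keyOf(types);
    const cached = this.queries.get(key);
    if (cached) return cached.result;
    // New query: link it to every known subset it matches and copy their members.
    const result = new Set<EntityId>();
    this.queries.set(key, { types, result });
    for (const [subset, subsetTypes] of this.subsets) {
      if (!containsAll(subsetTypes, types)) continue;
      this.matching.get(subset)!.push(result);
      for (const id of this.members.get(subset)!) result.add(id);
    }
    return result;
  }

  // Returns the subset key, registering the subset the first time it is seen.
  private subsetKey(types: number[]): string {
    const key = keyOf(types);
    if (!this.subsets.has(key)) {
      this.subsets.set(key, types);
      this.members.set(key, new Set());
      const lists: Set<EntityId>[] = [];
      for (const q of this.queries.values()) {
        if (containsAll(types, q.types)) lists.push(q.result);
      }
      this.matching.set(key, lists);
    }
    return key;
  }
}
```

With this layout, adding or removing an entity costs one map lookup plus one set operation per matching query. The scan over all queries or all subsets happens only when a new subset or a new query first appears.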

Answer 2

Okay, I think I have a tentative answer to this. It seems to be working, but the code is complex enough that I find it hard to follow, so I'm not sure whether it actually works or only appears to and is in fact broken.

For the solution, I wanted to preserve query performance, because every system runs its query on every frame update, so querying is executed 1000x more often than entity creation or deletion. Currently, querying works as an amortized O(1) algorithm: it first checks whether the cache contains this combination of components. If it doesn't, it builds the list of entities associated with this grouping of components (archetype), and from then on that list is fetched from the cache. The cache is always kept up to date.

The issue in my question was that while it was nice to have an O(1) query operation, I wanted more efficient add and remove operations, as they were O(n*k), where n was the number of distinct query operations (members of the cache) and k was the number of components in the entity. That is, whenever an entity was created or destroyed, the program had to iterate through each item in the cache and check whether the entity belonged to that query. If so, it was added to that set, and if not, it was removed.

The idea I had this morning was to implement another cache / mapping. The original cache maps from a query's component listing (archetype) to the set of entities that hold those components. Example:

{ "6,1,20" => Set(1) }
{ "2,3,1,6" => Set(1) }
{ "2,3" => Set(31) }

Let's say 2 refers to the PositionComponent and 3 refers to the SpriteComponent. This means that all entities that contain those two components can be found within that set of 31 entities.

So, my tentative solution to my original question was to also keep a mapping from an entity's list of components to all the cache entries it is a member of. Say we have an entity with the following components: 1, 2, 3, 6, 25. Then its corresponding entry in this second cache would look like this:

1,2,3,6,25 => [ "2,3,1,6", "2,3" ]

The first time an entity of that archetype (component listing) is constructed, that list is built by scanning the cache. Afterwards, it is simply maintained. Then, whenever there is a request to create an entity of that archetype, we can query this second cache to find out which cache entries we need to modify.

That way, instead of iterating through the entire cache and then through each cache item's components to determine whether the entity should be a member, we simply query our secondary cache to determine which cache entries it is a member of. So I believe the amortized complexity shrinks from O(n*k) to O(c), where c is the number of cache entries the entity is a member of.
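A minimal sketch of this two-level idea, reusing the example keys from above. It is not my actual code: TwoLevelCache, registerQuery, queriesFor and the other names are made up for illustration, and it assumes queries are registered before matching entities are added (filling a newly registered query with entities that already exist is left out).

```typescript
// Hypothetical sketch; names and structure are illustrative only.
type Entity = { id: number; components: number[] };

const parse = (key: string): number[] => key.split(",").map(Number);
const matches = (queryKey: string, components: number[]): boolean =>
  parse(queryKey).every((t) => components.includes(t));

class TwoLevelCache {
  // Primary cache: query key (e.g. "2,3") -> entities matching that query.
  private cache = new Map<string, Set<Entity>>();
  // Secondary cache: archetype key (e.g. "1,2,3,6,25") -> query keys it belongs to.
  private memberOf = new Map<string, string[]>();

  // Simplification: existing entities are not backfilled into a new query.
  registerQuery(queryKey: string): void {
    if (this.cache.has(queryKey)) return;
    this.cache.set(queryKey, new Set());
    // Keep already computed archetype entries up to date.
    for (const [archetype, keys] of this.memberOf) {
      if (matches(queryKey, parse(archetype))) keys.push(queryKey);
    }
  }

  // Computed by scanning the cache once per archetype, then reused.
  queriesFor(components: number[]): string[] {
    const archetype = [...components].sort((a, b) => a - b).join(",");
    let keys = this.memberOf.get(archetype);
    if (!keys) {
      keys = [...this.cache.keys()].filter((q) => matches(q, components));
      this.memberOf.set(archetype, keys);
    }
    return keys;
  }

  add(e: Entity): void {
    for (const q of this.queriesFor(e.components)) this.cache.get(q)!.add(e);
  }

  remove(e: Entity): void {
    for (const q of this.queriesFor(e.components)) this.cache.get(q)!.delete(e);
  }

  get(queryKey: string): Set<Entity> | undefined {
    return this.cache.get(queryKey);
  }
}

const scene = new TwoLevelCache();
["6,1,20", "2,3,1,6", "2,3"].forEach((q) => scene.registerQuery(q));
console.log(scene.queriesFor([1, 2, 3, 6, 25])); // [ '2,3,1,6', '2,3' ]
```

After the first entity of an archetype, add and remove only touch the c cache entries listed for that archetype.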
