Is std::unordered_set contiguous (like std::vector)?

I'm storing pointers in a std::unordered_set. I do this because I don't want any duplicates (I delete the pointers in the collection, so if there were a duplicate, I would attempt to delete an already deleted pointer). I loop heavily through these sets, and since I know std::vector is the fastest container for looping (contiguous memory), I wondered whether std::unordered_set is contiguous as well.

If it isn't, would using a std::vector and checking whether the pointer has already been deleted be faster?

Is std::unordered_set contiguous?

The standard does not detail the exact implementation of containers. It does, however, prescribe a number of behaviors that constrain the actual representation.

For example, std::unordered_set is required to be memory stable: a reference to, or the address of, an element remains valid even when other elements are added or removed.

The only way to achieve this is to allocate elements more or less independently. It cannot be achieved with a single contiguous allocation: such an allocation is necessarily bounded, so it could be outgrown, and the elements could not then be moved into a bigger chunk without invalidating those references.
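As a rough illustration of the stability guarantee described above, the sketch below takes the address of one element, grows the set well past its initial bucket count, and checks that the address has not changed. The function name address_survives_growth is made up for this example.

```cpp
#include <cassert>
#include <unordered_set>

// Returns true if the address of a stored element is unchanged after the
// set has grown (and therefore rehashed) many times.
bool address_survives_growth() {
    std::unordered_set<int> s;
    const int* before = &*s.insert(42).first;  // address of the stored element

    for (int i = 1000; i < 11000; ++i) {
        s.insert(i);  // growth triggers automatic rehashes along the way
    }
    assert(s.size() == 10001);

    const int* after = &*s.find(42);  // look the same element up again
    return before == after;
}
```

Rehashing invalidates iterators, but not references or pointers to the elements, which is why the comparison is done on addresses.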

No, it is not contiguous memory, but it's still really fast, thanks to its hash table.

Edit: it is fast for lookups by key. If you mainly loop over the elements, I think you should consider another container.

Edit2: And you should profile to find out whether it's worth considering another container. (Maybe you should optimize somewhere else... maybe.)
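If another container does turn out to be worth it, one possible shape for the std::vector variant from the question is to collect the pointers freely, then sort and deduplicate once before deleting. This is only a sketch: Widget, remove_duplicate_pointers and delete_all_once are hypothetical names, and it assumes the pointers were created with new.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical element type, used only for illustration.
struct Widget { int id; };

// Sorts the pointers and drops repeated ones, so each pointer appears once.
void remove_duplicate_pointers(std::vector<Widget*>& v) {
    // std::less gives a total order even for pointers to unrelated objects.
    std::sort(v.begin(), v.end(), std::less<Widget*>{});
    v.erase(std::unique(v.begin(), v.end()), v.end());
    assert(std::adjacent_find(v.begin(), v.end()) == v.end());
}

// Deletes every distinct pointer exactly once and returns how many there were.
std::size_t delete_all_once(std::vector<Widget*>& v) {
    remove_duplicate_pointers(v);
    const std::size_t n = v.size();
    for (Widget* w : v) delete w;
    v.clear();
    return n;
}
```

The deduplication costs O(n log n) once, after which every loop over the vector runs over contiguous memory. Whether that beats the set depends on how often pointers are added versus iterated.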

The fact that std::unordered_map offers the following member functions suggests that it is based on a hash table, perhaps one using separate chaining with linked lists.

bucket_count, hash_function, load_factor, max_load_factor, rehash
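std::unordered_set exposes the same bucket interface, so the hash-table structure can be observed directly. A minimal sketch (count_via_buckets is a name invented here) that visits every element bucket by bucket using the local iterators begin(n) and end(n):

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Counts the elements by walking each bucket with the local-iterator API.
std::size_t count_via_buckets(const std::unordered_set<int>& s) {
    std::size_t n = 0;
    for (std::size_t b = 0; b < s.bucket_count(); ++b) {
        for (auto it = s.begin(b); it != s.end(b); ++it) {
            ++n;  // *it is an element whose hash maps to bucket b
        }
    }
    assert(n == s.size());  // every element lives in exactly one bucket
    return n;
}
```

After a call such as s.rehash(1024), bucket_count() is at least 1024 and the same walk still finds every element.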

Whether the elements are contiguous or not depends on the allocator. The default allocator for unordered_map and list does not allocate the elements in contiguous memory. The memory for each element is allocated at the time of its insertion.

However, you can provide a custom allocator (such as a pool allocator) which may allocate the elements from a pre-allocated memory pool. Even then, logically adjacent elements in the data structure may not be physically adjacent in memory.
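One way to try the pool idea without writing an allocator by hand is the C++17 std::pmr facility. The sketch below is an assumption about how one might set it up, not the only approach: it gives a std::pmr::unordered_set a std::pmr::monotonic_buffer_resource backed by a local 64 KiB buffer (a size picked arbitrarily for this example) and checks that every element ends up inside that buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory_resource>
#include <unordered_set>

// Returns true if every element of the set is stored inside the local buffer.
bool elements_live_in_pool() {
    alignas(std::max_align_t) std::array<std::byte, 64 * 1024> buffer;
    // null_memory_resource makes the pool throw instead of silently
    // falling back to the heap if the buffer is ever exhausted.
    std::pmr::monotonic_buffer_resource pool(
        buffer.data(), buffer.size(), std::pmr::null_memory_resource());

    std::pmr::unordered_set<int> s(&pool);
    for (int i = 0; i < 100; ++i) s.insert(i);
    assert(s.size() == 100);

    const std::byte* lo = buffer.data();
    const std::byte* hi = lo + buffer.size();
    const std::less<const std::byte*> before{};  // total order on pointers
    for (const int& x : s) {
        const auto* p = reinterpret_cast<const std::byte*>(&x);
        if (before(p, lo) || !before(p, hi)) return false;
    }
    return true;
}
```

The nodes are now close together in one block, but iteration still follows the hash table's links, so neighbouring elements in iteration order are not necessarily neighbours in the buffer.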

So, if looping through all the elements is the most frequent operation, unordered_map may not be the best solution. Running the dominant use cases through a profiler for all competing solutions would reveal the best one.

In addition to that, unordered_map is not the best choice for looping for another reason. Note the word "unordered" in the name: it conveys that, unlike list, vector, or map, there is no order of the elements. For example, the member function rehash may change the relative order of the elements. In fact, the container rehashes automatically whenever an operation would push its load factor above max_load_factor.
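To make the point about ordering concrete, here is a small sketch (iteration_order and rehash_keeps_elements are names made up for this example) that snapshots the iteration order, forces a rehash, and confirms that only the set of elements is guaranteed to be preserved. The two snapshots may or may not list the elements in the same sequence, depending on the implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Copies the elements out in whatever order iteration currently yields them.
std::vector<int> iteration_order(const std::unordered_set<int>& s) {
    return std::vector<int>(s.begin(), s.end());
}

// Forces a rehash and checks that the same elements are still present.
bool rehash_keeps_elements(std::unordered_set<int>& s, std::size_t buckets) {
    std::vector<int> before = iteration_order(s);
    assert(before.size() == s.size());

    s.rehash(buckets);  // may change the relative order of the elements
    std::vector<int> after = iteration_order(s);

    // Compare as sets: sort both snapshots so that order no longer matters.
    std::sort(before.begin(), before.end());
    std::sort(after.begin(), after.end());
    return before == after;
}
```

Code that depends on a particular traversal order should therefore use a different container or sort a copy of the elements first.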

std::unordered_set is supposed to be a hash table container, so we can assume it carries a small performance penalty compared with std::vector.

But I think you should look at actual profiling results to see whether unordered_set access is a real hotspot.
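A real profiler is the better tool, but a quick timing harness can give a first impression. The sketch below (timed_sum, Timing and compare_iteration are hypothetical names) fills a std::vector and a std::unordered_set with the same integers and times one full pass over each with std::chrono::steady_clock. The numbers depend heavily on compiler flags, element count and cache state, so treat them as indicative only.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Sums every element and stores the elapsed time in nanoseconds.
template <class Container>
std::int64_t timed_sum(const Container& c, std::int64_t& elapsed_ns) {
    const auto start = std::chrono::steady_clock::now();
    std::int64_t sum = 0;
    for (const auto& x : c) sum += x;
    const auto stop = std::chrono::steady_clock::now();
    elapsed_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                     stop - start).count();
    return sum;
}

struct Timing {
    std::int64_t vector_ns;
    std::int64_t set_ns;
};

// Builds both containers with the values 0..n-1 and times one pass over each.
Timing compare_iteration(int n) {
    std::vector<int> v;
    std::unordered_set<int> s;
    v.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        v.push_back(i);
        s.insert(i);
    }
    Timing t{0, 0};
    const std::int64_t sum_v = timed_sum(v, t.vector_ns);
    const std::int64_t sum_s = timed_sum(s, t.set_ns);
    assert(sum_v == sum_s);  // both passes must have visited the same values
    return t;
}
```

Calling compare_iteration(100000) several times in an optimized build and comparing vector_ns with set_ns shows whether the difference matters for the workload at hand.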

If the STL implementation you are using is a reasonable one, it should provide a vector-like specialization for pointer or int keys. If so, the unordered_set specialized for the pointer type will behave much like an automatically growing/shrinking vector, and the performance difference will be unnoticeable.
