Collecting objects faster than std::vector

I have a function that stores a lot of small objects (~16 bytes) in a vector, but it doesn't know in advance how many objects will be stored (imagine a recursive descent parser storing tokens, for example).

std::vector<SmallObject> getObjects();

This is quite slow because of all the reallocation and copying (and apparently C++ even has to invoke the copy constructors if you don't use an optimised version; see "Object Relocation").

There must be a better way to do this when all I am doing to construct the vector is appending things. For example, I could have a singly linked list of blocks that are filled one after another, and convert everything to a single vector at the end, so everything only has to be copied once.
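The block-list idea could be sketched roughly as follows. This is only an illustration, not a tuned implementation: `BlockCollector`, its `to_vector` member and the block size of 4096 are made-up names and values, and `SmallObject` is a Boost-free stand-in that replaces the interval with two plain bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Stand-in for the real SmallObject; the interval is assumed to be
// representable as two unsigned bounds so the sketch needs no Boost.
struct SmallObject {
    unsigned id;
    unsigned lower;
    unsigned upper;
};

// Hypothetical helper: appends into fixed-capacity blocks held in a linked
// list, so elements already stored are never moved while collecting.
class BlockCollector {
public:
    void push_back(const SmallObject& obj) {
        if (blocks_.empty() || blocks_.back().size() == kBlockSize) {
            blocks_.emplace_back();
            blocks_.back().reserve(kBlockSize);  // one allocation per block
        }
        blocks_.back().push_back(obj);
        ++size_;
    }

    std::size_t size() const { return size_; }

    // Copies every element exactly once into one contiguous vector.
    std::vector<SmallObject> to_vector() const {
        std::vector<SmallObject> out;
        out.reserve(size_);  // single allocation for the final result
        for (const auto& block : blocks_) {
            out.insert(out.end(), block.begin(), block.end());
        }
        assert(out.size() == size_);
        return out;
    }

private:
    static constexpr std::size_t kBlockSize = 4096;  // arbitrary choice
    std::list<std::vector<SmallObject>> blocks_;
    std::size_t size_ = 0;
};
```

The function would call `push_back` on a `BlockCollector` while parsing and return `to_vector()` at the end. Whether this actually beats a plain `std::vector` depends on the workload, so it is worth measuring both.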

Is there anything in Boost or the standard C++ library that would help with this? Or any particularly clever algorithms?

Edit: To be more concrete:

struct SmallObject {
    unsigned id;
    boost::icl::discrete_interval<unsigned> ival;
};

The question of which container is most efficient is always best answered by "it depends" and "measure it!".

Without any more information about your specific situation, there are two 'obvious' possibilities:

Use a linked list

The standard library has two linked lists by default: the singly linked std::forward_list and std::list, which is usually doubly linked. A related option is std::deque, which is not a linked list but also grows without relocating the elements it already holds. Some quotes from the documentation:

std::forward_list [...] is implemented as a singly-linked list and essentially does not have any overhead compared to its implementation in C. Compared to std::list this container provides more space efficient storage when bidirectional iteration is not needed.

std::list [...] is usually implemented as a doubly-linked list. Compared to std::forward_list this container provides bidirectional iteration capability while being less space efficient.

std::deque (double-ended queue) [...] insertion and deletion at either end of a deque never invalidates pointers or references to the rest of the elements.

As opposed to std::vector, the elements of a deque are not stored contiguously: typical implementations use a sequence of individually allocated fixed-size arrays.
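As a rough sketch of how std::deque could be used here: collect into the deque, then build the vector once at the end. The `count` parameter and the loop are only a stand-in for the real parser, and `SmallObject` is simplified to plain unsigned members so the example does not depend on Boost.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Simplified stand-in for the question's SmallObject (assumption: the
// interval is replaced by two plain bounds).
struct SmallObject {
    unsigned id;
    unsigned lower;
    unsigned upper;
};

// Hypothetical producer: the loop stands in for a parser that does not
// know how many objects it will emit.
std::vector<SmallObject> getObjects(unsigned count) {
    std::deque<SmallObject> collected;
    for (unsigned i = 0; i < count; ++i) {
        // Appending to a deque never relocates elements already stored.
        collected.push_back(SmallObject{i, i, i + 1});
    }
    // The range constructor knows the distance between the iterators, so
    // the vector allocates once and each element is copied exactly once.
    std::vector<SmallObject> result(collected.begin(), collected.end());
    assert(result.size() == collected.size());
    return result;
}
```

If the caller does not actually need contiguous storage, returning the deque directly would avoid even that final copy.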

Reserve space in a vector

If there is any way you can estimate an upper bound on the number of objects you will want to store, you can use that to reserve some space in advance.

For example, if you're reading these objects from a file, the number of objects may be at most the file size divided by 16, or the number of lines times two, or some other quick and easy calculation that you can do before constructing these objects.

In that case, if you reserve the capacity, you may allocate more memory than you end up needing, but you avoid reallocations and the moves that come with them. Even if the upper bound is a bit too low, that's OK: you may still need to double the capacity once or twice, but at least you avoid all the small increases (e.g. 2 -> 4 -> 8 -> 16) at the start of the loop.
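A minimal sketch of the reserve approach, assuming a toy input format: `parseWords` is a hypothetical tokenizer that emits one object per space-separated word, and `SmallObject` is again simplified to plain unsigned members. Every word except the last needs at least one character plus a separator, so `(input.size() + 1) / 2` is a safe upper bound on the number of objects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the question's SmallObject (assumption).
struct SmallObject {
    unsigned id;
    unsigned lower;  // index of the first character of the word
    unsigned upper;  // one past the last character of the word
};

// Hypothetical tokenizer: one object per space-separated word.
std::vector<SmallObject> parseWords(const std::string& input) {
    std::vector<SmallObject> out;
    // Cheap upper bound computed before any object is constructed.
    out.reserve((input.size() + 1) / 2);
    const std::size_t initial_capacity = out.capacity();

    unsigned id = 0;
    std::size_t i = 0;
    while (i < input.size()) {
        if (input[i] == ' ') {  // skip separators
            ++i;
            continue;
        }
        const std::size_t start = i;
        while (i < input.size() && input[i] != ' ') {
            ++i;
        }
        out.push_back(SmallObject{id++, static_cast<unsigned>(start),
                                  static_cast<unsigned>(i)});
    }
    // The bound held, so the vector never had to reallocate.
    assert(out.capacity() == initial_capacity);
    return out;
}
```

For real input the bound will usually be looser than this, which trades some unused memory for the guarantee that no element is moved during collection.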
