unordered_set range insertion vs. iterator
I am trying to understand why the range insertion below is faster than inserting element by element with an iterator.
vector<string> &paths // 3 million strings
Method 1: range insert
unordered_set<string> mySet;
mySet.insert(paths.begin(), paths.end());
Method 2: iterator
vector<string>::iterator row;
for (row = paths.begin(); row != paths.end(); ++row)
{
    mySet.insert(*row); // *row is the same element as row[0]
}
Results:
Method 1: 753 ms
Method 2: 1221 ms
==============================
OS: Windows 10
IDE: Visual Studio Code
Compiler: gcc version 8.1.0
Flags: -O3
Intuitively, the range insertion procedure should be faster. Imagine, for example, that you want to insert a million elements. If you do a range insert, the set can
There are some further possible optimizations that could be done here (using a pooled allocator for bulk allocations, doing a multithreaded insertion procedure, etc.), though I'm not sure whether these are actually done.
On the other hand, if you insert things one at a time, each of these steps needs to be done a million times. That means there's time and space wasted allocating intermediate arrays of buckets that don't ultimately get used. The implementation can't know they won't be needed, because it has to keep the container in a good state every step of the way.
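To make the cost of those intermediate bucket arrays concrete, here is a minimal sketch that counts how often the bucket array changes size during one-at-a-time insertion, with and without a call to reserve() up front. The helper name count_rehashes is hypothetical, and whether a given standard library's range insert performs an equivalent reservation internally is an assumption, not something the standard requires.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: inserts every string one at a time and counts how
// many times the set's bucket count changes (each change is a rehash).
// If reserve_first is true, room for all elements is requested up front.
std::size_t count_rehashes(const std::vector<std::string>& paths,
                           bool reserve_first) {
    std::unordered_set<std::string> s;
    if (reserve_first) {
        s.reserve(paths.size());  // size the bucket array once, in advance
    }
    std::size_t rehashes = 0;
    std::size_t buckets = s.bucket_count();
    for (const auto& p : paths) {
        s.insert(p);
        if (s.bucket_count() != buckets) {  // bucket array was reallocated
            ++rehashes;
            buckets = s.bucket_count();
        }
    }
    return rehashes;
}
```

Calling count_rehashes(paths, false) on a large input should report several rehashes, while count_rehashes(paths, true) should report none. Reserving before the loop in Method 2 is therefore one way to check how much of the measured gap comes from rehashing.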
For an unordered_set, these optimizations are just improvements to the expected O(1) cost per insertion. In some other containers like vector or deque, bulk inserts can be asymptotically faster than repeated individual inserts because the container can move other elements once during the bulk insert rather than doing lots of repeated shifts.
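As a small illustration of the vector case, the sketch below prepends one vector to another in two ways. The function names prepend_one_by_one and prepend_bulk are hypothetical. Both produce the same result, but the first shifts the existing elements once per inserted element, while the second can shift them a single time.

```cpp
#include <vector>

// Inserts src at the front of dst one element at a time. Every call to
// insert() shifts all elements after the insertion point, so the total
// work grows with src.size() * dst.size().
std::vector<int> prepend_one_by_one(std::vector<int> dst,
                                    const std::vector<int>& src) {
    auto pos = dst.begin();
    for (int x : src) {
        pos = dst.insert(pos, x);  // returns an iterator to the new element
        ++pos;                     // step past it to preserve src's order
    }
    return dst;
}

// Inserts all of src at the front of dst with one range insert, so the
// existing elements only need to be moved once.
std::vector<int> prepend_bulk(std::vector<int> dst,
                              const std::vector<int>& src) {
    dst.insert(dst.begin(), src.begin(), src.end());
    return dst;
}
```

For example, prepending {1, 2, 3} to {7, 8, 9} gives {1, 2, 3, 7, 8, 9} with either function. Only the amount of element shifting differs.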
Hope this helps!