Iteration speed and element size
I have a std::vector filled with the following structures:
#define ELEMENTSIZE 8

struct Element {
    int value;
    char size[ELEMENTSIZE - 4]; // chars are 1 B each; subtract the 4 B taken by the int
};
The size of the structure depends on ELEMENTSIZE, which sets the size of the char array inside the structure.
I am benchmarking computing the average value of these structures over the vector, and I would like to know why a vector filled with larger structures takes longer to iterate over.
For example, a vector of 1,000,000 8 B structures takes roughly 1.7 ms, and the same test with 128 B structures takes 12.7 ms.
Is that big difference due to the cache alone? If so, could you explain why? Or is there some other aspect that I am not seeing?
The structure is 16 times bigger, so it could take up to 16 times longer to iterate through. Mathematically, 12.7 / 1.7 ≈ 7.47 times more, so it roughly matches up.
Now imagine the structure containing the 128 B elements were a structure containing 8 B elements, but of the same total size. Do you see now that it really is 16 times larger?
The OS must bring the larger structures into memory. At the L1 or processor level, the contents must be copied around the iterator object being used. It largely depends on cache performance. If all of this is happening, why would a 128-byte structure not take more time than an 8-byte structure?