Performance cost of passing by value vs. by reference or by pointer?

Let's consider an object foo (which may be an int, a double, a custom struct, a class, whatever). My understanding is that passing foo by reference to a function (or just passing a pointer to foo) leads to higher performance since we avoid making a local copy (which could be expensive if foo is large).

However, from the answer here it seems that pointers on a 64-bit system can be expected in practice to have a size of 8 bytes, regardless of what is being pointed to. On my system, a float is 4 bytes. Does that mean that if foo is of type float, then it is more efficient to just pass foo by value rather than give a pointer to it (assuming no other constraints that would make using one more efficient than the other inside the function)?
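The sizes mentioned in the question can be checked directly. The following is a minimal sketch; the helper names by_value_bytes, by_pointer_bytes and report_float are made up for illustration and do not come from the question.

```cpp
#include <cstddef>
#include <cstdio>

// Bytes copied when a T is passed by value.
template <class T>
constexpr std::size_t by_value_bytes() { return sizeof(T); }

// Bytes copied when a pointer to a T is passed instead.
template <class T>
constexpr std::size_t by_pointer_bytes() { return sizeof(const T*); }

// Prints both sizes for float; call this from main() to see the numbers
// for the current platform.
inline void report_float() {
    std::printf("float by value: %zu bytes, by pointer: %zu bytes\n",
                by_value_bytes<float>(), by_pointer_bytes<float>());
}
```

On a typical 64-bit desktop system this would be expected to report 4 and 8, matching the numbers in the question, but the language guarantees neither value.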

It depends on what you mean by "cost", and on the properties of the host system (hardware, operating system) with respect to the operations involved.

If your cost measure is memory usage, then the calculation of cost is obvious: add up the sizes of whatever is being copied.

If your measure is execution speed (or "efficiency") then the game is different. Hardware (and operating systems and compilers) tend to be optimised for copying things of particular sizes, by virtue of dedicated circuits (machine registers, and how they are used).

It is common, for example, for a machine to have an architecture (machine registers, memory architecture, etc.) which results in a "sweet spot": copying variables of some size is most "efficient", but copying larger OR SMALLER variables is less so. Larger variables cost more to copy, because multiple copies of smaller chunks may be needed. Smaller ones may also cost more, because the compiler needs to copy the smaller value into a larger variable (or register), do the operations on it, then copy the value back.

Examples with floating point include some Cray supercomputers, which natively support double precision floating point (aka double in C++), while all operations on single precision (aka float in C++) are emulated in software. Some older 32-bit x86 CPUs also worked internally with 32-bit integers, and operations on 16-bit integers required more clock cycles due to translation to/from 32-bit (this is not true of more modern 32-bit or 64-bit x86 processors, as they allow copying 16-bit integers to/from 32-bit registers, and operating on them, with fewer such penalties).

It is a bit of a no-brainer that copying a very large structure by value will be less efficient than creating and copying its address. But, because of factors like the above, the cross-over point between "best to copy something of that size by value" and "best to pass its address" is less clear.
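The clear-cut end of that spectrum can be sketched as follows. The type Big and both functions are hypothetical, chosen only to make the copy obviously larger than an address.

```cpp
#include <cstddef>

// Hypothetical large type: 1024 doubles (8 KiB where double is 8 bytes).
struct Big {
    double data[1024];
};

// By value: a full copy of Big has to be made available to the callee.
constexpr double first_by_value(Big b) { return b.data[0]; }

// By const reference: typically only the address of the caller's object
// is passed, independent of sizeof(Big).
constexpr double first_by_ref(const Big& b) { return b.data[0]; }
```

Both functions return the same result. What differs is how much data may have to be copied for the call, and an optimiser that inlines either call can remove the difference entirely.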

Pointers and references tend to be implemented in a similar manner (e.g. pass by reference can be implemented in the same way as passing a pointer) but that is not guaranteed.

The only way to be sure is to measure it. And realise that the measurements will vary between systems.
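A rough way to take such a measurement is sketched below, using std::chrono::steady_clock. The functions scale_by_value and scale_by_ref and the helper time_ns are hypothetical, and __attribute__((noinline)) is a GCC/Clang extension used here only to stop the calls from being inlined away. Treat it as a starting point, not a rigorous benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical functions under test (noinline is a GCC/Clang extension).
__attribute__((noinline)) float scale_by_value(float x) { return x * 1.0001f; }
__attribute__((noinline)) float scale_by_ref(const float& x) { return x * 1.0001f; }

// Calls f `iterations` times, feeding each result into the next call,
// and returns the elapsed wall-clock time in nanoseconds.
template <class F>
long long time_ns(F f, std::size_t iterations, float& result) {
    assert(iterations > 0);
    float acc = 1.0f;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        acc = f(acc);
    }
    const auto stop = std::chrono::steady_clock::now();
    result = acc;  // hand the value back so the loop cannot be discarded
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling time_ns(scale_by_value, n, out) and time_ns(scale_by_ref, n, out) with the same n gives two timings to compare. Repeat the runs several times, and expect the numbers to change with compiler, flags and hardware.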

There is one thing nobody mentioned.

There is a certain GCC optimization called IPA SRA that automatically replaces "pass by reference" with "pass by value": https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html (-fipa-sra)

This is most likely done for scalar types (e.g. int, double, etc.) that do not have non-default copy semantics and can fit into CPU registers.

This makes

void f(const int &x);

probably as fast (and as space-efficient) as

void f(int x);

So with this optimization enabled, using references for small types should be as fast as passing them by value.
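One way to check this on a given compiler is to build a small test case and compare the generated assembly, for example with g++ -O2 -S, with and without -fno-ipa-sra. The sketch below is an assumption about what such a test case could look like: the function names are hypothetical, the helpers sit in an unnamed namespace so the compiler can see every call site, and the noinline attribute (a GCC/Clang extension) keeps them as real calls.

```cpp
// GCC/Clang extension; keeps the helpers as real calls so that the way the
// argument is passed remains visible in the generated assembly.
#define NOINLINE __attribute__((noinline))

namespace {

// Candidate for having its reference parameter turned into a value.
NOINLINE int twice_by_ref(const int& x) { return x + x; }

// Hand-written by-value version to compare against.
NOINLINE int twice_by_value(int x) { return x + x; }

}  // namespace

// Calls both helpers with the same argument.
int use_both(int v) { return twice_by_ref(v) + twice_by_value(v); }
```

Whether the two helpers end up with the same code depends on the compiler version and flags, so the assembly output is the thing to trust here.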

On the other hand, passing (for example) std::string by value could not be optimized to by-reference speed, because custom copy semantics are involved.

From what I understand, using pass by reference for everything should never be slower than manually picking what to pass by value and what to pass by reference.

This is extremely useful, especially for templates:

template<class T>
void f(const T&)
{
    // Something
}

is always optimal (to the extent that the optimization above applies).
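For comparison, the manual selection that this answer argues against could look roughly like the sketch below. The alias param_t and its size threshold are assumptions made for illustration and are not part of the answer.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: pass small, trivially copyable types by value and
// everything else by const reference. The threshold of two pointers is an
// arbitrary choice for this sketch.
template <class T>
using param_t = typename std::conditional<
    std::is_trivially_copyable<T>::value && sizeof(T) <= 2 * sizeof(void*),
    T,
    const T&>::type;

// T cannot be deduced through param_t, so callers must name it explicitly,
// e.g. f<int>(42) or f<std::string>(s).
template <class T>
void f(param_t<T> value) {
    (void)value;  // Something
}
```

The explicit template argument at every call site is the price of this approach, which is part of why a plain const T& parameter is attractive when the optimizer can be relied on.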

You must test any given scenario where performance is absolutely critical, but be very careful about trying to force the compiler to generate code in a specific way.

The compiler's optimizer is allowed to rewrite your code in any way it chooses as long as the final result is provably the same, which can lead to some very nice optimizations.

Consider that passing a float by value requires making a copy of the float, but under the right conditions, passing a float by reference could allow the compiler to keep the original float in a CPU floating-point register and treat that register as the "reference" parameter to the function. By contrast, if you pass a copy, the compiler has to find a place to store the copy in order to preserve the contents of the register, or even worse, it may not be able to use a register at all because of the need to preserve the original (this is especially true in recursive functions!).

This difference is also important if you are passing the reference to a function that could be inlined, where the reference may reduce the cost of inlining since the compiler doesn't have to guarantee that a copied parameter cannot modify the original.

The more a language allows you to focus on describing what you want done rather than how you want it done, the more the compiler is able to find creative ways of doing the hard work for you. In C++ especially, it is generally best not to worry about performance, and instead focus on describing what you want as clearly and simply as possible. By trying to describe how you want the work done, you will just as often prevent the compiler from doing its job of optimizing your code for you.

Does that mean that if foo is of type float, then it is more efficient to just pass foo by value?

Passing a float by value could be more efficient. I would expect it to be more efficient, partly because of what you said: a float is smaller than a pointer on a system like the one you describe. But in addition, when you copy the pointer, you still need to dereference the pointer to get the value within the function. The indirection added by the pointer could have a significant effect on performance.
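The extra indirection is easy to see in source form. In this minimal sketch the function names are hypothetical:

```cpp
// By value: the callee receives the float itself, typically in a register.
float square_by_value(float x) { return x * x; }

// By pointer: the callee receives an address and has to load the float
// from memory before it can use it.
float square_by_pointer(const float* x) { return (*x) * (*x); }
```

Both compute the same result. Only the second needs a memory load inside the function, unless the call is inlined and the load is optimised away.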

The efficiency difference could be negligible. In particular, if the function can be inlined and optimization is enabled, there is likely not going to be any difference.

You can find out whether there is any performance gain from passing the float by value in your case by measuring. You can measure the efficiency with a profiling tool.

You may substitute reference for pointer and the answer will still apply equally well.

Is there some sort of overhead in using a reference, the way that there is when a pointer must be dereferenced?

Yes. It is likely that a reference has exactly the same performance characteristics as a pointer does. If it is possible to write a semantically equivalent program using either references or pointers, both are probably going to generate identical assembly.
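A pair of semantically equivalent functions, one per style, makes this concrete. The names are hypothetical, and whether the assembly really is identical should be checked on the compiler in question (for example with g++ -O2 -S).

```cpp
// Increments the caller's int through a reference.
void add_one_ref(int& x) { x += 1; }

// Increments the caller's int through a pointer (assumed non-null).
void add_one_ptr(int* x) { *x += 1; }
```

In both cases the callee is typically handed an address and performs a load, an add and a store through it.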


If passing a small object by pointer were faster than copying it, then surely that would also be true for any object of the same size, wouldn't you agree? How about a pointer to a pointer, that's about the size of a pointer, right? (It's exactly the same size.) Oh, but pointers are objects too. So, if passing an object (such as a pointer) by pointer is faster than copying the object (the pointer), then passing a pointer to a pointer to a pointer to a pointer ... to a pointer would be faster than the program with fewer pointers, which is still faster than the one that didn't use pointers... Perhaps we've found an infinite source of efficiency here :)

Always prefer passing by reference over passing by pointer if you want optimized execution time and want to avoid random access. As for passing by reference vs. by value, GCC can optimize your code so that small variables that do not need to be changed are passed by value.
