
When to prefer const lvalue reference over rvalue reference templates

Currently reading the codebase for cpr requests library: https://github.com/whoshuu/cpr/blob/master/include/cpr/api.h

Noticed that the interface for this library uses perfect forwarding quite often. Just learning rvalue references so this is all relatively new to me.

From my understanding, the benefit with rvalue references, templating, and forwarding is that the function call being wrapped around will take its arguments by rvalue reference rather than by value. Which avoids unnecessary copying. It also prevents one from having to generate a bunch of overloads due to reference deduction.

However, from my understanding, const lvalue reference essentially does the same thing. It prevents the need for overloads and passes everything by reference. With the caveat that if the function being wrapped around takes a non-const reference, it won't compile.

However if everything within the call stack won't need a non-const reference, then why not just pass everything by const lvalue reference?

I guess my main question here is, when should you use one over the other for best performance? Attempted to test this with the below code. Got the following relatively consistent results:

Compiler: gcc 6.3 OS: Debian GNU/Linux 9

<<<<
Passing rvalue!
const l value: 2912214
rvalue forwarding: 2082953
Passing lvalue!
const l value: 1219173
rvalue forwarding: 1585913
>>>>

These results stay fairly consistent between runs. It appears that for an rvalue arg, the const l value signature is slightly slower, though I'm not exactly sure why, unless I'm misunderstanding this and const lvalue reference does in fact make a copy of the rvalue.

For an lvalue arg, we see the opposite: rvalue forwarding is slower. Why would this be? Shouldn't the reference deduction always produce a reference to an lvalue? If that's the case, shouldn't it be more or less equivalent to the const lvalue reference in terms of performance?

#include <iostream>
#include <string>
#include <utility>
#include <time.h>

std::string func1(const std::string& arg) {
    std::string test(arg);
    return test;
}

template <typename T>
std::string func2(T&& arg) {
    std::string test(std::forward<T>(arg));
    return test;
}

void wrap1(const std::string& arg) {
    func1(arg);
}

template <typename T>
void wrap2(T&& arg) {
    func2(std::forward<T>(arg));
}

int main()
{
     auto n = 100000000;

     /// Passing rvalue
     std::cout << "Passing rvalue!" << std::endl;

     // Test const l value
     auto t = clock();
     for (int i = 0; i < n; ++i)
         wrap1("test");
     std::cout << "const l value: " << clock() - t << std::endl;

     // Test rvalue forwarding
     t = clock();
     for (int i = 0; i < n; ++i)
         wrap2("test");
     std::cout << "rvalue forwarding: " <<  clock() - t << std::endl;

     std::cout << "Passing lvalue!" << std::endl;

     /// Passing lvalue
     std::string arg = "test";

     // Test const l value
     t = clock();
     for (int i = 0; i < n; ++i)
         wrap1(arg);
     std::cout << "const l value: " << clock() - t << std::endl;

     // Test rvalue forwarding
     t = clock();
     for (int i = 0; i < n; ++i)
         wrap2(arg);
     std::cout << "rvalue forwarding: " << clock() - t << std::endl;

}

First of all, here are slightly different results from your code. As mentioned in the comments, the compiler and its settings are very important. In particular, you may notice that all cases have similar runtime, except for the first one, which is about twice as slow.

Passing rvalue!
const l value: 1357465
rvalue forwarding: 669589
Passing lvalue!
const l value: 744105
rvalue forwarding: 713189

Let's look at exactly what happens in each case.

1) When calling wrap1("test") , since the signature of that function expects a const std::string & , the char array you are passing is implicitly converted to a temporary std::string object on every call (i.e. n times), which involves a copy* of the value. A const reference to that temporary is then passed into func1 , where another std::string is constructed from it, which again involves a copy (since it's a const reference, it cannot be moved from, despite in fact referring to a temporary). The function returns a named local by value, so the compiler is allowed, but not required, to elide the copy into the return value (NRVO); if it does not, the local is moved instead. In this case the return value is not used, and I'm not entirely sure whether the compiler is allowed to optimize away the construction of test altogether. I suspect not, since in general such construction could have observable side effects (and your results suggest it does not get optimized away). To sum up, a full-on construction and destruction of std::string is performed twice in this case.

2) When calling wrap2("test") , the argument type is const char[5] . A string literal is an lvalue, so the deduced type of template parameter T is const char (&)[5] , and the argument is forwarded as a reference to the array all the way to func2 , where the std::string constructor taking a const char* is called, which copies the value. There is nothing to move from here, since the argument is const and is not an std::string . Compared to the previous case, construction/destruction of a string only happens once per call (the const char[5] literal is always in memory and incurs no overhead).

3) When calling wrap1(arg) , you are passing an lvalue as a const string & through the chain, and one copy constructor is called in func1 .

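4) When calling wrap2(arg) , this is similar to the previous case: arg is a non-const lvalue, so the deduced type for T is std::string & , std::forward passes it on as an lvalue, and func2 calls the copy constructor once.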

5) I'm assuming your test was designed to demonstrate the advantage of perfect forwarding when a copy of the argument needs to be made at the bottom of the call chain (hence the creation of test ). In this case, you need to replace the "test" argument in the first two cases with std::string("test") in order to truly have an std::string && argument, and also make sure the forwarding is written as std::forward<T>(arg) , as mentioned in the comments. In that case, the results are:

Passing rvalue!
const l value: 1314630
rvalue forwarding: 595084
Passing lvalue!
const l value: 712461
rvalue forwarding: 720338

which is similar to what we had before, but now actually invoking a move constructor.

I hope this helps explain the results. There may be some other issues related to inlining of function calls and other compiler optimizations, which would help explain the smaller discrepancies between cases 2-4.

As to your question of which approach to use, I suggest reading Scott Meyers' "Effective Modern C++", items 23-30. Apologies for a book reference instead of a direct answer, but there is no silver bullet, and the optimal choice is always case-dependent, so it's better to just understand the trade-offs of each design decision.


* A copy constructor may or may not involve dynamic memory allocation due to Short String Optimization; thanks to ytoledano for bringing this up in the comments. Also, I've implicitly assumed throughout the answer that a copy is significantly more expensive than a move, which is not always the case.
