Compiler optimization of return by reference vs. return by value

Question

I want to understand which specific optimizations, if any, the compiler can perform here.

There are two get functions: get() returns a string by reference and get2() returns one by value. The map is a global.

I want to understand this because it is such a common pattern. What happens if we have a map of objects, a map of maps, or very big strings (my map is not limited to strings)? How costly can it get? I understand that much of this is compiler-dependent, but can we at least have a list of the unknowns? It would really help.

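#include <map>
#include <string>
#include <utility>

std::map<std::string, std::string> value;

// Returns a reference to the element stored in the map.
std::string& get(std::string& key) {
    return value[key];
}

// Returns a copy of the element stored in the map.
std::string get2(std::string& key) {
    return value[key];
}

int main()
{
    value.insert(std::make_pair("name","XXXXXXX"));
    std::string keyaa = "name";
    auto new_val = get(keyaa);
    auto new_val2 = get2(keyaa);
}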

Answer 1

From the C++ language perspective, there is no guarantee that the copy in get2 will be elided. Mandatory return value optimization covers only prvalue operands (i.e. values created in the function call itself). Even the "permitted" optimization doesn't cover pre-existing objects, such as an element of a map that outlives the call.
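
The distinction can be sketched with a small copy-counting type. This is only an illustration under stated assumptions, not part of the original answer: Tracked, global_obj, make_prvalue, copy_existing, ref_existing and count_copies are hypothetical names, and only copy constructions are counted (moves are ignored).

```cpp
#include <cassert>

// Hypothetical helper type: counts how many times it is copy-constructed.
struct Tracked {
    static int copies;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept {}
};
int Tracked::copies = 0;

// Stands in for a pre-existing object such as a map element.
Tracked global_obj;

// Returns a prvalue: no copy constructor call is needed for the result.
Tracked make_prvalue() { return Tracked{}; }

// Returns a pre-existing object by value: the returned object has to be
// copy-constructed from global_obj.
Tracked copy_existing() { return global_obj; }

// Returns a reference to the pre-existing object: nothing is constructed.
const Tracked& ref_existing() { return global_obj; }

// Exercises the three functions and reports how many copies were made.
int count_copies() {
    Tracked::copies = 0;
    Tracked a = make_prvalue();         // no copy
    Tracked b = copy_existing();        // one copy
    const Tracked& c = ref_existing();  // no copy
    assert(&c == &global_obj);          // c is the original object itself
    (void)a;
    (void)b;
    return Tracked::copies;
}
```

In this sketch count_copies() should report a single copy, coming from copy_existing(). A type with a trivial copy constructor might be optimized differently, which is why the answer goes on to measure a real std::string.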

So we can only hope that today's compilers are smart enough to optimize away the copy, which means we have to test it!

I've rewritten the example slightly to make it as easy as possible for the compiler to optimize away the string copy:

#include <benchmark/benchmark.h>  // Google Benchmark
#include <map>
#include <string>

struct Test {
    std::map<std::string, std::string> value = {{"name", "XXXXXX"}};

    std::string const& get(std::string const& key) {
        return value[key];
    }

    std::string get2(std::string const& key) {
        return value[key];
    }
};

static void TestReturnByReference(benchmark::State& state) {
  Test test;
  std::string key = "name";
  for (auto _ : state) {
    size_t n = test.get(key).size();
    benchmark::DoNotOptimize(n);
  }
}

BENCHMARK(TestReturnByReference);

static void TestReturnByValue(benchmark::State& state) {
  Test test;
  std::string key = "name";
  for (auto _ : state) {
    size_t n = test.get2(key).size();
    benchmark::DoNotOptimize(n);
  }
}

BENCHMARK(TestReturnByValue);

And no, as it turns out, neither GCC nor Clang is able to optimize it away entirely:

GCC 10.2, -O3 (link to quick-bench): noticeable difference.

[image: benchmark chart]

Clang 11 (libc++), -O3: better, but by value is still slower.

[image: benchmark chart]

Conclusion: returning an existing string is faster by reference.
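
One caveat on the caller side, sketched here under assumptions: the by-reference version only avoids the copy if the caller also binds the result to a reference. With plain auto the deduced type is std::string, so a copy is made anyway. The names store, get_ref and caller_side_demo are hypothetical and only mirror the question's code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical global map, mirroring the one in the question.
std::map<std::string, std::string> store = {{"name", "XXXXXXX"}};

// Returns a reference to the element stored in the map.
const std::string& get_ref(const std::string& key) {
    return store[key];
}

// Returns true when `copy` is a separate object and `ref` is the map
// element itself.
bool caller_side_demo() {
    const std::string key = "name";
    auto copy = get_ref(key);        // deduced as std::string: copies the element
    const auto& ref = get_ref(key);  // deduced as const std::string&: no copy
    const std::string* stored = &store.at(key);
    assert(copy == ref);             // same contents either way
    return &copy != stored && &ref == stored;
}
```

Note that the benchmark above calls .size() directly on the returned reference, so it does not pay for this caller-side copy.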

Note: starting from C++17, you can return std::string_view to avoid worrying about this:

std::string_view get(std::string const& key) {
    return value[key];
}
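
A self-contained sketch of the std::string_view variant follows. It assumes C++17 and uses hypothetical names (Store, get_view, view_aliases_storage). It also looks the key up with find() instead of operator[] so that a missing key is not inserted, which is a design choice of this sketch rather than something the answer prescribes. The returned view does not own its characters, so it is only valid while the map element exists and is not modified.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <string_view>

// Hypothetical wrapper, modeled on the Test struct above.
struct Store {
    std::map<std::string, std::string> value = {{"name", "XXXXXX"}};

    // Returns a non-owning view of the stored string, or an empty view
    // if the key is absent. No std::string is copied.
    std::string_view get_view(const std::string& key) const {
        auto it = value.find(key);
        if (it == value.end()) return {};
        return it->second;
    }
};

// Checks that the view points at the stored string's buffer instead of
// at a copy. Assumes the key is present.
bool view_aliases_storage(const Store& s, const std::string& key) {
    std::string_view v = s.get_view(key);
    const std::string& stored = s.value.at(key);
    assert(v.size() == stored.size());
    return v.data() == stored.data();
}
```

A caller that needs the text to outlive the map entry still has to construct its own std::string from the view.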

Answer 1: accepted, 2021-03-24 20:49:08
