Efficient way to return a std::vector in C++

How much data is copied when returning a std::vector from a function, and how big an optimization would it be to place the std::vector in the free store (on the heap) and return a pointer instead? That is, is:

std::vector<X> *f()
{
  std::vector<X> *result = new std::vector<X>();
  /*
    Insert elements into result
  */
  return result;
}

more efficient than:

std::vector<X> f()
{
  std::vector<X> result;
  /*
    Insert elements into result
  */
  return result;
}

?

In C++11, this is the preferred way:

std::vector<X> f();

That is, return by value.

With C++11, std::vector has move semantics, which means the local vector declared in your function will be moved on return, and in some cases the compiler can elide even the move.
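As a rough illustration of what "moved on return" means in practice, the sketch below records the address of the vector's heap buffer inside the function and compares it with the buffer the caller ends up owning. The names make_values and returned_without_copy are hypothetical helpers introduced only for this example; it assumes a C++11 or later compiler.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a vector and reports where its buffer lives.
std::vector<int> make_values(std::size_t n, const int** buffer_inside) {
    std::vector<int> result(n, 7);
    *buffer_inside = result.data();  // heap buffer as seen inside the function
    return result;                   // moved or elided, not deep-copied
}

// Returns true when the caller's vector owns the very buffer that was
// allocated inside make_values, i.e. no element was copied on return.
// Call it with n > 0 so that a buffer actually exists.
bool returned_without_copy(std::size_t n) {
    const int* inside = nullptr;
    std::vector<int> out = make_values(n, &inside);
    return out.size() == n && out.data() == inside;
}
```

Whether the compiler elides the move or performs it, the buffer is handed over rather than duplicated, so the cost of the return does not depend on the number of elements.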

You should return by value.

The standard has a specific feature to improve the efficiency of returning by value. It's called "copy elision", and more specifically in this case the "named return value optimization (NRVO)".

Compilers don't have to implement it, but then again compilers don't have to implement function inlining (or perform any optimization at all). But the performance of the standard libraries can be pretty poor if compilers don't optimize, and all serious compilers implement inlining and NRVO (and other optimizations).

When NRVO is applied, there will be no copying in the following code:

std::vector<int> f() {
    std::vector<int> result;
    // ... populate the vector ...
    return result;
}

std::vector<int> myvec = f();
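One way to check the "no copying" claim is to count element copies. The sketch below uses a hypothetical element type, Counted, that increments a counter in its copy constructor; it is not part of the original answer. Note that a count of zero is also what C++11 move semantics would give without NRVO, so this shows the absence of copies rather than proving that NRVO was applied.

```cpp
#include <vector>

// Hypothetical element type that counts its copy constructions.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept {}
};
int Counted::copies = 0;

std::vector<Counted> f() {
    std::vector<Counted> result;
    result.reserve(3);
    for (int i = 0; i < 3; ++i) {
        result.emplace_back();  // construct each element in place
    }
    return result;
}

// Number of element copies observed while initializing a vector from f(),
// or -1 if the result does not have the expected size.
int copies_during_initialization() {
    Counted::copies = 0;
    std::vector<Counted> myvec = f();  // initialization, not assignment
    return myvec.size() == 3 ? Counted::copies : -1;
}
```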

But the user might want to do this:

std::vector<int> myvec;
// ... some time later ...
myvec = f();

Copy elision does not prevent a copy here because it's an assignment rather than an initialization. However, you should still return by value. In C++11, the assignment is optimized by something different, called "move semantics". In C++03, the above code does cause a copy, and although in theory an optimizer might be able to avoid it, in practice it's too difficult. So instead of myvec = f(), in C++03 you should write this:

std::vector<int> myvec;
// ... some time later ...
f().swap(myvec);
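The two ways of filling an existing vector can be put side by side as follows. This is a minimal sketch: f here is a stand-in that returns four copies of 42, and assign_by_swap and assign_by_move are hypothetical names used only for illustration.

```cpp
#include <vector>

// Stand-in for the f() discussed above.
std::vector<int> f() { return std::vector<int>(4, 42); }

// C++03 idiom: swap the temporary's buffer into the existing vector.
// The temporary receives the old contents and destroys them at the end
// of the full expression. swap() exchanges buffers without copying elements.
void assign_by_swap(std::vector<int>& myvec) {
    f().swap(myvec);
}

// C++11 and later: assigning from a temporary selects move assignment,
// which typically takes over the temporary's buffer as well.
void assign_by_move(std::vector<int>& myvec) {
    myvec = f();
}
```

Both functions leave myvec holding the contents produced by f(). The swap form also works on a C++11 compiler, but there the plain assignment is simpler to read.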

There is another option, which is to offer a more flexible interface to the user:

template <typename OutputIterator> void f(OutputIterator it) {
    // ... write elements to the iterator like this ...
    *it++ = 0;
    *it++ = 1;
}

You can then also support the existing vector-based interface on top of that:

std::vector<int> f() {
    std::vector<int> result;
    f(std::back_inserter(result));
    return result;
}

This might be less efficient than your existing code, if your existing code uses reserve() in a way more complex than just a fixed amount up front. But if your existing code basically calls push_back on the vector repeatedly, then this template-based code ought to be as good.
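Putting the two pieces together, a compilable sketch might look like this. The f_as_list wrapper is a hypothetical addition that shows the same generator feeding a different container, which is the flexibility the iterator-based interface buys.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Generic producer: writes its elements through any output iterator.
template <typename OutputIterator>
void f(OutputIterator it) {
    *it++ = 0;
    *it++ = 1;
}

// Vector-based interface built on top of the generic one.
std::vector<int> f() {
    std::vector<int> result;
    f(std::back_inserter(result));
    return result;
}

// Hypothetical second wrapper: the same producer fills a std::list.
std::list<int> f_as_list() {
    std::list<int> result;
    f(std::back_inserter(result));
    return result;
}
```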

It's time I posted an answer about RVO, too...

If you return an object by value, the compiler often optimizes this so that it doesn't get constructed twice, since it's superfluous to construct it in the function as a temporary and then copy it. This is called return value optimization: the object is constructed directly in the caller's storage instead of being copied.

If the compiler supports Named Return Value Optimization (http://msdn.microsoft.com/en-us/library/ms364057(v=vs.80).aspx), you can directly return the vector, provided that none of the following applies:

  1. Different paths return different named objects.
  2. There are multiple return paths (even if the same named object is returned on all paths) with EH states introduced.
  3. The named object returned is referenced in an inline asm block.

NRVO optimizes out the redundant copy constructor and destructor calls and thus improves overall performance.
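The first restriction in the list can be illustrated with a function that returns one of two named vectors. This is a sketch assuming a C++11 or later compiler; pick and buffer_was_handed_over are hypothetical names. Even where NRVO is not applied, the returned local is treated as an rvalue, so the buffer is moved rather than copied.

```cpp
#include <vector>

// Two different named objects are returned on different paths, which is
// the first case in the list above where NRVO generally does not apply.
std::vector<int> pick(bool first, const int** chosen_buffer) {
    std::vector<int> a(100, 1);
    std::vector<int> b(200, 2);
    if (first) {
        *chosen_buffer = a.data();
        return a;  // a local returned by name is moved if not elided
    }
    *chosen_buffer = b.data();
    return b;
}

// True when the caller receives the buffer that was allocated inside pick,
// i.e. the elements were not copied on the way out.
bool buffer_was_handed_over(bool first) {
    const int* inside = nullptr;
    std::vector<int> out = pick(first, &inside);
    return out.data() == inside;
}
```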

There should be no real difference in your example.

A common pre-C++11 idiom is to pass a reference to the object being filled.

Then there is no copying of the vector.

void f( std::vector<X> & result )
{
  /*
    Insert elements into result
  */
}

vector<string> getseq(char * db_file)

And if you want to print it in main(), you should do it in a loop.

int main(int argc, char * argv[]) {
     vector<string> str_vec = getseq(argv[1]);
     for(vector<string>::iterator it = str_vec.begin(); it != str_vec.end(); it++) {
         cout << *it << endl;
     }
}

Yes, return by value. The compiler can handle it automatically.

The following code works without invoking the copy constructor:

Your routine:

std::vector<unsigned char> foo()
{
    std::vector<unsigned char> v;
    v.resize(16, 0);

    return std::move(v); // explicit move; a plain "return v;" also moves and still permits NRVO
}

Afterwards, you can use foo to get the vector without copying it:

std::vector<unsigned char>&& moved_v(foo()); // binds to the returned temporary (lifetime extended); no copy is made

Result: moved_v has size 16 and is filled with 0.
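To see how an explicit std::move in the return statement compares with a plain return, the sketch below swaps the vector for a hypothetical Tracked type whose copy and move constructors count their calls. It is an illustration under that assumption, not the answer's own code.

```cpp
#include <utility>

// Hypothetical type that counts copy and move constructions.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

void reset_counters() {
    Tracked::copies = 0;
    Tracked::moves = 0;
}

Tracked plain_return() {
    Tracked t;
    return t;             // may be elided (NRVO); otherwise implicitly moved
}

Tracked explicit_move() {
    Tracked t;
    return std::move(t);  // always a move construction; NRVO cannot apply
}
```

Neither form copies. The explicit std::move always costs at least one move construction, whereas the plain return lets the compiler skip even that when it applies NRVO.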

As nice as "return by value" might be, it's the kind of code that can lead one into error. Consider the following program:

    #include <string>
    #include <vector>
    #include <iostream>
    using namespace std;
    static std::vector<std::string> strings;
    std::vector<std::string> vecFunc(void) { return strings; };
    int main(int argc, char * argv[]){
      // set up the vector of strings to hold however
      // many strings the user provides on the command line
      for(int idx=1; (idx<argc); ++idx){
         strings.push_back(argv[idx]);
      }

      // now, iterate the strings and print them using the vector function
      // as accessor
      for(std::vector<std::string>::iterator idx=vecFunc().begin(); (idx!=vecFunc().end()); ++idx){
         cout << "Addr: " << idx->c_str() << std::endl;
         cout << "Val:  " << *idx << std::endl;
      }
    return 0;
    };
  • Q: What will happen when the above is executed? A: A core dump (the behavior is undefined).
  • Q: Why didn't the compiler catch the mistake? A: Because the program is syntactically, although not semantically, correct.
  • Q: What happens if you modify vecFunc() to return a reference? A: The program runs to completion and produces the expected result.
  • Q: What is the difference? A: The compiler does not have to create and manage anonymous objects. The programmer has instructed the compiler to use exactly one object for the iterator and for endpoint determination, rather than two different objects as the broken example does.

The above erroneous program will indicate no errors even if one uses the GNU g++ reporting options -Wall -Wextra -Weffc++.

If you must produce a value, then the following would work in place of calling vecFunc() twice:

   std::vector<std::string> lclvec(vecFunc());
   for(std::vector<std::string>::iterator idx=lclvec.begin(); (idx!=lclvec.end()); ++idx)...

The above also produces no anonymous objects during iteration of the loop, but requires a possible copy operation (which, as some note, might be optimized away under some circumstances). But the reference method guarantees that no copy will be produced. Believing the compiler will perform RVO is no substitute for trying to build the most efficient code you can. If you can remove the need for the compiler to do RVO, you are ahead of the game.
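The two safe variants described above can be sketched side by side. The reference-returning accessor vecFuncRef and the two total_length helpers are hypothetical names added for illustration; each loop takes begin() and end() from one and the same vector object.

```cpp
#include <cstddef>
#include <string>
#include <vector>

static std::vector<std::string> strings;

// Returns a copy, as in the original program.
std::vector<std::string> vecFunc() { return strings; }

// Hypothetical accessor returning a reference to the one shared vector.
const std::vector<std::string>& vecFuncRef() { return strings; }

// Reference method: every call refers to the same object, so comparing
// the iterator against end() is well defined and nothing is copied.
std::size_t total_length_by_reference() {
    std::size_t total = 0;
    for (std::vector<std::string>::const_iterator idx = vecFuncRef().begin();
         idx != vecFuncRef().end(); ++idx) {
        total += idx->size();
    }
    return total;
}

// Value method: call vecFunc() once and iterate over that single local copy.
std::size_t total_length_by_local_copy() {
    std::vector<std::string> lclvec(vecFunc());
    std::size_t total = 0;
    for (std::vector<std::string>::iterator idx = lclvec.begin();
         idx != lclvec.end(); ++idx) {
        total += idx->size();
    }
    return total;
}
```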

   // Assumed to be a member function of some class (hence the const qualifier)
   vector<string> func1() const
   {
      vector<string> parts;
      return vector<string>(parts.begin(),parts.end()) ;
   } 

This is still efficient from C++11 onwards, as the compiler automatically uses a move (or elides it) instead of making a copy of the returned vector.
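One caveat is worth making concrete: the return itself is cheap, but building a new vector from an iterator range still copies each element once. The sketch below counts those copies with a hypothetical Counted type; func1 is written as a free function taking the source vector as a parameter, which is an assumption made so the example is self-contained.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type that counts its copy constructions.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

// Free-function variant of func1: the range constructor copies each
// element of parts once; returning the temporary adds no further copies.
std::vector<Counted> func1(const std::vector<Counted>& parts) {
    return std::vector<Counted>(parts.begin(), parts.end());
}

// Number of element copies made while producing and receiving the result.
int copies_made(std::size_t n) {
    std::vector<Counted> parts(n);  // n default-constructed elements
    Counted::copies = 0;
    std::vector<Counted> out = func1(parts);
    return out.size() == n ? Counted::copies : -1;
}
```

For n elements the count is n: all of it comes from the range constructor, none from the return.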
