C++ String Concatenation Optimizations

Question

Looking at a piece of code like this (comments added):

std::string some_var;
std::string some_func(); // both are defined, but definition is irrelevant
...
return "some text " + some_var + "c" + some_func(); // intentionally "c" not 'c'

I was wondering in which cases operator+ of std::string has to make a copy (in the sense of using copy-construction/assignment, not the internal buffer being copied, e.g. if SSO applies), and what actually gets copied. A quick look at cppreference was only partially helpful, as it lists 12(!) different cases. In part I am asking to confirm my understanding of the page:

Case 1) makes a copy of lhs then copies rhs to end of this copy
In C++98 Case 2) - 5) a temporary string is constructed from the char/const char* argument, which then results in case 1)
In C++11 Case 2) - 5) a temporary string is constructed from the char/const char* argument, which then results in case 6) or 7)
In C++11 Case 6) - 12) the r-value argument will be mutated with insert/append and, if a char/const char* argument was provided, no temporary is necessary due to the overloads on insert/append. In all cases an r-value is returned to facilitate further chaining. No copies are made (except the copy of the arguments to be appended/inserted at the insertion location). The contents of the string may need to be moved.

A chain like the example above should thus result in: 2) -> 6) -> 11) -> 8), with no copies of any lhs being made, but just modifications to the buffer of the r-value resulting from the first operation (creation of the temp-string).
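To make the chain concrete, here is a minimal sketch that unrolls it one operator+ at a time and labels each step by its operand types (the overload numbering is deliberately left out). The body of some_func() and the helper names build_chained and build_stepwise are made up for illustration; they are not part of the original code.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in: the real some_func() is not shown in the question.
std::string some_func() { return "tail"; }

// The expression as written in the question.
std::string build_chained(const std::string& some_var) {
    return "some text " + some_var + "c" + some_func();
}

// The same chain, one operator+ per statement.
std::string build_stepwise(const std::string& some_var) {
    // const char* + const std::string& : a new string is built here.
    std::string t1 = "some text " + some_var;
    // std::string&& + const char* : the text is appended to the r-value operand.
    std::string t2 = std::move(t1) + "c";
    // std::string&& + std::string&& : both operands are r-values.
    return std::move(t2) + some_func();
}
```

Both functions return the same text. The unrolled version names its intermediates, so it adds move constructions that the chained expression does not need; it is only meant to show which operand is an r-value at each step.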

Therefore this seems to be as efficient as operator+=, once operator+ uses at least one r-value argument. Is this correct, and is there any point in using operator+= over operator+ in C++11 and later, unless both arguments are l-value strings?

What optimizations can the compiler make in addition?

Edit: clarifying the intent of the question. The initial part is about the specifics of the language only (implementation notwithstanding); the last question is about additional optimizations.

Answer 1

A string is a rather opaque object: it holds an internal char buffer and manages it the way it wants. Adding a single character to a string may end in the allocation of a new buffer, a copy of the initial part and a copy of the added part. It all depends on whether the allocated buffer is large enough to accept the added part.

The quotation says:

... No copies are made (except the copy of the arguments to be appended/inserted at the insertion location). The contents of the string may need to be moved.

Put differently: "moved" here can mean a new allocation, a full copy of the existing contents and deallocation of the old buffer...
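A rough way to see this is to count how often capacity() changes while appending. This is only a sketch: the helper name count_capacity_changes is made up, a capacity change is used as a proxy for "a new buffer was allocated", and the exact counts depend on the standard library's growth policy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends n characters one at a time and counts how often capacity() changes.
// A change in capacity is used as a proxy for a reallocation of the buffer.
std::size_t count_capacity_changes(std::size_t n, bool reserve_first) {
    std::string s;
    if (reserve_first) {
        s.reserve(n);  // ask for room for all n characters up front
    }
    std::size_t changes = 0;
    std::size_t last = s.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        s += 'x';
        if (s.capacity() != last) {
            ++changes;
            last = s.capacity();
        }
    }
    return changes;
}
```

On common implementations count_capacity_changes(1000, false) reports several changes, while count_capacity_changes(1000, true) reports none, because the buffer is already large enough for every append.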

And when you speak of efficiency and optimization, you must remember that the compiler does not have to follow the way you have written the program. Because of the as-if rule, it can optimize the way it wants, provided the observable behaviour is respected. The C++ standard says:

1.9 Program execution [intro.execution]
...
5 A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input.

A note even explains that:

an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program.

So it is likely that a = a + b; and a += b; are compiled to exactly the same code.
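For reference, the two forms look like this side by side. The function names are made up for illustration. The sketch only shows that both forms yield the same string; whether the generated code is really identical depends on the compiler and its settings, and can only be checked by inspecting the output of your own toolchain.

```cpp
#include <cassert>
#include <string>

// a = a + b : as written, builds a temporary from a and b, then assigns it to a.
std::string append_via_plus(std::string a, const std::string& b) {
    a = a + b;
    return a;
}

// a += b : appends b to a in place.
std::string append_via_plus_equals(std::string a, const std::string& b) {
    a += b;
    return a;
}
```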

You should never worry about low-level optimizations when you write a C++ program: the compiler will take care of them, and it is commonly said that the compiler is smarter than you. Only go that way when you have identified a real bottleneck, and be aware that a low-level optimization is only valid for one compiler on one architecture and one configuration.
