Is it safe to append std::string to itself?

Consider code like this:

std::string str = "abcdef";
const size_t num = 50;

const size_t baselen = str.length();
while (str.length() < num)
    str.append(str, 0, baselen);

Is it safe to call std::basic_string<T>::append() on the string itself like this? Can't the source memory be invalidated if the buffer is enlarged before the copy operation?

I could not find anything in the standard specific to that method. It says the above is equivalent to str.append(str.data(), baselen), which I think might not be entirely safe unless append(const char*, size_t) itself detects such cases.

I checked a few implementations and they seemed safe one way or another, but my question is whether this behavior is guaranteed. For example, "Appending std::vector to itself, undefined behavior?" says it is not guaranteed for std::vector.

According to §21.4.6.2/§21.4.6.3:

The function [ basic_string& append(const charT* s, size_type n); ] replaces the string controlled by *this with a string of length size() + n whose first size() elements are a copy of the original string controlled by *this and whose remaining elements are a copy of the initial n elements of s.

Note: This applies to every append call, as every append can be implemented in terms of append(const charT*, size_type), as defined by the standard (§21.4.6.2/§21.4.6.3).

So basically, for a call such as str.append(str2, pos, n), append makes a copy of str (let's call the copy strtemp), appends n characters of str2 to strtemp, and then replaces str with strtemp.

When str2 is str itself, nothing changes: the string is enlarged only when the temporary copy is assigned, not before.
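The copy-then-replace reading described above can be sketched as follows. This is only a model of the wording quoted from the standard, not how any particular library implements append; the helper name append_via_temporary is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the standard library): models the
// "build a new string, then replace *this" wording quoted above.
std::string& append_via_temporary(std::string& dst, const char* s, std::size_t n) {
    assert(s != nullptr || n == 0);  // s must point to at least n characters

    std::string temp;
    temp.reserve(dst.size() + n);
    temp.append(dst);    // first size() elements: a copy of the original string
    temp.append(s, n);   // remaining elements: the first n characters of s
                         // (dst has not been modified yet, so s is still valid
                         // even when it points into dst)
    dst.swap(temp);      // only now is dst replaced
    return dst;
}
```

Under this model, a call such as append_via_temporary(str, str.data(), str.size()) reads the source characters before str is touched, which is why aliasing causes no trouble here.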

Even though it is not explicitly stated in the standard, it is guaranteed (if the implementation is exactly as stated in the standard) by the definition of std::basic_string<T>::append .

Thus, this is not undefined behavior.

This is complicated.

One thing can be said for certain. If you use iterators:

std::string str = "abcdef";
str.append(str.begin(), str.end());

then you are guaranteed to be safe. Yes, really. Why? Because the specification states that the behavior of the iterator functions is equivalent to calling append(basic_string(first, last)). That obviously creates a temporary copy of the string. So if you need to append a string to itself, you're guaranteed to be able to do it with the iterator form.
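As a small illustration of that equivalence, the two functions below should produce the same result: one uses the iterator overload directly, the other spells out the temporary copy by hand. Both function names are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <string>

// Iterator form: specified as equivalent to append(basic_string(first, last)).
std::string self_append_with_iterators(std::string str) {
    str.append(str.begin(), str.end());
    return str;
}

// The same operation with the temporary copy written out explicitly.
std::string self_append_with_explicit_copy(std::string str) {
    const std::string copy(str.begin(), str.end());  // independent buffer
    str.append(copy);
    assert(str.size() == 2 * copy.size());  // original contents plus one copy
    return str;
}
```

The explicit-copy version never reads from a buffer that is being modified, so it can serve as a fallback whenever the aliasing rules of a particular overload are in doubt.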

Granted, implementations don't have to actually copy it. But they do need to respect the standard specified behavior. An implementation could choose to make a copy only if the iterator range is inside of itself, but the implementation would still have to check.

All of the other forms of append are defined to be equivalent to calling append(const charT* s, size_type n). That is, your call to append above is equivalent to append(str.data(), baselen). So what does the standard say about what happens if s points inside of *this?

Nothing at all.

The only requirement of s is:

s points to an array of at least n elements of charT .

Since it does not expressly forbid s pointing into *this, it must be allowed. It would also be exceedingly strange if the iterator version allowed self-append but the pointer-and-size version did not.
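For comparison, here is the loop from the question wrapped in a function, next to a variant that sidesteps the aliasing question by appending from an independent copy of the base string. Both function names are hypothetical. The first variant relies on the reading given above (self-append through the pointer-and-size path is allowed); the second does not depend on it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The question's loop: appends the first baselen characters of str to itself.
// Its safety rests on the interpretation of the standard discussed above.
std::string repeat_self_append(std::string str, std::size_t num) {
    assert(!str.empty() || num == 0);  // an empty base would loop forever
    const std::size_t baselen = str.length();
    while (str.length() < num)
        str.append(str, 0, baselen);
    return str;
}

// Defensive variant: the source is a separate string, so it never aliases str.
std::string repeat_from_copy(std::string str, std::size_t num) {
    assert(!str.empty() || num == 0);  // an empty base would loop forever
    const std::string base = str;      // independent copy of the base
    while (str.length() < num)
        str.append(base);
    return str;
}
```

With the question's inputs ("abcdef" and 50), both loops stop at the first length of at least 50 that is a multiple of 6, which is 54 characters.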
