Is it safe to append std::string to itself?

Consider code like this:

std::string str = "abcdef";
const size_t num = 50;

const size_t baselen = str.length();
while (str.length() < num)
    str.append(str, 0, baselen);

Is it safe to call std::basic_string<T>::append() on the string itself like this? Couldn't the source memory be invalidated if the string is enlarged before the copy takes place?

I could not find anything in the standard specific to that method. It says the above is equivalent to str.append(str.data(), baselen), which I think might not be entirely safe unless append(const char*, size_t) itself detects such cases.

I checked a few implementations and they seemed safe one way or another, but my question is whether this behavior is guaranteed. For example, "Appending std::vector to itself, undefined behavior?" says it is not guaranteed for std::vector.

According to §21.4.6.2/§21.4.6.3:

The function [ basic_string& append(const charT* s, size_type n); ] replaces the string controlled by *this with a string of length size() + n whose first size() elements are a copy of the original string controlled by *this and whose remaining elements are a copy of the initial n elements of s.

Note: This applies to every append call, as every append can be implemented in terms of append(const charT*, size_type), as defined by the standard (§21.4.6.2/§21.4.6.3).

So basically, append makes a copy of str (let's call the copy strtemp), appends n characters of str2 to strtemp, and then replaces str with strtemp.

When str2 is str itself, nothing changes: the string is enlarged only when the temporary copy is assigned back, not before, so the source characters are still intact when they are read.

Even though the standard does not state it explicitly, this is guaranteed by the definition of std::basic_string<T>::append (provided the implementation behaves exactly as the standard specifies).

Thus, this is not undefined behavior.
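The copy-then-replace reading described above can be sketched as follows. This is only a literal model of the quoted wording, not how any particular library implements append; the helper name append_as_specified is made up for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard function): models the quoted wording
// literally by building the result in a temporary before touching self.
void append_as_specified(std::string& self, const char* s, std::size_t n) {
    std::string temp;
    temp.reserve(self.size() + n);
    temp.append(self);   // first size() elements: a copy of the original string
    temp.append(s, n);   // remaining elements: the initial n elements of s
    self.swap(temp);     // only now is the original buffer replaced
}
```

Calling append_as_specified(str, str.data(), str.size()) on "abcdef" leaves str holding "abcdefabcdef": s is read in full before the buffer it points into is released.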

This is complicated.

One thing can be said for certain. If you use iterators:

std::string str = "abcdef";
str.append(str.begin(), str.end());

then you are guaranteed to be safe. Yes, really. Why? Because the specification states that the behavior of the iterator functions is equivalent to calling append(basic_string(first, last)). That obviously creates a temporary copy of the string. So if you need to append a string to itself, you are guaranteed to be able to do it with the iterator form.
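As a small illustration of that equivalence, the two functions below perform the self-append through the iterator overload and through an explicit temporary. The function names are invented for this sketch.

```cpp
#include <string>

// Self-append through the iterator overload.
std::string append_via_iterators(std::string str) {
    str.append(str.begin(), str.end());
    return str;
}

// The form the iterator overload is described as equivalent to: the range
// is first materialized as a temporary string, then appended.
std::string append_via_temporary(std::string str) {
    str.append(std::string(str.begin(), str.end()));
    return str;
}
```

Both functions should return the same result for any input, since the second one spells out the temporary that the first is specified in terms of.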

Granted, implementations don't have to actually copy it. But they do need to respect the behavior the standard specifies. An implementation could choose to make a copy only when the iterator range lies inside the string itself, but it would still have to check.
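The kind of check mentioned here might look roughly like the sketch below, written for the pointer-and-size case. Both helper names are hypothetical, and real library internals may differ; the sketch assumes that std::less, which provides a total order over pointers, is enough to decide whether a pointer lies inside the string's buffer.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: true if p points into self's current character buffer.
// std::less is used because it yields a total order even for pointers that
// do not belong to the same array.
bool points_into(const std::string& self, const char* p) {
    const char* first = self.data();
    const char* last = first + self.size();
    std::less<const char*> before;
    return !before(p, first) && before(p, last);
}

// Hypothetical helper: append [s, s + n), copying the source first only
// when it aliases the destination.
void append_checked(std::string& self, const char* s, std::size_t n) {
    if (n != 0 && points_into(self, s)) {
        const std::string copy(s, n);  // detach from self before it can reallocate
        self.append(copy);
    } else {
        self.append(s, n);
    }
}
```

The copy is made only in the aliasing case, so appending from an unrelated buffer pays nothing beyond the two pointer comparisons.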

All of the other forms of append are defined to be equivalent to calling append(const charT *s, size_t len). That is, your call to append above is equivalent to you doing append(str.data(), str.size()). So what does the standard say about what happens if s is inside of *this?

Nothing at all.

The only requirement on s is:

s points to an array of at least n elements of charT.

Since the standard does not expressly forbid s from pointing into *this, it must be allowed. It would also be exceedingly strange if the iterator version allowed appending a string to itself but the pointer-and-size version did not.
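To try the pointer-and-size form on a particular implementation, the question's loop can be wrapped in a function like the one below. The function name is made up, and a correct result on one standard library only shows that this library handles the aliasing; it does not settle what the standard guarantees.

```cpp
#include <cstddef>
#include <string>

// Repeats the question's loop using the pointer-and-size overload. Each pass
// appends the first baselen characters of str to str itself, which will
// typically force at least one reallocation along the way.
std::string repeat_to_at_least(std::string str, std::size_t num) {
    const std::size_t baselen = str.length();
    if (baselen == 0) return str;  // nothing to repeat; avoid an endless loop
    while (str.length() < num)
        str.append(str.data(), baselen);
    return str;
}
```

With "abcdef" and 50 the loop grows the string in steps of 6 and stops at the first length that is not below 50, which is 54.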
