strange behavior of std::string assign,clear and operator[]

I am observing some strange behavior of string operation.

Ex :

#include <iostream>
#include <string>

int main()
{
  std::string name("ABCDEFGHIJ");

  std::cout << "Hello, " << name << "!\n";
  name.clear();
  std::cout << "Hello, " << name << "!\n";
  name.assign("ABCDEF",6);
  std::cout << "Hello, " << name << "!\n";
  std::cout << "Hello, " << name[8] << "!\n";
}

Output:

Hello, ABCDEFGHIJ!
Hello, !
Hello, ABCDEF!
Hello, I!

string::clear does not seem to actually clear the string, because I can still access the old data after calling it. According to the documentation, accessing something out of bounds gives an undefined result, but here I get the same result every time. Can somebody explain how this works at the memory level when we call clear or operator[]?

Welcome to C++'s amazing attraction called "undefined behavior".

When name contains a six-character string, "ABCDEF", name[8] attempts to access a nonexistent member of the string, which is undefined behavior.

This means that the results of this operation are completely meaningless.

The C++ standard does not define the result of accessing a nonexistent member character of the string; hence the undefined behavior. The potential results of this operation can be:

  1. Some previous value that was in the string, at the given position.

  2. Some garbage, random character.

  3. Your program crashes.

  4. Anything else.

  5. A result that's different every time you execute the program, selected from options 1 through 4.

name.assign("ABCDEF",6);

Now the string has length 6. So you may legally only access elements 0 through 5.

std::cout << "Hello, " << name[8] << "!\n";

Therefore this is Undefined Behaviour. The compiler is free to do whatever the hell it pleases. Not just with the statement, but with the whole program, even the preceding lines!

This time, it returned the character that used to be at that position earlier. It could have returned anything else, it could have crashed, it could have skipped that statement altogether, it could have skipped the assignment, and many other funny things (up to and including making daemons fly out of your nose!).

And I am saying that because all that behaviour (except the daemons) can be actually observed in the wild in various circumstances.

As others said, accessing a std::string outside its logical boundaries (i.e. [0, size()]; notice that size() is included) is undefined behavior, so the compiler can make anything happen.
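
To illustrate the boundary mentioned above: since C++11, reading the element at index size() through operator[] is well-defined and yields the null character, while any larger index is undefined. A minimal sketch, where char_at_end is a hypothetical helper name introduced only for this example:

```cpp
#include <string>

// Hypothetical helper: reads the element at index size().
// Since C++11 this is well-defined for operator[] and yields '\0'.
// Any index greater than size() would be undefined behavior.
char char_at_end(const std::string& s) {
    return s[s.size()];
}
```

Calling char_at_end("ABCDEF") reads index 6, which is still inside the allowed range; index 8 from the question is not.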

Now, the particular flavor of UB you are seeing is nothing particularly unexpected.

clear() just zeroes the logical length of the string, but the memory that it used is retained. (The standard does not explicitly require this for std::string, but mainstream implementations behave this way, and quite a lot of code would run much slower without it.)

Given that there's no good reason to waste time in zeroing out the old data, by accessing the string out of bounds you are seeing what was at that index previously.
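
The effect of clear() on size and capacity can be checked without touching anything out of bounds. The sketch below is only an illustration: ClearReport and clear_and_report are hypothetical names, and the expectation that the capacity stays the same is an assumption about common implementations (libstdc++, libc++, MSVC), not something this example proves for every library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical record of a string's size and capacity around clear().
struct ClearReport {
    std::size_t size_before;
    std::size_t capacity_before;
    std::size_t size_after;
    std::size_t capacity_after;
};

// Hypothetical helper: takes the string by value, clears the copy,
// and reports what changed.
ClearReport clear_and_report(std::string s) {
    ClearReport r{};
    r.size_before = s.size();
    r.capacity_before = s.capacity();
    s.clear();                        // logical length becomes 0
    r.size_after = s.size();
    r.capacity_after = s.capacity();  // assumed unchanged on common implementations
    return r;
}
```

For clear_and_report("ABCDEFGHIJ"), size_before is 10 and size_after is 0; comparing capacity_before with capacity_after shows whether the buffer was kept.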

This may change if you, e.g., call the shrink_to_fit() method after clear(), which asks the string to free all the extra memory it's keeping (the request is non-binding, so the implementation may ignore it).
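
A rough way to observe this is to compare capacities before and after the call. In the sketch below, capacity_after_shrink is a hypothetical helper name; because shrink_to_fit() is a non-binding request, the only thing assumed is that the capacity does not grow.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: clears a copy of the string, asks it to release
// unused memory, and returns the capacity that remains.
std::size_t capacity_after_shrink(std::string s) {
    s.clear();
    s.shrink_to_fit();  // non-binding request to reduce capacity() to size()
    return s.capacity();
}
```

With a long input such as std::string(1000, 'x'), implementations that honor the request typically return a much smaller capacity; short strings may show no change, since many implementations keep a small internal buffer.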

I'd like to add to the other answers that you can use std::string::at instead of using the operator[] .

std::string::at does boundary checking and will throw a std::out_of_range when you try to access an element that is out of range.
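
As a small sketch of that difference, the hypothetical helper at_throws below reports whether at() rejects a given index. Note that at() also rejects index size(), unlike operator[].

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true if s.at(index) throws std::out_of_range.
bool at_throws(const std::string& s, std::size_t index) {
    try {
        (void)s.at(index);  // bounds-checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With the six-character string from the question, at_throws("ABCDEF", 8) is true, so the bug shows up as an exception instead of a stale character.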

I ran your code through a debugger. Take note of the capacity of the string: it is still 15, because assign did not change the capacity. So you won't get a "garbage" value, as everyone is saying; you're getting the exact same data, which is stored in the same location. As stated, the string is just a pointer to a memory address, and indexing moves a fixed number of bytes from it to reach the element. name[8] uses a constant index, so it goes to the exact same memory location every time. [Image: the string as shown in the debugger]
