
Does std::string need to store its characters in a contiguous piece of memory?

I know that in C++98, neither std::basic_string<> nor std::vector<> were required to use contiguous storage. This was seen as an oversight for std::vector<> as soon as it was pointed out, and, if I remember correctly, got fixed with C++03.

I seem to remember reading about discussions on requiring std::basic_string<> to use contiguous storage back when C++11 was still called C++0x, but I didn't follow them closely at the time and am still restricted to C++03 at work, so I am not sure what became of it.

So is std::basic_string<> required to use contiguous storage? (If so, then which version of the standard required it first?)

In case you wonder: This is important if you have code passing the result of &str[0] to a function expecting a contiguous piece of memory to write to. (I know about str.data() , but for obvious reasons old code doesn't use it.)

The C++11 standard, 21.4.1/5 (basic_string general requirements), says:

The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().
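The identity quoted above can be checked directly. The following is only a minimal sketch; `stored_contiguously` is a hypothetical helper name, not something the standard library provides. On a C++11 (or later) implementation it should return true for every string; on C++03 the standard makes no such promise.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: tests &*(s.begin() + n) == &*s.begin() + n
// for every n with 0 <= n < s.size().
bool stored_contiguously(const std::string& s) {
    typedef std::string::difference_type diff_t;
    if (s.empty()) {
        return true;  // no elements to compare; also avoids dereferencing begin()
    }
    const char* first = &*s.begin();
    for (std::size_t n = 0; n < s.size(); ++n) {
        if (&*(s.begin() + static_cast<diff_t>(n)) != first + n) {
            return false;
        }
    }
    return true;
}
```

Calling `stored_contiguously(std::string(1000, 'x'))` exercises a heap-allocated string, while a short literal such as `"hello"` typically exercises the small-string buffer on implementations that have one.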

In C++03 there was no guarantee that the elements of the string are stored contiguously. [basic.string] read:

  1. For a char-like type charT, the class template basic_string describes objects that can store a sequence consisting of a varying number of arbitrary char-like objects (clause 21). The first element of the sequence is at position zero. Such a sequence is also called a “string” if the given char-like type is clear from context. In the rest of this clause, charT denotes such a given char-like type. Storage for the string is allocated and freed as necessary by the member functions of class basic_string, via the Allocator class passed as template parameter. Allocator::value_type shall be the same as charT.
  2. The class template basic_string conforms to the requirements of a Sequence, as specified in (23.1.1). Additionally, because the iterators supported by basic_string are random access iterators (24.1.5), basic_string conforms to the requirements of a Reversible Container, as specified in (23.1).
  3. In all cases, size() <= capacity().

And then in C++17 they changed it to:

  1. The class template basic_string describes objects that can store a sequence consisting of a varying number of arbitrary char-like objects with the first element of the sequence at position zero. Such a sequence is also called a “string” if the type of the char-like objects that it holds is clear from context. In the rest of this Clause, the type of the char-like objects held in a basic_string object is designated by charT.
  2. The member functions of basic_string use an object of the Allocator class passed as a template parameter to allocate and free storage for the contained char-like objects.
  3. A basic_string is a contiguous container (23.2.1).
  4. In all cases, size() <= capacity().

(Emphasis mine, on paragraph 3.)

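So C++03 did not guarantee it, but later standards do: C++11 already requires contiguous storage (21.4.1/5), and C++17 states it directly by calling basic_string a contiguous container.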

Given the constraints that std::string::data imposes, the missing guarantee in C++03 is almost moot: calling std::string::data gives you a contiguous array of the characters in the string. So unless the implementation builds that array on demand, and does so in constant time, the string will be stored contiguously.
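To see whether data() hands back the string's own storage rather than a copy built on demand, one can compare its pointer with the element addresses. This is a sketch only; `data_aliases_elements` is a hypothetical name. From C++11 on, data() is specified to return a pointer p with p + i == &operator[](i), so the check should always succeed there; under C++03 it merely reports what a given implementation happens to do.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if data() + i is the address of s[i]
// for every i < s.size(), i.e. data() exposes the actual storage.
bool data_aliases_elements(const std::string& s) {
    const char* p = s.data();
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (p + i != &s[i]) {
            return false;
        }
    }
    return true;
}
```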


Regarding the remark in the question about passing the result of &str[0] to a function that expects a contiguous piece of memory to write to:

The behavior of operator[] has changed as well. In C++03 we had

Returns: If pos < size(), returns data()[pos]. Otherwise, if pos == size(), the const version returns charT(). Otherwise, the behavior is undefined.

So only the const version was guaranteed to have defined behavior if you tried &s[0] when s is empty. In C++11 they changed it to:

Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior.

So now both the const and non const versions have defined behavior if you tried &s[0] when s is empty.
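The pattern from the question might look like the sketch below. `legacy_fill` is a hypothetical stand-in for an old C-style function that writes into a caller-supplied buffer, and `read_via_legacy_api` is likewise an illustrative name. The string is sized first so that the write stays within size(), and &s[0] is only written through when the string is non-empty. Under C++03 this relies on the implementation storing the characters contiguously, which the standard did not promise.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical legacy API: writes at most `cap` bytes into `buf`
// and returns the number of bytes written.
std::size_t legacy_fill(char* buf, std::size_t cap) {
    const char msg[] = "hello";
    std::size_t n = sizeof msg - 1;  // exclude the terminating '\0'
    if (n > cap) {
        n = cap;
    }
    std::memcpy(buf, msg, n);
    return n;
}

std::string read_via_legacy_api(std::size_t cap) {
    std::string s(cap, '\0');  // size(), not just capacity(), must cover the write
    if (!s.empty()) {
        // Only write through &s[0] when there is at least one element.
        const std::size_t written = legacy_fill(&s[0], s.size());
        s.resize(written);  // trim to what was actually written
    }
    return s;
}
```

Using reserve() instead of the sizing constructor would not be enough here, because writing past size() is undefined behavior even when capacity() is large enough.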

According to the draft standard N4527, 21.4/3 Class template basic_string [basic.string]:

A basic_string is a contiguous container (23.2.1).
