
Memory allocation for strings in vectors

If a vector always provides contiguous memory storage, how does the compiler allocate memory to empty std::strings?

I have a vector into which I've pushed a number of objects of a class that has a std::string as a private member. I then pass a reference to the vector as an argument to another method.

Is the string's data stored elsewhere on the heap and referenced from the vector's contiguous array?

Allocating memory for a std::string object is trivial.

Internally, it'll have some sort of pointer that points to a block of memory in which the actual string data will be stored. So, allocating memory for a std::string is simply a matter of allocating space for a pointer, a size_t or something, and maybe a couple more primitives.

If you have a std::vector<std::string>, for example, it's easy for the vector to allocate space for the std::string objects because they're just k bytes each for some constant k. The string data will not be involved in this allocation.
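A minimal sketch of that point, assuming nothing beyond the standard library: the byte distance between neighbouring elements of the vector's array is sizeof(std::string), however long each string's text is. The helper names element_stride and make_sample are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Byte distance between two adjacent elements of the vector's array.
// Precondition: i + 1 < v.size().
std::ptrdiff_t element_stride(const std::vector<std::string>& v, std::size_t i)
{
    assert(i + 1 < v.size());
    const char* a = reinterpret_cast<const char*>(&v[i]);
    const char* b = reinterpret_cast<const char*>(&v[i + 1]);
    return b - a;
}

// Builds a vector holding an empty, a short, and a long string.
std::vector<std::string> make_sample()
{
    return { std::string(), std::string("hi"), std::string(1000, 'x') };
}
```

Calling element_stride(v, 0) and element_stride(v, 1) on the result of make_sample() should give the same value, sizeof(std::string), even though the three strings hold 0, 2 and 1000 characters.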

The details of what really happens in memory in this case depend heavily on the specific STL implementation you're using.

Having said that, my impression is that in most implementations vector and string are implemented with something like (very simplified):

template<typename T>
class vector
{
  //...
  private:
    T* _data;
};

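class string
{
  private:
    static const int kSmallSize = 16;      // illustrative value only; the real threshold is implementation-dependent
    char _smallStringsBuffer[kSmallSize];  // short strings are stored here, inside the object itself
    char* _bigStringsBuffer;               // longer strings live in a separately allocated heap block
};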

The vector's data is dynamically allocated on the heap based on its capacity. The capacity of a default-constructed vector is implementation-dependent (commonly zero, with nothing allocated until the first element is added), and it grows as you add elements to the vector.
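To see the capacity grow, here is a rough sketch that counts how often capacity() changes while elements are pushed. The function name count_capacity_changes is invented for this example, and the exact counts depend on the library's growth policy, so no specific number should be expected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pushes n strings and returns how many times the vector's capacity changed
// along the way. Each change means the element array was moved to a new,
// larger heap block. With reserve_first set, room for n elements is
// requested up front.
std::size_t count_capacity_changes(std::size_t n, bool reserve_first)
{
    std::vector<std::string> v;
    if (reserve_first) {
        v.reserve(n);
    }
    std::size_t changes = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back("item");
        assert(v.capacity() >= v.size());
        if (v.capacity() != last_capacity) {
            ++changes;
            last_capacity = v.capacity();
        }
    }
    return changes;
}
```

With reserve_first set the function returns 0, because reserve(n) guarantees room for n elements; without it, the count is small compared to n since implementations grow the capacity geometrically.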

The string's data is stored inline, inside the string object itself, for small strings (the value of "small" is implementation-dependent), and is dynamically allocated on the heap once the string becomes bigger. This is commonly called the small string optimization; it exists for a number of reasons, but mostly to allow more efficient handling of small strings.
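One way to probe this on your own toolchain is to check whether data() points into the bytes of the string object itself. This is only a sketch: is_stored_inline and check_long_string_is_external are hypothetical helper names, and the result for short strings is not guaranteed by the standard, so it may differ between libraries.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Returns true if the character data of s lives inside the bytes of the
// std::string object itself, i.e. no separate heap block is used for it.
bool is_stored_inline(const std::string& s)
{
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* data = s.data();
    // std::less gives a well-defined total order even for unrelated pointers.
    std::less<const char*> before;
    return !before(data, obj_begin) && before(data, obj_end);
}

// Sanity check: a string several times longer than the object itself cannot
// fit inside it, so its characters must live elsewhere.
void check_long_string_is_external()
{
    const std::string big(4 * sizeof(std::string), 'x');
    assert(!is_stored_inline(big));
}
```

On implementations that use a small buffer, is_stored_inline(std::string("hi")) is likely to return true, but that is an observation about a particular library rather than something to rely on.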

The example you described is something like:

void MyFunction(const vector<string>& myVector)
{
  // ...
}

int main()
{
  vector<string> v = ...;

  // ...

  MyFunction(v);

  // ...

  return 0;
}

In this particular case only the vector's bookkeeping data (the _data pointer and the like) will be on the stack, while the array that v._data points to will be allocated on the heap. If v has capacity N, that array will occupy sizeof(string) * N bytes on the heap, where sizeof(string) is a constant of roughly kSmallSize * sizeof(char) + sizeof(char*), plus any padding, based on the definition of the string above.
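As a rough illustration of that arithmetic with the real standard types, the following sketch computes the size of the element array and checks that a const reference parameter is just another name for the caller's vector. Both function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Bytes occupied by the vector's heap-allocated element array. This counts
// only the string objects themselves, not the character data that long
// strings keep in their own separate heap blocks.
std::size_t element_array_bytes(const std::vector<std::string>& v)
{
    assert(v.capacity() >= v.size());
    return v.capacity() * sizeof(std::string);
}

// Taking the vector by const reference copies nothing: the parameter refers
// to the caller's object, so both have the same address.
bool refers_to(const std::vector<std::string>& param, const void* caller_object)
{
    return static_cast<const void*>(&param) == caller_object;
}
```

For a vector v declared in main, refers_to(v, &v) is true, and element_array_bytes(v) is at least v.size() * sizeof(std::string).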

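As for contiguous data: the string objects themselves are always contiguous in the vector's array. Their character data will be "almost" contiguous in memory only if every string in the vector has fewer than kSmallSize characters, because then each string's characters sit inside its own object, separated from the next string's characters only by the objects' other members. Longer strings keep their characters in separate heap blocks that the objects point to.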

This is an important consideration for performance-critical code, but to be honest I don't think most people would rely on the standard library's vectors and strings in such situations, as the implementation details can change over time and across platforms and compilers. Furthermore, whenever your code leaves the "fast" path, the only sign will be latency spikes that are hard to keep in check.
