
Get size of a std::string's string in bytes

I would like to get the number of bytes that a std::string's contents occupy in memory, not the number of characters. The string contains a multibyte string. Would std::string::size() do this for me?

EDIT: Also, does size() include the terminating null character?

std::string operates on bytes, not on Unicode characters, so std::string::size() will indeed return the size of the data in bytes (without the overhead that std::string needs to store the data, of course).

No. std::string stores only the data you tell it to store, and size() counts only that data; the terminating null character that c_str() exposes is not part of it. A null character is included in the size only if you explicitly create a string whose data contains one.
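Both points can be checked with a small sketch. The helper names below are made up for illustration, and the UTF-8 bytes are written as explicit escapes so the result does not depend on the source file's encoding.

```cpp
#include <cstddef>
#include <string>

// "é" encoded as UTF-8 is the two bytes 0xC3 0xA9, so size() reports 2.
inline std::size_t utf8EAcuteSize()
{
    const std::string s = "\xC3\xA9";
    return s.size();
}

// Built from a C string: the literal's terminating '\0' is not stored as data.
inline std::size_t plainSize()
{
    const std::string s = "abc";
    return s.size();
}

// Built with an explicit length that covers the '\0': it becomes part of the data.
inline std::size_t withExplicitNullSize()
{
    const std::string s("abc\0", 4);
    return s.size();
}
```

Under these assumptions the three helpers return 2, 3 and 4 respectively: one non-ASCII character counts as two bytes, and the null is counted only when it was stored on purpose.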

You could be pedantic about it:

std::string x("X");

std::cout << x.size() * sizeof(std::string::value_type);

But std::string::value_type is char and sizeof(char) is defined as 1.

This only becomes important if you typedef the string type (because it may change in the future or because of compiler options).

// Some header file:
typedef std::basic_string<T_CHAR> T_string;

// Source a million miles away
T_string x("X");

std::cout << x.size() * sizeof(T_string::value_type);

std::string::size() is indeed the size in bytes.

To get the amount of memory in use by the string, you would have to add capacity() to the overhead used for management. Note that it is capacity() and not size(). The capacity determines the number of characters (charT) allocated, while size() tells you how many of them are actually in use.

In particular, std::string implementations don't usually shrink the storage to fit the contents, so if you create a string and then remove elements from the end, size() will decrease, but in most cases (this depends on the implementation) capacity() will not.

Some implementations might not allocate the exact amount of memory required, but rather obtain blocks of given sizes to reduce memory fragmentation. In an implementation that used power-of-two-sized blocks for strings, a string of size 17 could be allocating as many as 32 characters.
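The difference between size() and capacity() can be sketched as follows. The two helper names are hypothetical, and the "+ 1" assumes the implementation keeps one extra slot for the terminator beyond capacity(), which is common but not something the estimate can verify. Allocator rounding and the sizeof(std::string) object itself (which may hold short strings inline) are deliberately left out.

```cpp
#include <cstddef>
#include <string>

// Rough estimate of the bytes reserved for character storage.
// Assumption: one slot beyond capacity() is kept for the terminator.
inline std::size_t reservedCharBytes(const std::string& s)
{
    return (s.capacity() + 1) * sizeof(std::string::value_type);
}

// Number of allocated character slots that are not currently in use.
inline std::size_t unusedChars(const std::string& s)
{
    return s.capacity() - s.size();
}
```

For example, after `std::string s(1000, 'x'); s.resize(3);` the size is 3, while capacity() typically stays at or above 1000, so unusedChars(s) shows how much of the allocation is idle. The exact figures vary between standard library implementations.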

Yes, size() will give you the number of char elements in the string. One character in a multibyte encoding takes up multiple chars.
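If the string happens to hold UTF-8 (an assumption, since std::string itself does not know its encoding), the gap between bytes and characters can be made visible by counting code points separately. The function name below is hypothetical, and the sketch performs no validation of the input.

```cpp
#include <cstddef>
#include <string>

// Counts UTF-8 code points by skipping continuation bytes (bit pattern 10xxxxxx).
// Assumes `s` holds well-formed UTF-8; malformed input gives a meaningless count.
inline std::size_t countUtf8CodePoints(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the bytes of "héllo" in UTF-8 (`"h\xC3\xA9llo"`), size() is 6 while countUtf8CodePoints returns 5. Note that code points are still not the same as user-perceived characters when combining marks are involved.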

There is an inherent conflict in the question as written: std::string is defined as std::basic_string<char,...>, that is, its element type is char (1 byte), but later you stated "the string contains a multibyte string" (does "multibyte" mean wchar_t?).

The size() member function does not count a trailing null. Its value represents the number of characters (elements of the string's character type), not bytes.

Assuming you intended to say your multibyte string is a std::wstring (alias for std::basic_string<wchar_t,...>), the memory footprint of the std::wstring's characters, including the null terminator, is:

std::wstring myString;
 ...
size_t bytesCount = (myString.size() + 1) * sizeof(wchar_t);

It's instructive to consider how one would write a reusable function template that works for any instantiation of std::basic_string<>, like this**:

// Return the number of bytes occupied by the characters of inString,
// optionally counting the null terminator that c_str() appends.
template <typename Elem>
inline std::size_t stringBytes(const std::basic_string<Elem>& inString, bool bCountNull)
{
    return (inString.size() + (bCountNull ? 1 : 0)) * sizeof(Elem);
}

** For simplicity, ignores the traits and allocator types rarely specified explicitly for std::basic_string<> (they have defaults).
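If custom traits or allocators do matter, a variant that deduces all three template parameters could look like the sketch below. The name stringBytesAny is made up for this example; the arithmetic is the same as in the function template above.

```cpp
#include <cstddef>
#include <string>

// Same calculation, but accepts any std::basic_string instantiation,
// including ones with non-default traits or allocator types.
template <typename CharT, typename Traits, typename Alloc>
inline std::size_t stringBytesAny(const std::basic_string<CharT, Traits, Alloc>& inString,
                                  bool countNull)
{
    return (inString.size() + (countNull ? 1 : 0)) * sizeof(CharT);
}
```

Called with std::string("abc") and countNull set to false this yields 3; with std::wstring(L"abc") and countNull set to true it yields 4 * sizeof(wchar_t), which differs between platforms.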
