
How many bytes does a string take? A char?

I'm reviewing my first-semester C++ class, and I think I'm missing something. How many bytes does a string take up? A char?

The examples we were given are, some being character literals and some being strings:

'n', "n", '\n', "\n", "\\n", ""

I'm particularly confused by the usage of newlines in there.

#include <iostream>
 
int main()
{
    std::cout << sizeof 'n'   << std::endl;   // 1
    std::cout << sizeof "n"   << std::endl;   // 2
    std::cout << sizeof '\n'  << std::endl;   // 1
    std::cout << sizeof "\n"  << std::endl;   // 2
    std::cout << sizeof "\\n" << std::endl;   // 3
    std::cout << sizeof ""    << std::endl;   // 1
}
  • Single quotes indicate characters.
  • Double quotes indicate C-style strings with an invisible NUL terminator.

\n (line break) is only a single char and so is \\ (backslash). \\n is just a backslash followed by n .

  • 'n' : is not a string, is a literal char, one byte, the character code for the letter n.
  • "n" : string, two bytes, one for n and one for the null character every string has at the end.
  • "\n" : two bytes, as \n stands for "new line", which takes one byte, plus one byte for the null char.
  • '\n' : same as the first, literal char, not a string, one byte.
  • "\\n" : three bytes: one for \, one for n, and one for the null character.
  • "" : one byte, just the null character.
  • A char , by definition, takes up one byte.
  • Literals using ' are char literals; literals using " are string literals.
  • A string literal is implicitly null-terminated, so it will take up one more byte than the observable number of characters in the literal.
  • \ is the escape character and \n is a newline character.

Put these together and you should be able to figure it out.

The following will take x consecutive chars in memory:

'n' - 1 char (type char)
"n" - 2 chars (above plus zero character) (type const char[2])
'\n' - 1 char
"\n" - 2 chars
"\\n" - 3 chars ('\', 'n', and zero)
"" - 1 char

edit: formatting fixed

edit2: I've written something very stupid, thanks Mooing Duck for pointing that out.

The number of bytes a string takes up is equal to the number of characters in the string plus 1 (the terminator), times the number of bytes per character. The number of bytes per character can vary. It is 1 byte for a regular char type.

All your examples are one character long except for the second to last, which is two, and the last, which is zero. (Some are of type char and only define a single character.)

'n'   - 0x6e
"n"   - 0x6e00
'\n'  - 0x0a
"\n"  - 0x0a00
"\\n" - 0x5c6e00
""    - 0x00

You appear to be referring to string constants, and distinguishing them from character constants.

A char is one byte on all architectures. A character constant uses the single quote delimiter ' .

A string is a contiguous sequence of characters with a trailing NUL character to identify the end of string. A string uses double quote characters '"'.

Also, you introduce the C string constant escape syntax, which uses backslashes to indicate special characters. \n is one character in a string constant.

So for the examples 'n', "n", '\n', "\n" :
'n' is one character
"n" is a string with one character, but it takes two characters of storage (one for the letter n and one for the NUL)
'\n' is one character, the newline (ctrl-J on ASCII based systems)
"\n" is one character plus a NUL.

I leave the others to puzzle out based on those.

'n' -> One char. A char is always 1 byte. This is not a string.
"n" -> A string literal, containing one n and one terminating NUL char. So 2 bytes.
'\n' -> One char. A char is always 1 byte. This is not a string.
"\n" -> A string literal, containing one \n and one terminating NUL char. So 2 bytes.
"\\n" -> A string literal, containing one \, one n, and one terminating NUL char. So 3 bytes.
"" -> A string literal, containing one terminating NUL char. So 1 byte.

This may be 10 years too late, but if you write just "Hello", it's just an array of chars, so the number of bytes it takes up is the number of characters in the array (5 in this case) + 1 (one NUL character), which is 6 here. So the rule for C strings (char arrays) is: number of characters + 1.

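There is also the C++ string, which you can use by writing #include <string> and then std::string s = "Your text here";

The std::string object itself always has the same size, whatever text it holds (28 bytes on my machine). That value is implementation-specific, and the text of a long string is typically stored in separately allocated memory rather than inside the object.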

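It depends on the encoding: a UTF-8 code unit is 1 byte and a UTF-16 code unit is 2 bytes (in C++ that means char versus char16_t; a char itself is always 1 byte). The bit pattern doesn't matter: whether the value is 00000001 or 10000000, the full width is reserved for the character once it is declared, and if the character changes, that storage is simply updated with the new value.

A string's size in bytes is the number of characters between the "" times the bytes per character, plus one more character for the terminator.

example: 11111111 is a filled byte, UTF8 char T = 01010100 (1 byte)

UTF16 char T = 01010100 00000000 (2 bytes, low byte first)

UTF8 string "coding" = 01100011 01101111 01100100 01101001 01101110 01100111 (6 bytes, 7 with the terminator)

UTF16 string "coding" = 01100011 00000000 01101111 00000000 01100100 00000000 01101001 00000000 01101110 00000000 01100111 00000000 (12 bytes, 14 with the terminator)

UTF8 \n as typed in the source = 01011100 01101110 (2 bytes, a backslash and an n); the compiled newline character it stands for is a single byte, 00001010

UTF16 \n as typed in the source = 01011100 00000000 01101110 00000000 (4 bytes); the compiled newline character is 00001010 00000000 (2 bytes)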

Note: every space and every character in a string takes up 1-2 bytes, but memory is so plentiful that unless you are writing code for a computer or game console from the early 90s with 4 MB or less, you shouldn't worry about bytes with regard to strings or chars.

What tends to be costly instead is heavy computation with floats, decimals, or doubles, and calling a random-number function in a loop or in update methods. Those are better run once at runtime, or on a fixed time update and averaged over the time span.


 