
The difference between size of datatype and sizeof(data type)

I was learning C++ and came across the following question. I'm just a beginner and I got confused. Isn't the sizeof() function supposed to return the size of the data type? How can a data object have a size different from its sizeof()? I don't understand the explanation of the answer.

Suppose in a hypothetical machine, the size of char is 32 bits. What would sizeof(char) return?

a) 4

b) 1

c) Implementation dependent

d) Machine dependent

Answer: b

Explanation: The standard does NOT require a char to be 8 bits, but does require that sizeof(char) return 1.

The sizeof operator yields the size of a type in bytes, where a byte is defined to be the size of a char. So sizeof(char) is always 1 by definition, regardless of how many bits char has on a given platform.

This applies to both C and C++.
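As a minimal sketch of that guarantee in C++11 or later, the checks below should compile on any conforming implementation, whatever the width of char. The helper name char_is_one_byte is hypothetical and exists only for illustration.

```cpp
#include <climits>  // CHAR_BIT

// These hold on every conforming implementation, whatever the width of char.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(signed char) == 1, "same for signed char");
static_assert(sizeof(unsigned char) == 1, "same for unsigned char");

// sizeof counts in units of char, so an array of 4 chars has size 4.
static_assert(sizeof(char[4]) == 4, "4 chars occupy 4 bytes");

// The number of bits per byte is at least 8, but it is allowed to be larger.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

// Hypothetical helper so the same guarantee can be checked at run time.
inline bool char_is_one_byte() { return sizeof(char) == 1; }
```

On the hypothetical machine from the question, CHAR_BIT would be 32 and every one of these checks would still pass.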


From the C11 standard, 6.5.3.4

  1. The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. ...

Then,

  1. When sizeof is applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1.

From the C++11 standard, 5.3.3

  1. The sizeof operator yields the number of bytes in the object representation of its operand. The operand is either an expression, which is an unevaluated operand (Clause 5), or a parenthesized type-id. ... sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1.

(emphasis mine)

Per 5.3.3 [expr.sizeof]

The sizeof operator yields the number of bytes in the object representation of its operand. The operand is either an expression, which is an unevaluated operand (Clause 5), or a parenthesized type-id. The sizeof operator shall not be applied to an expression that has function or incomplete type, to an enumeration type whose underlying type is not fixed before all its enumerators have been declared, to the parenthesized name of such types, or to a glvalue that designates a bit-field. sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1. [...]

emphasis mine

So no matter how many bits a char takes up, its size is always 1.

You're just confusing bytes with octets.

A byte is the size of one character. That is why sizeof(char) == 1 is always true: sizeof returns the size in bytes.

An octet, by contrast, consists of exactly 8 bits.

On almost all modern platforms, the size of a byte happens to be the same as that of an octet. That's why mixing them up is such a common error; even book authors and professors do it.

sizeof(x) returns the size of x, expressed in units of the size of a char.

There are no machines where sizeof(char) is 4. It's always 1 byte. That byte might contain 32 bits, but as far as the C compiler is concerned, it's one byte.

The correct name for "8 bits" is octet. The C Standard uses the word "byte" for an object that is the size of a char. Others may use the word "byte" in different ways, often when they mean "octet", but in C (and C++, or Objective-C) it means "object the size of a char". A char may be more than 8 bits, or more than one octet, but it's always one byte.
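To make the byte/bit distinction concrete, here is a small sketch that converts a size in bytes into a size in bits by multiplying by CHAR_BIT from <climits>. The template name bits_in is an assumption made for this example, not something from the standard library.

```cpp
#include <climits>   // CHAR_BIT: number of bits in one byte (one char)
#include <cstddef>   // std::size_t

// Hypothetical helper: size of T in bits rather than in bytes.
template <typename T>
constexpr std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}

// One char is one byte, and one byte is CHAR_BIT bits, on every platform.
static_assert(bits_in<char>() == CHAR_BIT, "a char is exactly one byte wide");
```

On the machine from the question, bits_in<char>() would be 32 while sizeof(char) would still be 1. On a typical platform with 8-bit bytes, bits_in<char>() is 8.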

The question should have been: suppose that on a hypothetical machine the word size (the size of the registers) is 32 bits. What would sizeof(char) return?

And the answer would be 1 (one byte).

In computing, word is a term for the natural unit of data used by a particular processor design. A word is a fixed-sized piece of data handled as a unit by the instruction set or the hardware of the processor. The number of bits in a word (the word size, word width, or word length) is an important characteristic of any specific processor design or computer architecture. -- https://en.wikipedia.org/wiki/Word_%28computer_architecture%29

In your case the word size would be 32 bits. Also:

Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable unit of memory in many computer architectures. -- https://en.wikipedia.org/wiki/Byte

A byte is the smallest addressable unit of memory. It can be 8 bits, 9 bits, 16 bits, or whatever the hardware specification chooses, although C and C++ require it to be at least 8 bits.

As for sizeof, it first determines the type of its operand and then yields the size of that type in bytes. So the following two C++ statements produce the same result.

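  #include <iostream>
  int main() {
      int n;                             // never read: sizeof does not evaluate its operand
      std::cout << sizeof(int) << '\n';  // size of the type int
      std::cout << sizeof(n) << '\n';    // size of the type of n: the same value
  }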
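The standard text quoted earlier calls the operand of sizeof an "unevaluated operand". The sketch below illustrates what that means: only the type of the expression is inspected, so sizeof can even be applied to a call of a function that is declared but never defined. The names never_defined and type_and_object_agree are hypothetical and used only for this example.

```cpp
// Declared but deliberately never defined. Really calling it would fail to
// link, but sizeof only looks at the type of the call expression (int).
int never_defined();

static_assert(sizeof(never_defined()) == sizeof(int),
              "only the type of the operand matters");

// Hypothetical helper: sizeof gives the same value for a type and for an
// object of that type. Parentheses are required around a type name but are
// optional around an expression.
inline bool type_and_object_agree() {
    int n = 0;
    return sizeof(int) == sizeof(n) && sizeof n == sizeof(int);
}
```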
