
Space needed to store an unsigned integer into a char array and wide character array

I want to know how many bytes are needed to store an unsigned integer in a character array and in a wide-character array.

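#include <limits.h>
#include <stdio.h>
#include <wchar.h>

char ar[25];
wchar_t w_ar[25];
int size_int;   /* sprintf and swprintf return int, not size_t */

/* Both calls return the number of characters written, excluding the terminator. */
size_int = sprintf(ar, "%u", UINT_MAX);
printf("\n size_int: %d", size_int);

size_int = swprintf(w_ar, 25, L"%u", UINT_MAX);
printf("\n size_int: %d", size_int);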

In both cases I get 10 as the output, so I am going with 10. But UINT_MAX takes 4 bytes. Why the difference?

The 4 bytes (32 bits, binary digits) are the space required for the binary representation of the integer value.

10 decimal digits are required for the decimal representation. These can be represented as 10 printable characters, either using ASCII or 2-byte characters, or other encodings. So you get either one or half a decimal digit per byte.

The decimal digits could also be represented as 5 bytes of binary-coded decimal values on some systems, with two decimal digits per byte, but you don't see that much nowadays.

UINT_MAX tells you the largest value that an unsigned int may hold on your platform.

When you print it in decimal and then count the digits, that's not the same as the number of bytes required to encode that value in binary (which is what happens inside your computer).

However, you can do some arithmetic to find the maximum number of decimal digits needed to represent a value of this type, or, in C++, ask std::numeric_limits<unsigned int>::digits10 to do it for you.

Note that digits10 rounds down (it gives 9 for a 32-bit unsigned int), so it is a lower bound; use the equivalent maths directly and round up to get an upper limit. (Unfortunately, max_digits10 is defined to be 0 for the integer types.)

Don't forget that this is the number of digits, not the number of bytes needed for your string; for example, a UTF-16 string will need two bytes per digit.

Or just do it the way you did it.

You have to understand how radix works in mathematics when dealing with positional number representation.

The radix tells you by how much the weight of a digit changes when you move to the next position. In radix 10, there are 10 possible values per digit, so the value contributed by a given digit is digit * 10^(x - 1), where x is its position counted from 1 at the rightmost digit, and you sum all these contributions to get the total value of the represented number.

We usually use decimal representation, while computers use binary representation. This means that when we use 4 octets to represent a number, the maximum number is 2^32 - 1 for an unsigned value.

However, sprintf with the %u conversion specifier converts that 32-bit binary representation into a decimal one made of characters. Each character takes enough memory to store any character of the character set. Assuming ASCII, that's 128 different values, which need 7 bits and are stored in one byte (I used "octet" earlier because a byte is usually 8 bits on modern non-specialized hardware).

To wrap things up, UINT_MAX, assuming a 32-bit integer, is 4,294,967,295, which takes exactly 10 digits in decimal representation. With a signed integer, the maximum would be about half that value (2,147,483,647, still 10 digits), plus an additional minus character when the number is negative, thus taking up to 11 characters.

If you were using hexadecimal, you'd get two digits per byte of the value, so only 8 characters to represent UINT_MAX (0xFFFFFFFF; the 0x prefix is optional and only indicates that the number following is written in hexadecimal).
