
The max number of digits in an int based on number of bits

So, I needed a constant value to represent the max number of digits in an int, and it needed to be calculated at compile time to pass into the size of a char array.

To add some more detail: The compiler/machine I'm working with has a very limited subset of the C language, so none of the std libraries work as they have unsupported features. As such I cannot use INT_MIN/MAX as I can neither include them, nor are they defined.

I need a compile time expression that calculates the size. The formula I came up with is:

((sizeof(int) / 2) * 3 + sizeof(int)) + 2

Hand-calculating it for n-byte integers shows it is only marginally successful:

sizeof(int)  INT_MAX              characters  formula
2            32767                5           7
4            2147483647           10          12
8            9223372036854775807  19          22
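
For concreteness, the intent is to use such a constant to size a text buffer at compile time, along these lines (a sketch only; INT_CHAR_MAX and int_text are illustrative names, not part of the original code):

/* Sketch: size a char buffer for an int's decimal text at compile time,
   using the formula above. */
#define INT_CHAR_MAX (((sizeof(int) / 2) * 3 + sizeof(int)) + 2)

char int_text[INT_CHAR_MAX];  /* room for '-', the digits, and '\0' */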

You're looking for a result related to a logarithm of the maximum value of the integer type in question (which logarithm depends on the radix of the representation whose digits you want to count). You cannot compute exact logarithms at compile time, but you can write macros that estimate them closely enough for your purposes, or that compute a close enough upper bound for your purposes. For example, see How to compute log with the preprocessor.

It is also useful to know that you can convert between logarithms in different bases by multiplying by appropriate constants. In particular, if you know the base-a logarithm of a number and you want the base-b logarithm, you can compute it as

log_b(x) = log_a(x) / log_a(b)
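
For instance, log256(10) = log10(10) / log10(256) = 1 / log10(256), so its reciprocal is log10(256) = 8 * log10(2), about 2.40824, which is where the constant used further down comes from.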

Your case is a bit easier than the general one, though. For the dimension of an array that is not a variable-length array, you need an "integer constant expression". Furthermore, your result does not need more than two digits of precision (three if you wanted the number of binary digits) for any built-in integer type you'll find in a C implementation, and it seems like you need only a close enough upper bound.

Moreover, you get a head start from the sizeof operator, which can appear in integer constant expressions and which, when applied to an integer type, gives you an upper bound on the base-256 logarithm of values of that type (supposing that CHAR_BIT is 8). This estimate is very tight if every bit is a value bit, but signed integers have a sign bit, and they may have padding bits as well, so this bound is a bit loose for them.
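
For example, when sizeof(int) is 4, every int value fits in 4 base-256 digits, so log256(INT_MAX) < 4; with 31 value bits the true value is just under 31/8 = 3.875.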

If you want a bound on the number of digits in a power-of-two radix, then you can use sizeof pretty directly. Let's suppose, though, that you're looking for the number of decimal digits. Mathematically, the maximum number of digits in the decimal representation of an int is

N = ceil(log10(INT_MAX))

or

N = floor(log10(INT_MAX)) + 1

provided that INT_MAX is not a power of 10. Let's express that in terms of the base-256 logarithm:

N = floor(log256(INT_MAX) / log256(10)) + 1

Now, log256(10) cannot be part of an integer constant expression, but it or its reciprocal can be pre-computed: 1 / log256(10) = 2.40824 (to a pretty good approximation; the actual value is slightly less). Now, let's use that to rewrite our expression:

N <= floor( sizeof(int) * 2.40824 ) + 1

That's not yet an integer constant expression, but it's close. This expression is an integer constant expression, and a good enough approximation to serve your purpose:

N = 241 * sizeof(int) / 100 + 1
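
Working that through for a 4-byte int: floor(4 * 2.40824) + 1 = 9 + 1 = 10, and the integer-only version gives 241 * 4 / 100 + 1 = 9 + 1 = 10 as well, matching the 10 digits of INT_MAX = 2147483647.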

Here are the results for various integer sizes:

sizeof(int)          INT_MAX   True N  Computed N
    1                    127      3        3
    2                  32767      5        5
    4             2147483647     10       10
    8       ~9.223372037e+18     19       20

(The values in the INT_MAX and True N columns suppose one of the allowed forms of signed representation, and no padding bits; the former and maybe both will be smaller if the representation contains padding bits.)

I presume that in the unlikely event that you encounter a system with 8-byte ints, the extra byte you provide for your digit array will not break you. The discrepancy arises from the difference between having (at most) 63 value bits in a signed 64-bit integer, and the formula accounting for 64 value bits in that case, with the result that sizeof(int) is a bit too much of an overestimate of the base-256 log of INT_MAX. The formula gives exact results for unsigned int up to at least size 8, provided there are no padding bits.

As a macro, then:

// Expands to an integer constant expression evaluating to a close upper bound
// on the number of decimal digits in a value expressible in the integer type
// given by the argument (if it is a type name) or the integer type of the
// argument (if it is an expression). The meaning of the resulting expression
// is unspecified for other arguments.
#define DECIMAL_DIGITS_BOUND(t) (241 * sizeof(t) / 100 + 1)
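
As a sketch of how the macro might be used (the buffer name is illustrative, and the stdio calls are only for demonstration on a hosted system, not the asker's freestanding target):

#include <stdio.h>

#define DECIMAL_DIGITS_BOUND(t) (241 * sizeof(t) / 100 + 1)

/* +2 leaves room for a possible '-' sign and the terminating '\0'.
   The dimension is an integer constant expression, so this is an
   ordinary (non-VLA) array even at file scope. */
char int_text_buf[DECIMAL_DIGITS_BOUND(int) + 2];

int main(void) {
    sprintf(int_text_buf, "%d", -1234567890);  /* fits for any int value */
    puts(int_text_buf);
    return 0;
}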

An upper bound on the number of decimal digits an int may produce depends on INT_MIN.

// Mathematically
max_digits = ceil(log10(-INT_MIN))

It is easier to use the bit width of the int, as that approximates the base-2 log of -INT_MIN. sizeof(int)*CHAR_BIT - 1 is the max number of value bits in an int.

// Mathematically 
max_digits = ceil((sizeof(int)*CHAR_BIT - 1)* log10(2))
// log10(2) --> ~ 0.30103

On rare machines, int has padding, so the above will overestimate.

For log10(2), which is about 0.30103, we could use 1/3 or one-third, a slight over-estimate that errs on the safe side.

As a macro, perform integer math and add 1 for the ceiling

#include <limits.h>
#define INT_DIGIT10_WIDTH ((sizeof(int)*CHAR_BIT - 1)/3 + 1)
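
For a 32-bit int, for example, this gives (32 - 1)/3 + 1 = 10 + 1 = 11, one more than the true maximum of 10 digits, because 1/3 slightly over-estimates log10(2).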

To account for a sign and the null character, add 2. The following uses a very tight log10(2) fraction so as not to over-allocate the buffer:

#define INT_STRING_SIZE ((sizeof(int)*CHAR_BIT - 1)*28/93 + 3)

Note 28/93 = 0.3010752... > log10(2)
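
For a 32-bit int this gives (32 - 1)*28/93 + 3 = 9 + 3 = 12 characters, exactly enough for "-2147483648" plus the null terminator.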


The number of characters needed for any base down to base 2 follows below. It is interesting that +2 is needed and not +1: consider that a 2-bit signed number in base 2 could be "-10", which needs a size of 4 including the null character.

#define INT_STRING2_SIZE (sizeof(int)*CHAR_BIT + 2)
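
A brief sketch of these three macros sizing buffers (the array names are illustrative; CHAR_BIT comes from <limits.h>):

#include <limits.h>

#define INT_DIGIT10_WIDTH ((sizeof(int)*CHAR_BIT - 1)/3 + 1)
#define INT_STRING_SIZE   ((sizeof(int)*CHAR_BIT - 1)*28/93 + 3)
#define INT_STRING2_SIZE  (sizeof(int)*CHAR_BIT + 2)

char decimal_digits[INT_DIGIT10_WIDTH];  /* decimal digits only, no sign or '\0' */
char decimal_text[INT_STRING_SIZE];      /* sign + decimal digits + '\0' */
char any_base_text[INT_STRING2_SIZE];    /* worst case: sign + digits in any base down to 2 + '\0' */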

Boringly, I think you need to hardcode this, centred around inspecting sizeof(int) and consulting your compiler documentation to see what kind of int you actually have. (All the C standard specifies is that it can't be smaller than a short, that it must have a range of at least -32767 to +32767, and that 1's complement, 2's complement, or signed magnitude representation may be used. The manner of storage is arbitrary, although big and little endianness are common.) Note that an arbitrary number of padding bits is allowed, so you can't, in full generality, impute the number of decimal digits from the sizeof.

C doesn't support the level of compile time evaluable constant expressions you'd need for this.

So hardcode it and make your code intentionally brittle so that compilation fails if a compiler encounters a case that you have not thought of.
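
A pre-C11 sketch of that kind of intentional brittleness, assuming a documented 4-byte int (the names and the hardcoded value are illustrative; _Static_assert would be the C11 spelling if the compiler supports it):

/* Hardcoded for the target's documented 4-byte, two's complement int:
   up to 10 digits, plus sign and '\0'. */
#define INT_TEXT_SIZE 12

/* If the assumption ever becomes wrong, the array type below gets a
   negative size and compilation fails, forcing a review of the value. */
typedef char assert_int_is_4_bytes[(sizeof(int) == 4) ? 1 : -1];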

You could solve this in C++ using constexpr and metaprogramming techniques.

((sizeof(int) / 2) * 3 + sizeof(int)) + 2

is the formula I came up with. The +2 is for the negative sign and the null terminator.

If we suppose that integral values are either 2, 4, or 8 bytes, and if we determine the respective digit counts to be 5, 10, 20, then an integer constant expression yielding the exact values could be written as follows:

enum { digits = (sizeof(int)==8) ? 20 : ((sizeof(int)==4) ? 10 : 5) };
int testArray[digits];

I hope that I did not miss something essential. I've tested this at file scope.
