
Declaring 64-bit variables in C

I have a question.

uint64_t var = 1; // this is 000000...00001 right?

And in my code this works:

var ^ (1 << 43)

But how does it know 1 should be in 64 bits? Shouldn't I write this instead?

var ^ ( (uint64_t) 1 << 43 )

As you suspected, 1 is a plain signed int (which on your platform is probably 32 bits wide, in two's complement), and so is 43. If both operands are of type int, the operator rules dictate that the result is an int as well, so 1 << 43 overflows.

Still, in C signed integer overflow is undefined behavior, so in principle anything could happen. In your case, the compiler probably emitted code that performs the shift in a 64-bit register, so by luck it appears to work. To get a guaranteed-correct result, use the second form you wrote or, alternatively, specify 1 as an unsigned long long literal with the ull suffix (unsigned long long is guaranteed to be at least 64 bits wide).

var ^ ( 1ULL << 43 )
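To make that concrete, here is a minimal sketch of the safe form wrapped in a function. The name toggle_bit is made up for illustration and does not come from the question; the caller is assumed to pass a bit index below 64.

```c
#include <stdint.h>

/* Hypothetical helper: flip bit n of v. The caller must keep n below 64. */
uint64_t toggle_bit(uint64_t v, unsigned n)
{
    /* 1ULL is at least 64 bits wide, so this shift is defined for n < 64.
       Writing (uint64_t)1 << n instead would give the same value here. */
    return v ^ (1ULL << n);
}
```

Calling toggle_bit(var, 43) with var equal to 1 sets bit 43 and leaves bit 0 alone; calling it a second time restores the original value.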

I recommend OP's approach: cast the constant, ( (uint64_t) 1 << 43 ).

For OP's small example, the two forms below will likely perform the same.

uint64_t var = 1;
// OP's solution
var ^ ( (uint64_t) 1 << 43 )
// The answer suggested by others
var ^ ( 1ULL << 43 )

The results above have the same value but different types. The potential difference lies in how the two types, uint64_t and unsigned long long, are defined in C, and in what follows from that.

uint64_t has an exact range of 0 to 2^64 - 1.
unsigned long long has a range of 0 to at least 2^64 - 1.

If unsigned long long will always be 64 bits, as it seems to be on many machines these days, there is no issue. But let's look to the future and say this code runs on a machine where unsigned long long is 16 bytes (0 to at least 2^128 - 1).

A contrived example is below. The first result of ^ is a uint64_t; when multiplied by 3, the product is still a uint64_t, reduced modulo 2^64 should overflow occur, and the result is then assigned to d1. In the second case, the result of ^ is an unsigned long long; when multiplied by 3 on such a machine, the product may be bigger than 2^64 - 1, and that larger value is assigned to d2. So d1 and d2 can hold different answers.

double d1, d2;
d1 = 3*(var ^ ( (uint64_t) 1 << 43 ));
d2 = 3*(var ^ ( 1ULL << 43 ));

If one wants to work with uint64_t, be consistent. Do not assume uint64_t and unsigned long long are the same. If it is OK for your answer to be an unsigned long long, fine. But in my experience, once one starts using fixed-size types like uint64_t, one does not want variable-size types messing up the computations.
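The modulo 2^64 behavior of the d1 case can be sketched as follows. The function name scaled_toggle is hypothetical, and the sketch returns a uint64_t instead of assigning to a double so that the wrapped value is easy to inspect.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the d1 expression, kept entirely in uint64_t. */
uint64_t scaled_toggle(uint64_t var)
{
    /* The XOR yields a uint64_t; the return type converts the product back
       to uint64_t, so the result is always reduced modulo 2^64. */
    return 3 * (var ^ ((uint64_t)1 << 43));
}
```

With var equal to 1 nothing overflows. With var equal to UINT64_MAX the product exceeds 2^64 - 1 and wraps, which is the behavior a wider unsigned long long would not reproduce.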

var ^ ( 1ULL << 43 ) should do it.

A portable way to write a uint64_t constant is to use the UINT64_C macro (from stdint.h):

UINT64_C(1) << 43

Most likely UINT64_C(c) is defined to something like c ## ULL .

From the C standard:

The macro INTN_C(value) shall expand to an integer constant expression corresponding to the type int_leastN_t. The macro UINTN_C(value) shall expand to an integer constant expression corresponding to the type uint_leastN_t. For example, if uint_least64_t is a name for the type unsigned long long int, then UINT64_C(0x123) might expand to the integer constant 0x123ULL.
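A short sketch of the macro in use. The function name bit_mask_u64 is invented for this example, and the caller is assumed to pass n below 64.

```c
#include <stdint.h>

/* Hypothetical helper: a mask with only bit n set. Requires n < 64. */
uint64_t bit_mask_u64(unsigned n)
{
    /* UINT64_C(1) has a type at least 64 bits wide, so the shift
       never happens in a plain int. */
    return UINT64_C(1) << n;
}
```

The mask for the question's case is then bit_mask_u64(43), which can be combined with var using ^ as before.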

Your compiler doesn't know that the shift should be done in 64 bits. However, with this particular version of the compiler in this particular configuration for this particular code, two wrongs happen to make a right. Don't count on it.

Assuming that int is a 32-bit type on your platform (which is very likely), the two wrongs in 1 << 43 are:

  • If the shift amount is greater than or equal to the width of the type of the left operand, the behavior is undefined. This means that if x is of type int or unsigned int , then x << 43 has undefined behavior, as does x << 32 or any other x << n where n ≥ 32. For example 1u << 43 would have undefined behavior too.
  • If the left operand has a signed type, and the result of the operation overflows that type, then the behavior is undefined. For example 0x12345 << 16 has undefined behavior, because the type of the left operand is the signed type int but the result value doesn't fit in int . On the other hand, 0x12345u << 16 is well-defined and has the value 0x23450000u .
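Both rules can be sketched in code. The helper names below are hypothetical. The first guards the shift count so the undefined case is never reached; the second reproduces the 0x12345u << 16 example with a fixed 32-bit type, assuming int is no wider than 64 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: shift x left by n, refusing counts that would be
   undefined for a 64-bit operand. Returns false and leaves *out alone
   when n >= 64. */
bool checked_shl_u64(uint64_t x, unsigned n, uint64_t *out)
{
    if (n >= 64) {
        return false;
    }
    *out = x << n;   /* unsigned operand: bits shifted out are discarded */
    return true;
}

/* Hypothetical helper: 32-bit unsigned shift, n < 32. The cast truncates
   the result to 32 bits even if x was promoted to a wider int. */
uint32_t shl_u32_wrapping(uint32_t x, unsigned n)
{
    return (uint32_t)(x << n);
}
```

A caller would test the return value of checked_shl_u64 before reading the output, instead of relying on whatever the hardware shift instruction happens to do with an oversized count.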

“Undefined behavior” means that the compiler is free to generate code that crashes or returns a wrong result. It so happens that you got the desired result in this case. That is not forbidden, but Murphy's law dictates that one day the generated code won't do what you want.

To guarantee that the operation takes place on a 64-bit type, you need to ensure that the left operand is a 64-bit type; the type of the variable that you're assigning the result to doesn't matter. It's the same issue as float x = 1 / 2 resulting in x containing 0 and not 0.5: only the types of the operands determine the behavior of the arithmetic operator. Any of (uint64_t)1 << 43 or (long long)1 << 43 or (unsigned long long)1 << 43 or 1ll << 43 or 1ull << 43 will do. If you use a signed type, then the behavior is only defined if there is no overflow, so if you're expecting truncation on overflow, be sure to use an unsigned type. An unsigned type is generally recommended even if overflow isn't supposed to happen, because the behavior is reproducible. If you use a signed type, then the mere act of printing out values for debugging purposes could change the behavior (because compilers like to take advantage of undefined behavior to generate whatever code is most efficient on a micro level, which can be very sensitive to things like pressure on register allocation).
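The division analogy is easy to check with a small sketch; the function names are made up for illustration.

```c
/* Hypothetical helpers: the destination type is float in both cases,
   but only the operand types decide how the division is carried out. */
float half_int_division(void)
{
    return 1 / 2;      /* int / int: the quotient is 0, then converted to float */
}

float half_float_division(void)
{
    return 1.0f / 2;   /* float / int: 2 is converted to float, quotient is 0.5 */
}
```

The same reasoning applies to the shift: converting the result to uint64_t afterwards cannot repair a shift that was already performed on an int.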

Since you intend the result to be of type uint64_t , it is clearer to perform all computations with that type. Thus:

uint64_t var = 1;
… var ^ ((uint64_t)1 << 43) …
