
Best way to assign 32 bit value to 64 bit variable and guarantee top 32 bits are 0 in C

I have some C code that has variables which are either ints, or cast to int for a period of time for easy use (what we care about is the bit value). Int will always be 32 bit in this case. At one point some of them are assigned to a 64 bit variable, some implicitly and some explicitly:

long long 64bitfoo = 32bitbar;
long long 64bitfoo = (long long)32bitbar;

This has not been a problem in the past, but recently I ran into a case where after this conversion the top 32 bits of the 64 bit variable are not 0. It seems that some specific version of events can more or less populate the top bits with garbage (or just choose a previously used memory location and not clear it out correctly). This won't do, so I'm looking at solutions.

I can obviously do something like this:

long long 64bitfoo = 32bitbar;
64bitfoo &= ~0xFFFFFFFF00000000;

to clear out the top bits, and this should work for what I need, but I feel like there are better options. So far this has only shown up on values that use the implicit casting, so I'm curious whether there is a difference between implicit and explicit casting that would allow explicit casting to handle this itself. (Unfortunately I currently can't just add the explicit casting and do a test; the conditions to trigger this are complex and not easily replicated, so code changes need to be pretty firm and not guesses.)

I'm sure there might be other options as well, doing something instead of just using = to set the value, a different way to clear the top 32 bits that is better, or some way of setting the initial 64 bit value to guarantee the top bits stay clear if only the bottom bits are set (the 64 bit variable sometimes gets other 64 bit variables assigned to it, so it can't have the top bits forced to 0 at all times). Wasn't finding a lot when searching, this doesn't seem to be something that comes up much.

edit: I forgot to mention that there are instances where it being signed doesn't seem like the problem. One example is the initial value was 0xF8452370, then the long long value was shown as -558965697074093200, which is 0xF83E27A8F8452370. So the bottom 32 bits are the same, but the top 32 bits are not just 1's, but a scattering of 1's and 0's. As far as I understand, there's no reason signed vs unsigned would do this (all 1's sure), but I could definitely be mistaken.

Also, the 64 bit variable I think needs to be signed, as at other instances it takes in values that need to be either negative or positive (actual integers) vs in these instances where it just needs to keep track of the bit values. It is a very multi-use variable and I do not have the ability to make it not multi-use.

edit2: It's very possible I am asking the wrong question here, and I am trying to keep an eye on that. But I am working within restrictions, so the actual problem might be something else, and I might just be stuck adding a bandaid for now. The quick rundown is this:

  1. There is a 64 bit variable that is long long (or __int64 on certain systems, but in the instances I am running into it should always be long long). I cannot change what it is, or make it unsigned.

  2. I have a function returning a 32 bit memory address. I need to assign that 32 bit memory address (not as a pointer, but as the actual value of the memory location) to this 64 bit variable.

  3. In these cases I need the top 32 bits of the 64 bit variable to be 0, and the bottom 32 bits to be the same as the original value. Sometimes they are not 0, but they aren't always 1.

  4. Because I can't change the 64 bit variable to unsigned I think my best option, with what I have, is to manually clear the top 32 bits, and am looking for the best way to do that.

You're running into sign extension -- casting a negative signed value to a larger type will "extend" the sign bit of the original value to all the upper bits of the new type, so that the numeric value is preserved. For instance, (int8_t) 0xFC = -4 converts to (int16_t) 0xFFFC = -4. The extra bits aren't "garbage"; they have a very specific purpose and meaning.
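To make the rule concrete, here is a minimal sketch. The function names are made up for illustration, and it uses the <stdint.h> fixed-width types instead of int and long long. It suggests that an implicit conversion and an explicit cast to the same signed type behave identically, so adding (long long) by itself should not change the upper bits.

```c
#include <stdint.h>

/* Implicit conversion: the numeric value is preserved, so a negative
   input ends up with ones in the upper 32 bits (sign extension). */
int64_t widen_implicit(int32_t v)
{
    int64_t wide = v;
    return wide;
}

/* Explicit cast to the same signed type: the same conversion, same result. */
int64_t widen_explicit(int32_t v)
{
    return (int64_t)v;
}

/* View the 64-bit result as raw bits for inspection. */
uint64_t bits_of(int64_t v)
{
    return (uint64_t)v;
}
```

With these definitions, bits_of(widen_implicit(-4)) and bits_of(widen_explicit(-4)) should both be 0xFFFFFFFFFFFFFFFC, while a non-negative input such as 19 keeps zeros in the upper half.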

If you want to avoid this, cast through an unsigned type. For example:

long long sixtyfourbits = (unsigned int) thirtytwobits;

As a side point, I'd advise that you use the <stdint.h> integer types throughout your code if you care about their size -- for instance, use int64_t instead of long long, and uint32_t instead of unsigned int. The names will more clearly indicate your intent, and there are some platforms which use different sizes for standard C types. (For instance, AVR microcontrollers use a 16-bit int.)
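Put together with the fixed-width types, the cast through an unsigned type might look like the sketch below. The helper name widen_zero_extended is hypothetical, and the code assumes the source value is held in an int32_t.

```c
#include <stdint.h>

/*
 * Hypothetical helper: place the 32-bit pattern of v in the low half of a
 * signed 64-bit value and leave the upper 32 bits zero.
 */
int64_t widen_zero_extended(int32_t v)
{
    /* int32_t -> uint32_t is defined modulo 2^32, which keeps the bit
       pattern; uint32_t -> int64_t always fits, so the value (and with it
       the zero upper half) is preserved. */
    return (int64_t)(uint32_t)v;
}
```

For example, an int32_t carrying the bit pattern 0xF8452370 from the question has the value -129686672, and widen_zero_extended(-129686672) should return 4165280624, which is 0x00000000F8452370.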

"what we care about is the bit value"

Then you should stay away from signed types and always use unsigned.

When a signed (or unsigned) type is converted to a bigger size of the same type, the value is preserved, i.e. 19 becomes 19 and -19 becomes -19.

But signed types don't always preserve the binary pattern by adding zeros in front when going from a smaller type to a bigger type, whereas unsigned types do.

For 2's complement (the most common representation of signed types), all negative values will be sign-extended, which simply means that ones are added in front instead of zeros:

SIGNED:
8 bit:  -3 -> FD
16 bit: -3 -> FFFD
32 bit: -3 -> FFFFFFFD
64 bit: -3 -> FFFFFFFFFFFFFFFD

UNSIGNED:
8 bit:  253 -> FD
16 bit: 253 -> 00FD
32 bit: 253 -> 000000FD
64 bit: 253 -> 00000000000000FD
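The two tables can be reproduced with a few lines of C. This is only a sketch: signed_patterns and unsigned_patterns are hypothetical helper names, and each one formats the 8, 16, 32 and 64 bit patterns of a single starting value into a caller-supplied buffer.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Widen a signed 8-bit value step by step and format each bit pattern.
   Returns the number of characters snprintf wanted to write. */
int signed_patterns(int8_t v, char *out, size_t n)
{
    int16_t w16 = v;   /* value preserved, so negatives are sign-extended */
    int32_t w32 = v;
    int64_t w64 = v;
    return snprintf(out, n,
                    "%02" PRIX8 " %04" PRIX16 " %08" PRIX32 " %016" PRIX64,
                    (uint8_t)v, (uint16_t)w16, (uint32_t)w32, (uint64_t)w64);
}

/* Same for an unsigned 8-bit value: the new upper bits are always zero. */
int unsigned_patterns(uint8_t v, char *out, size_t n)
{
    uint16_t w16 = v;
    uint32_t w32 = v;
    uint64_t w64 = v;
    return snprintf(out, n,
                    "%02" PRIX8 " %04" PRIX16 " %08" PRIX32 " %016" PRIX64,
                    v, w16, w32, w64);
}
```

Given a buffer of at least 34 bytes, signed_patterns(-3, buf, sizeof buf) should produce "FD FFFD FFFFFFFD FFFFFFFFFFFFFFFD" and unsigned_patterns(253, buf, sizeof buf) should produce "FD 00FD 000000FD 00000000000000FD".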

"It seems that some specific version of events can more or less populate the top bits with garbage"

No, either the new extra bits will be all zeros or they will be all ones.

If that isn't the case, your system doesn't comply with the C standard.
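One way to apply this to a suspicious value is a small check like the sketch below. The function name could_be_widened_32 is hypothetical; it tests whether a 64-bit pattern is consistent with widening a 32-bit integer, meaning the upper half is all zeros, or all ones with bit 31 of the lower half set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: is this 64-bit pattern something a plain
   32-bit to 64-bit integer conversion could have produced? */
bool could_be_widened_32(uint64_t bits)
{
    uint32_t upper = (uint32_t)(bits >> 32);
    uint32_t lower = (uint32_t)bits;

    /* Unsigned source, or non-negative signed source: upper half is zero. */
    if (upper == 0)
        return true;

    /* Negative signed source: upper half is all ones and bit 31 is set. */
    return upper == UINT32_C(0xFFFFFFFF)
        && (lower & UINT32_C(0x80000000)) != 0;
}
```

The value from the question, 0xF83E27A8F8452370, fails this check, which hints that its upper bits came from somewhere other than the int to long long conversion itself.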

A union is the natural way to do this in C.

typedef union {
    struct {
        int low;    // low 32 bits (assumes a little-endian layout)
        int high;   // high 32 bits
    } e;            // 32-bit halves, like eax on an x86 CPU
    __int64 r;      // full 64 bits, like rax on an x86 CPU
} Union64;
Union64 data;
data.r = 0x1122334455667788;

Then data.e.high will be 0x11223344 and data.e.low will be 0x55667788

Vice versa:

data.e.high = 0xaabbccdd;
data.e.low  = 0x99eeff00;

Then data.r will be 0xaabbccdd99eeff00

In your case

data.r = 0; // guarantee data.e.high is cleared
data.e.low = 32bitbar; // say 0x11223344

Then data.r will be 0x0000000011223344;

This is exactly what union is for.
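Note that the union result depends on byte order: the member called low only overlaps the low half of the 64-bit value on a little-endian machine. The sketch below is one possible variant using <stdint.h> types, with hypothetical function names, that checks the layout at run time and puts the union next to a plain cast, which gives the same answer without depending on layout.

```c
#include <stdbool.h>
#include <stdint.h>

typedef union {
    struct {
        uint32_t low;   /* overlaps the low half only on little-endian */
        uint32_t high;
    } e;
    int64_t r;
} Union64;

/* True when writing e.low really changes the low 32 bits of r. */
bool low_member_is_low_half(void)
{
    Union64 data;
    data.r = 1;
    return data.e.low == 1;
}

/* Union approach: clear everything, then store the 32-bit value. */
int64_t widen_via_union(uint32_t v)
{
    Union64 data;
    data.r = 0;       /* guarantees the other half is cleared */
    data.e.low = v;
    return data.r;
}

/* Layout-independent alternative: an unsigned source is zero-extended. */
int64_t widen_via_cast(uint32_t v)
{
    return (int64_t)v;
}
```

On a little-endian machine widen_via_union(0x11223344) and widen_via_cast(0x11223344) should both give 0x0000000011223344. On a big-endian machine the union version would place the value in the upper half instead.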


 