
Sum signed 32-bit int with unsigned 64-bit int

In my application, I receive two signed 32-bit ints and have to store them. I need to create a sort of counter; I don't know when it will be reset, but I'll receive large values frequently. Because of that, I decided to store the totals in two unsigned 64-bit ints.

The following could be a simple version of the counter.

struct Counter
{
    unsigned int elementNr;
    unsigned __int64 totalLen1;
    unsigned __int64 totalLen2;
    
    void UpdateCounter(int len1, int len2)
    {
        if(len1 > 0 && len2 > 0)
        {
            ++elementNr;
            totalLen1 += len1;
            totalLen2 += len2;
        }
    }
};

I know that converting a smaller type to a bigger one (e.g. int to long) should cause no issues. However, going from a 32-bit representation to a 64-bit representation and from signed to unsigned at the same time is new to me.

Reading around, I understood that len1 should be expanded from 32 bits to 64 bits with sign extension applied. Because unsigned int and signed int have the same rank ( Section 4.13 ), the latter should be converted.

If len1 stores a negative value, converting from signed to unsigned will produce a wrong value, which is why I check for positivity at the beginning of the function. For positive values, however, I think there should be no issues.

For clarity I could rewrite UpdateCounter(int len1, int len2) like this:

    void UpdateCounter(int len1, int len2)
    {
        if(len1 > 0 && len2 > 0)
        {
            ++elementNr;

            __int64 tmp = len1;
            totalLen1 += static_cast<unsigned __int64>(tmp);


            tmp = len2;
            totalLen2 += static_cast<unsigned __int64>(tmp);
        }
    }

Might there be some side effects that I have not considered? Is there a better and safer way to do this?

A little background, just for reference: binary operators such as arithmetic addition work on operands of the same type (the specific CPU instruction they are translated to depends on the number representation, which must be the same for both operands). When you write something like this (using fixed-width integer types to be explicit):

int32_t a = <some value>;
uint64_t sum = 0;
sum += a;

As you already know, this involves an implicit conversion, more specifically an integral conversion applied as part of the usual arithmetic conversions, according to integer conversion rank. So the expression sum += a; is equivalent to sum += static_cast<uint64_t>(a); : a is converted because it has the lesser rank. Let's see what happens in this example:

int32_t a = 60;
uint64_t sum = 100;
sum += static_cast<uint64_t>(a);
std::cout << "a=" << static_cast<uint64_t>(a) << "  sum=" << sum << '\n';

The output is:

a=60  sum=160

So all is OK, as expected. Let's see what happens when adding a negative number:

int32_t a = -60;
uint64_t sum = 100;
sum += static_cast<uint64_t>(a);
std::cout << "a=" << static_cast<uint64_t>(a) << "  sum=" << sum << '\n';

The output is:

a=18446744073709551556  sum=40

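The result is 40 as expected: converting a negative value to an unsigned type is defined modulo 2^64 (which on two's complement machines amounts to sign extension), and unsigned arithmetic wraps modulo 2^64 as well (note: unsigned integer overflow is not undefined behaviour). So all is OK, as long as you ensure that the mathematical sum never goes below zero; if it does, the stored value wraps around to a huge positive number.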
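To make the wrap-around caveat concrete, here is a minimal sketch. The helper name add_signed is hypothetical (it does not appear in the question); it only spells out what sum += a already does implicitly, and the static_assert assumes a mainstream platform where int is no wider than 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: adds a signed 32-bit value to an unsigned 64-bit total
// the same way `sum += a` does, with the conversion written out explicitly.
inline std::uint64_t add_signed(std::uint64_t sum, std::int32_t a)
{
    // a is converted to uint64_t modulo 2^64; the addition then wraps modulo 2^64.
    return sum + static_cast<std::uint64_t>(a);
}

// The usual arithmetic conversions make the mixed expression unsigned 64-bit.
static_assert(std::is_same<decltype(std::uint64_t{} + std::int32_t{}),
                           std::uint64_t>::value,
              "uint64_t + int32_t is evaluated as uint64_t");
```

With this helper, add_signed(100, -60) gives 40, while add_signed(100, -160) does not give -60 but 18446744073709551556 (that is, 2^64 - 60).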

Coming back to your question: you won't have any surprises if you always add positive numbers, or at least ensure that the sum never goes below zero... until you reach the maximum representable value std::numeric_limits<uint64_t>::max() (2^64-1 = 18446744073709551615 ~ 1.8E19). If you keep adding numbers indefinitely, sooner or later you'll reach that limit (this also applies to your counter elementNr ). You would overflow the 64-bit unsigned integer by adding 2^31-1 (2147483647) every millisecond for approximately three months, so in this case it may be advisable to check:

#include <limits>
//...
void UpdateCounter(const int32_t len1, const int32_t len2)
{
    if( len1>0 )
    {
        if( static_cast<decltype(totalLen1)>(len1) <= std::numeric_limits<decltype(totalLen1)>::max()-totalLen1 )
        {
            totalLen1 += len1;
        }
        else
        {// Would overflow!!
            // Do something
        }
    }
}
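Putting the pieces together, one possible shape for the whole counter is sketched below. It is only an illustration under a few assumptions of mine: fixed-width types replace unsigned __int64, the bool return value and the helper Fits are hypothetical, and an update is rejected as a whole if any member would overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Sketch of the counter with overflow checks on every member.
struct Counter
{
    std::uint32_t elementNr = 0;
    std::uint64_t totalLen1 = 0;
    std::uint64_t totalLen2 = 0;

    // Hypothetical helper: true if `len` fits into the headroom left in `total`.
    // Callers pass only positive values, so the cast preserves the value.
    static bool Fits(std::uint64_t total, std::int32_t len)
    {
        return static_cast<std::uint64_t>(len)
               <= std::numeric_limits<std::uint64_t>::max() - total;
    }

    // Returns false and leaves the counter untouched if an input is not
    // positive or if any member would overflow.
    bool UpdateCounter(std::int32_t len1, std::int32_t len2)
    {
        if (len1 <= 0 || len2 <= 0)
            return false;
        if (elementNr == std::numeric_limits<std::uint32_t>::max()
            || !Fits(totalLen1, len1) || !Fits(totalLen2, len2))
            return false;

        ++elementNr;
        totalLen1 += static_cast<std::uint64_t>(len1);
        totalLen2 += static_cast<std::uint64_t>(len2);
        return true;
    }
};
```

Checking everything before modifying anything keeps the three members consistent with each other; what to do on a false return (reset, log, saturate) depends on the application.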

When I have to accumulate numbers and have no particular accuracy requirements, I often use double, because the maximum representable value is incredibly high ( std::numeric_limits<double>::max() is about 1.79769E+308 ): to reach overflow I would need to add 2^32-1 = 4294967295 every picosecond for about 1E+279 years.
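The accuracy trade-off of a double accumulator can be seen with a small sketch. It assumes double is IEEE 754 binary64 with a 53-bit significand, as on mainstream platforms, and the helper name increment_is_visible is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reports whether adding `increment` to a double
// accumulator actually changes the stored value.
inline bool increment_is_visible(double total, std::int32_t increment)
{
    const double updated = total + static_cast<double>(increment);
    return updated != total;
}
```

For a small total such as 1000.0, adding 1 is visible. Once the total reaches 2^53 (9007199254740992.0), adding 1 rounds back to the same value, so small increments are silently lost long before the double itself overflows.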
