64 bit mathematical operations without any loss of data or precision

I believe there isn't any portable standard data type for 128 bits of data. So, my question is how efficiently 64-bit operations can be carried out without loss of data using existing standard data types.

For example, I have the following two uint64_t variables:

uint64_t x = -1;
uint64_t y = -1;

Now, how can the results of mathematical operations such as x+y, x-y, x*y and x/y be stored, retrieved and printed?

For the above variables, x+y results in a value of -2, which is actually 0xFFFFFFFFFFFFFFFEULL with a carry of 1.

void add (uint64_t a, uint64_t b, uint64_t &result_high, uint64_t &result_low)
{
    result_low  = a + b;               // wraps modulo 2^64
    result_high = (result_low < a);    // 1 if the addition carried, else 0
}

How can the other operations be performed in the same way as add, so that they give the proper final output?

I'd appreciate it if someone could share a generic algorithm that takes care of the overflow/underflow, etc. that might come into the picture with such operations.

Any standard, tested algorithms would help.

There are a lot of BigInteger libraries out there for manipulating big numbers.

  1. GMP Library
  2. C++ Big Integer Library

If you want to avoid library integration and your requirements are quite small, here is the basic BigInteger snippet that I generally use for problems with basic requirements. You can add new methods or overload more operators as you need. It stores decimal digits in a string, so it favors simplicity over speed. I have tested this snippet widely, but review it before relying on it.

Source

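#include <algorithm>  // reverse
#include <iostream>   // ostream, cout
#include <string>
using namespace std;

class BigInt {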
public:
    // default constructor
    BigInt() : sign(1) {}   // digits stay empty; sign starts out defined

    // ~BigInt() {} // avoid overloading default destructor. member-wise destruction is okay

    BigInt( string b ) {
        (*this) = b;    // constructor for string
    }

    // some helpful methods
    size_t size() const { // returns number of digits
        return a.length();
    }
    BigInt inverseSign() { // changes the sign
        sign *= -1;
        return (*this);
    }
    BigInt normalize( int newSign ) { // removes leading 0, fixes sign
        for( int i = a.size() - 1; i > 0 && a[i] == '0'; i-- )
            a.erase(a.begin() + i);
        sign = ( a.size() == 1 && a[0] == '0' ) ? 1 : newSign;
        return (*this);
    }

    // assignment operator
    void operator = ( string b ) { // assigns a string to BigInt
        a = b[0] == '-' ? b.substr(1) : b;
        reverse( a.begin(), a.end() );
        this->normalize( b[0] == '-' ? -1 : 1 );
    }

    // conditional operators
    bool operator < (BigInt const& b) const { // less than operator
        if( sign != b.sign ) return sign < b.sign;
        if( a.size() != b.a.size() )
            return sign == 1 ? a.size() < b.a.size() : a.size() > b.a.size();
        for( int i = a.size() - 1; i >= 0; i-- ) if( a[i] != b.a[i] )
                return sign == 1 ? a[i] < b.a[i] : a[i] > b.a[i];
        return false;
    }
    bool operator == ( const BigInt &b ) const { // operator for equality
        return a == b.a && sign == b.sign;
    }



    // mathematical operators
    BigInt operator + ( BigInt b ) { // addition operator overloading
        if( sign != b.sign ) return (*this) - b.inverseSign();
        BigInt c;
        for(int i = 0, carry = 0; i<a.size() || i<b.size() || carry; i++ ) {
            carry+=(i<a.size() ? a[i]-48 : 0)+(i<b.a.size() ? b.a[i]-48 : 0);
            c.a += (carry % 10 + 48);
            carry /= 10;
        }
        return c.normalize(sign);
    }
    BigInt operator - ( BigInt b ) { // subtraction operator overloading
        if( sign != b.sign ) return (*this) + b.inverseSign();
        int s = sign;
        BigInt x = (*this);   // work on a copy so the caller's sign is not modified
        x.sign = b.sign = 1;
        if( x < b ) return ((b - x).inverseSign()).normalize(-s);
        BigInt c;
        for( int i = 0, borrow = 0; i < a.size(); i++ ) {
            borrow = a[i] - borrow - (i < b.size() ? b.a[i] : 48);
            c.a += borrow >= 0 ? borrow + 48 : borrow + 58;
            borrow = borrow >= 0 ? 0 : 1;
        }
        return c.normalize(s);
    }
    BigInt operator * ( BigInt b ) { // multiplication operator overloading
        BigInt c("0");
        for( int i = 0, k = a[i] - 48; i < a.size(); i++, k = a[i] - 48 ) {
            while(k--) c = c + b; // ith digit is k, so, we add k times
            b.a.insert(b.a.begin(), '0'); // multiplied by 10
        }
        return c.normalize(sign * b.sign);
    }
    BigInt operator / ( BigInt b ) { // division operator overloading
        if( b.size() == 1 && b.a[0] == '0' ) b.a[0] /= ( b.a[0] - 48 );
        BigInt c("0"), d;
        for( int j = 0; j < a.size(); j++ ) d.a += "0";
        int dSign = sign * b.sign;
        b.sign = 1;
        for( int i = a.size() - 1; i >= 0; i-- ) {
            c.a.insert( c.a.begin(), '0');
            c = c + a.substr( i, 1 );
            while( !( c < b ) ) c = c - b, d.a[i]++;
        }
        return d.normalize(dSign);
    }
    BigInt operator % ( BigInt b ) { // modulo operator overloading
        if( b.size() == 1 && b.a[0] == '0' ) b.a[0] /= ( b.a[0] - 48 );
        BigInt c("0");
        b.sign = 1;
        for( int i = a.size() - 1; i >= 0; i-- ) {
            c.a.insert( c.a.begin(), '0');
            c = c + a.substr( i, 1 );
            while( !( c < b ) ) c = c - b;
        }
        return c.normalize(sign);
    }

    // << operator overloading
    friend ostream& operator << (ostream&, BigInt const&);

private:
    // representations and structures
    string a; // to store the digits
    int sign; // sign = -1 for negative numbers, sign = 1 otherwise
};

ostream& operator << (ostream& os, BigInt const& obj) {
    if( obj.sign == -1 ) os << "-";
    for( int i = obj.a.size() - 1; i >= 0; i--) {
        os << obj.a[i];
    }
    return os;
}

Usage

BigInt a, b, c;
a = BigInt("1233423523546745312464532");
b = BigInt("45624565434216345657652454352");
c = a + b;
// c = a * b;
// c = b / a;
// c = b - a;
// c = b % a;
cout << c << endl;

// dynamic memory allocation
BigInt *obj = new BigInt("123");
delete obj;

You can emulate uint128_t if you don't have it:

typedef struct uint128_t { uint64_t lo, hi; } uint128_t;
...

uint128_t add (uint64_t a, uint64_t b) {
    uint128_t r; r.lo = a + b; r.hi = + (r.lo < a); return r; }

uint128_t sub (uint64_t a, uint64_t b) {
    uint128_t r; r.lo = a - b; r.hi = - (r.lo > a); return r; }
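For the "stored/retrieved/printed" part of the question, here is a minimal C++ sketch that reuses the same add/sub idea and formats the hi:lo pair as hex. The names U128, add64, sub64 and to_hex are placeholders of mine, not taken from any library.

```cpp
#include <cinttypes>  // PRIX64
#include <cstdint>
#include <cstdio>     // snprintf
#include <string>

// Hypothetical two-word container: value = hi * 2^64 + lo.
struct U128 { uint64_t lo; uint64_t hi; };

// Full sum of two 64-bit values; hi receives the carry (0 or 1).
U128 add64(uint64_t a, uint64_t b) {
    U128 r;
    r.lo = a + b;             // wraps modulo 2^64
    r.hi = (r.lo < a);        // the wrapped sum is smaller than a only on carry
    return r;
}

// Full difference; hi is all ones (two's complement -1) when a < b.
U128 sub64(uint64_t a, uint64_t b) {
    U128 r;
    r.lo = a - b;                          // wraps modulo 2^64
    r.hi = (r.lo > a) ? UINT64_MAX : 0;    // borrow happened
    return r;
}

// Formats the pair as one 32-digit hex number.
std::string to_hex(U128 v) {
    char buf[40];
    std::snprintf(buf, sizeof buf, "0x%016" PRIX64 "%016" PRIX64, v.hi, v.lo);
    return std::string(buf);
}
```

With x = y = UINT64_MAX, to_hex(add64(x, y)) gives 0x0000000000000001FFFFFFFFFFFFFFFE, i.e. the low word 0xFFFFFFFFFFFFFFFE plus a carry of 1 in the high word.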

Multiplication without built-in compiler or assembler support is a bit more difficult to get right. Essentially, you need to split both multiplicands into hi:lo unsigned 32-bit halves and perform 'long multiplication', taking care of the carries and 'columns' between the partial 64-bit products.
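The long multiplication described above could be sketched roughly like this in C++. U128 and mul64 are names I made up for illustration; the approach is the usual four-partial-product schoolbook method, assuming only that uint64_t arithmetic wraps modulo 2^64.

```cpp
#include <cstdint>

// Hypothetical two-word container: value = hi * 2^64 + lo.
struct U128 { uint64_t lo; uint64_t hi; };

// Full 64 x 64 -> 128 bit product using four 32 x 32 -> 64 partial products.
U128 mul64(uint64_t a, uint64_t b) {
    const uint64_t MASK = 0xFFFFFFFFu;
    uint64_t a_lo = a & MASK, a_hi = a >> 32;
    uint64_t b_lo = b & MASK, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   // column at bit 0
    uint64_t p1 = a_lo * b_hi;   // column at bit 32
    uint64_t p2 = a_hi * b_lo;   // column at bit 32
    uint64_t p3 = a_hi * b_hi;   // column at bit 64

    // Sum of everything that lands in bits 32..63. Each term is below 2^32,
    // so the sum is below 3 * 2^32 and cannot overflow 64 bits.
    uint64_t mid = (p0 >> 32) + (p1 & MASK) + (p2 & MASK);

    U128 r;
    r.lo = (mid << 32) | (p0 & MASK);
    // Upper halves of the middle products plus the carry out of 'mid'.
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

For x = y = UINT64_MAX this yields hi = 0xFFFFFFFFFFFFFFFE and lo = 1, which is (2^64 - 1)^2.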

Divide and modulo return 64-bit results given 64-bit arguments, so that's not an issue as you have defined the problem. Dividing a 128-bit value by 64- or 128-bit operands is a much more complicated operation, requiring normalization, etc.
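If speed does not matter, the 128-by-64 case can also be handled with plain shift-and-subtract division, one bit per iteration. To be clear, this is not the normalized limb algorithm mentioned above, just a slow but simple sketch; U128 and divmod_128_by_64 are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-word container: value = hi * 2^64 + lo.
struct U128 { uint64_t lo; uint64_t hi; };

// Divides the 128-bit value n by d (d must be nonzero).
// Returns the 128-bit quotient and stores the 64-bit remainder in *rem.
U128 divmod_128_by_64(U128 n, uint64_t d, uint64_t *rem) {
    assert(d != 0);
    U128 q = {0, 0};
    uint64_t r = 0;                       // running remainder, always < d
    for (int i = 127; i >= 0; --i) {
        // Fetch bit i of the dividend.
        uint64_t bit = (i >= 64) ? (n.hi >> (i - 64)) & 1 : (n.lo >> i) & 1;
        bool carry = (r >> 63) != 0;      // 2*r + bit would need a 65th bit
        r = (r << 1) | bit;
        if (carry || r >= d) {
            r -= d;                       // wraps to the right value when carry is set
            if (i >= 64) q.hi |= uint64_t(1) << (i - 64);
            else         q.lo |= uint64_t(1) << i;
        }
    }
    *rem = r;
    return q;
}
```

Dividing by 10 repeatedly with this routine is one way to print a hi:lo pair in decimal.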

The umul_ppmm and udiv_qrnnd routines in GMP's longlong.h give the 'elementary' steps for multiple-precision/limb operations.

Most modern GCC versions support the __int128 type on 64-bit targets, which can hold a 128-bit integer.

Example:

__int128 add(__int128 a, __int128 b){
    return a + b;
}
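Applied to the uint64_t values from the question, that could look roughly like the sketch below. It assumes a GCC or Clang target that provides unsigned __int128 (checked via __SIZEOF_INT128__); u128, mul_full, high64, low64 and to_decimal are names I chose. Since the standard streams and printf have no format for this type, the decimal string is built by hand.

```cpp
#include <cstdint>
#include <string>

#ifndef __SIZEOF_INT128__
#error "This sketch needs the GCC/Clang __int128 extension"
#endif

__extension__ typedef unsigned __int128 u128;   // compiler extension, not standard C++

// Full 128-bit product of two 64-bit values.
u128 mul_full(uint64_t a, uint64_t b) {
    return static_cast<u128>(a) * b;
}

// Split the result back into 64-bit halves for storage.
uint64_t high64(u128 v) { return static_cast<uint64_t>(v >> 64); }
uint64_t low64(u128 v)  { return static_cast<uint64_t>(v); }

// Decimal conversion, one digit at a time.
std::string to_decimal(u128 v) {
    if (v == 0) return "0";
    std::string s;
    while (v != 0) {
        s.insert(s.begin(), static_cast<char>('0' + static_cast<int>(v % 10)));
        v /= 10;
    }
    return s;
}
```

For x = y = UINT64_MAX, to_decimal(mul_full(x, y)) gives 340282366920938463426481119284349108225.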
