
Are C/C++ library functions and operators optimal?

So, in the divide & conquer course we were taught:

  1. Karatsuba multiplication
  2. Fast exponentiation

Now, given two positive integers a and b, is operator* faster than a hand-written karatsuba(a, b)? And is pow(a, b) faster than

int fast_expo(int base, int exp)
{
    if (exp == 0) {
        return 1;
    }
    if (exp == 1) {
        return base;
    }
    int half = fast_expo(base, exp / 2);  // recurse once and reuse the result
    if (exp % 2 == 0) {
        return half * half;
    }
    else {
        return base * half * half;
    }
}

I ask this because I wonder whether these algorithms serve only a teaching purpose, or whether they are already built into C/C++ implementations.

Karatsuba multiplication is a special technique for large integers. It is not comparable to the built-in C++ * operator, which multiplies operands of basic types like int and double.

To take advantage of Karatsuba, you have to be using multi-precision integers made up of at least around 8 words (512 bits, if these are 64-bit words). The break-even point at which Karatsuba becomes advantageous is somewhere between 8 and 24 machine words, according to the accepted answer to this question.
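
To make the splitting idea concrete, here is a toy sketch (a hypothetical example, not how a bignum library is actually organized): it multiplies two 32-bit numbers using three 16x16-bit multiplications instead of four. Real libraries apply the same identity recursively to operands hundreds of machine words long; at this tiny size the single hardware multiply is of course faster.

#include <cstdint>
#include <cstdio>

uint64_t karatsuba32(uint32_t x, uint32_t y)
{
    uint64_t x1 = x >> 16, x0 = x & 0xFFFF;          // high/low halves of x
    uint64_t y1 = y >> 16, y0 = y & 0xFFFF;          // high/low halves of y

    uint64_t z2 = x1 * y1;                           // multiply #1
    uint64_t z0 = x0 * y0;                           // multiply #2
    uint64_t z1 = (x1 + x0) * (y1 + y0) - z2 - z0;   // multiply #3

    return (z2 << 32) + (z1 << 16) + z0;
}

int main()
{
    uint32_t a = 123456789u, b = 987654321u;
    printf("karatsuba: %llu\n", (unsigned long long)karatsuba32(a, b));
    printf("hardware : %llu\n", (unsigned long long)((uint64_t)a * b));
}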

The pow function, which works with a pair of floating-point operands of type double, is not comparable to your fast_expo, which works with operands of type int. They are different functions with different requirements. With pow, you can calculate the cube root of 5: pow(5, 1/3.0). If that's what you would like to calculate, then fast_expo is of no use, no matter how fast.

There is no guarantee that your compiler or C library's pow is absolutely the fastest way for your machine to exponentiate two double-precision floating-point numbers.

Optimization claims in floating-point can be tricky, because multiple implementations of the "same" function often do not give exactly the same results down to the last bit. You can probably write a fast my_pow that is good to only five decimal digits of precision, and in your application that approximation might be more than adequate. Have you beaten the library? Hardly; your fast function doesn't meet the requirements that would qualify it as a replacement for the library's pow.
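
As a sketch of what such a reduced-precision my_pow might look like (a hypothetical example; the name and approach are assumptions, not anyone's real implementation): doing all the work in single precision is cheap but only good to roughly six or seven significant digits, nowhere near a drop-in replacement for the double-precision library pow.

#include <cmath>
#include <cstdio>

// Hypothetical reduced-precision pow: all work done in single precision.
float my_pow(float a, float b)
{
    return std::exp2(b * std::log2(a));   // a^b = 2^(b * log2(a)), for a > 0
}

int main()
{
    printf("my_pow : %.15f\n", my_pow(5.0f, 1.0f / 3.0f));
    printf("library: %.15f\n", std::pow(5.0, 1.0 / 3.0));
    // The two agree only in the leading digits.
}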

operator* and other standard operators usually map to the primitives provided by the hardware. Where such primitives don't exist (e.g. 64-bit long long on IA-32), the compiler emulates them at a performance penalty (gcc does that in libgcc).

Same for std::pow. It is part of the standard library and isn't mandated to be implemented in a certain way. GNU libc implements pow(a,b) as exp(log(a) * b). exp and log are quite long and written for optimal performance with IEEE 754 floating point in mind.
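
As a quick illustration of that identity (a sketch only; the actual glibc code is far more careful about rounding and edge cases), compare the naive exp/log formulation against the library call:

#include <cmath>
#include <cstdio>

int main()
{
    double a = 2.0, b = 10.0;
    double naive   = std::exp(std::log(a) * b);   // a^b via exp(log(a) * b)
    double library = std::pow(a, b);
    printf("naive  : %.17g\n", naive);
    printf("library: %.17g\n", library);
    // Both are near 1024, but the naive version may be off in the last
    // bits; getting those right is where the library's extra code goes.
}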


As for your suggestions:

Karatsuba multiplication for smaller numbers isn't worth it. The multiply machine instruction provided by the processor is already optimized for speed and power usage for the standard data types in use. With bigger numbers, 10-20 times the register capacity, it starts to pay off:

In the GNU MP Bignum Library, there used to be a default KARATSUBA_THRESHOLD as high as 32 for non-modular multiplication (that is, Karatsuba was used when n >= 32w, with typically w = 32); the optimal threshold for modular exponentiation tends to be significantly higher. On modern CPUs, Karatsuba in software tends to be non-beneficial for things like ECDSA over P-256 (n = 256, w = 32 or w = 64), but conceivably useful for the much wider moduli used in RSA.

Here is a list of the multiplication algorithms GNU MP uses, with their respective thresholds.

Fast exponentiation doesn't apply to non-integer powers, so it's not really comparable to pow.
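
For comparison, here is the common iterative form of fast exponentiation (integer exponents only; the name ipow is just illustrative):

// Iterative binary exponentiation: O(log exp) multiplications.
long long ipow(long long base, unsigned exp)
{
    long long result = 1;
    while (exp > 0) {
        if (exp & 1) {
            result *= base;   // exponent bit is set: fold in current power
        }
        base *= base;         // square for the next bit
        exp >>= 1;
    }
    return result;
}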

A good way to check the speed of an operation is to measure it. If you run the calculation a billion or so times and see how long it takes to execute, you have your answer.
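
A minimal timing sketch along those lines (an assumed setup, not a rigorous benchmark; note that optimizers delete work whose result is never used, so the result is accumulated and printed):

#include <chrono>
#include <cstdio>

int main()
{
    const long long iterations = 1000000000LL;
    volatile int x = 3;        // volatile keeps the multiply from being folded away
    long long sink = 0;

    auto start = std::chrono::steady_clock::now();
    for (long long i = 0; i < iterations; ++i) {
        sink += x * x;
    }
    auto stop = std::chrono::steady_clock::now();

    std::chrono::duration<double> elapsed = stop - start;
    printf("sink=%lld, %.2f ns per multiply\n",
           sink, 1e9 * elapsed.count() / iterations);
}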

One thing to note: I'm led to believe that % is fairly expensive. There is a much faster way to check whether something is divisible by 2:

bool check_div_two(int number)
{
    return (number & 0x01) == 0;  // even iff the lowest bit is clear
}

This way you've just done a single bitwise AND against a mask, which should be a cheap operation. (In practice, modern compilers already compile exp % 2 == 0 down to exactly this kind of bit test, so the manual rewrite rarely buys anything.)

The * operator for built-in types will almost certainly be implemented as a single CPU multiplication instruction. So ultimately this is a hardware question, not a language question. Longer code sequences, perhaps function calls, might be generated in cases where there's no direct hardware support.
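
For example, a trivial function like this (inspect the output with a disassembler or a compiler explorer; the exact instructions vary by target and flags):

// Compile with: g++ -O2 -S mul.cpp
int mul(int a, int b)
{
    return a * b;   // on x86-64 this typically compiles to a single imul
}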

It's safe to assume that chip manufacturers (Intel, AMD, et al) expend a great deal of effort making arithmetic operations as efficient as possible.
