So, in the divide & conquer course we were taught Karatsuba multiplication and fast exponentiation. Now, given two positive integers a and b, is the built-in operator* faster than karatsuba(a, b)? And is pow(a, b) faster than the following fast_expo?
int fast_expo(int base, int exp)
{
    if (exp == 0) {
        return 1;
    }
    if (exp == 1) {
        return base;
    }
    int half = fast_expo(base, exp / 2);  // compute the half power once
    if (exp % 2 == 0) {
        return half * half;
    }
    else {
        return base * half * half;
    }
}
I ask this because I wonder whether these algorithms serve only a teaching purpose, or whether they are already implemented in the C/C++ standard library.
Karatsuba multiplication is a special technique for large integers. It is not comparable to the built-in C++ * operator, which multiplies operands of basic types like int and double.

To take advantage of Karatsuba, you have to be using multi-precision integers made up of at least around 8 words (512 bits, if these are 64-bit words). The break-even point at which Karatsuba becomes advantageous is somewhere between 8 and 24 machine words, according to the accepted answer to this question.
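To illustrate the idea at a small scale (a sketch only; the function name is mine, and real Karatsuba only pays off on multi-word numbers, where the three sub-multiplications are themselves recursive): splitting two 64-bit operands into 32-bit halves lets three multiplications replace the four of the schoolbook method.

```cpp
#include <cstdint>

// Karatsuba step on 64-bit operands, split into 32-bit halves.
// With a = hi_a*2^32 + lo_a and b = hi_b*2^32 + lo_b:
//   a*b = z2*2^64 + z1*2^32 + z0
// where z2 = hi_a*hi_b, z0 = lo_a*lo_b, and the cross term is recovered
// from one extra product: z1 = (hi_a+lo_a)*(hi_b+lo_b) - z2 - z0.
// unsigned __int128 is a GCC/Clang extension, used here to hold the result.
unsigned __int128 karatsuba64(std::uint64_t a, std::uint64_t b) {
    std::uint64_t hi_a = a >> 32, lo_a = a & 0xFFFFFFFFu;
    std::uint64_t hi_b = b >> 32, lo_b = b & 0xFFFFFFFFu;
    unsigned __int128 z2 = (unsigned __int128)hi_a * hi_b;  // high product
    unsigned __int128 z0 = (unsigned __int128)lo_a * lo_b;  // low product
    unsigned __int128 z1 =
        (unsigned __int128)(hi_a + lo_a) * (hi_b + lo_b) - z2 - z0;
    return (z2 << 64) + (z1 << 32) + z0;
}
```

At this size the trick loses to the single hardware multiply; it wins only when each sub-product is itself an expensive multi-word operation.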
The pow function, which works with a pair of floating-point operands of type double, is not comparable to your fast_expo, which works with operands of type int. They are different functions with different requirements. With pow, you can calculate the cube root of 5: pow(5, 1/3.0). If that's what you would like to calculate, then fast_expo is of no use, no matter how fast it is.
There is no guarantee that your compiler's or C library's pow is absolutely the fastest way for your machine to exponentiate two double-precision floating-point numbers.

Optimization claims in floating point can be tricky, because it often happens that multiple implementations of the "same" function do not give exactly the same results down to the last bit. You can probably write a fast my_pow that is only good to five decimal digits of precision, and in your application that approximation might be more than adequate. Have you beaten the library? Hardly; your fast function doesn't meet the requirements that would qualify it as a replacement for the pow in the library.
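A minimal sketch of such a my_pow (my own name) uses the textbook identity x^y = e^(y ln x), valid only for positive bases; a real library pow must additionally handle negative bases, zeros, infinities, NaNs, and tight accuracy requirements.

```cpp
#include <cmath>

// Sketch: x^y via exp and log. Valid only for base > 0; no special-case
// handling, so this does NOT meet the contract of the standard pow.
double my_pow(double base, double exponent) {
    return std::exp(exponent * std::log(base));
}
```

For example, my_pow(5.0, 1.0 / 3.0) yields the cube root of 5, agreeing with std::pow to many digits on typical implementations, but it is an approximation, not a drop-in replacement.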
operator* and the other standard operators usually map to the primitives provided by the hardware. In case such primitives don't exist (e.g. 64-bit long long on IA-32), the compiler emulates them at a performance penalty (gcc does that in libgcc).
The same goes for std::pow. It is part of the standard library and isn't mandated to be implemented in a certain way. GNU libc implements pow(a, b) as exp(log(a) * b). exp and log are quite long and written for optimal performance with IEEE 754 floating point in mind.
As for your suggestions:
Karatsuba multiplication for smaller numbers isn't worth it. The multiply instruction provided by the processor is already optimized for speed and power usage on the standard data types. With bigger numbers, 10-20 times the register capacity, it starts to pay off:
In the GNU MP Bignum Library, there used to be a default KARATSUBA_THRESHOLD as high as 32 for non-modular multiplication (that is, Karatsuba was used when n >= 32w, with typically w = 32); the optimal threshold for modular exponentiation tends to be significantly higher. On modern CPUs, Karatsuba in software tends to be non-beneficial for things like ECDSA over P-256 (n = 256, w = 32 or w = 64), but conceivably useful for a much wider modulus, as used in RSA.
Here is a list of the multiplication algorithms GNU MP uses, with their respective thresholds.
Fast exponentiation doesn't apply to non-integer powers, so it's not really comparable to pow.
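For integer powers, though, exponentiation by squaring is usually written iteratively; this sketch (a common variant of the question's fast_expo, with my own signature) uses O(log exp) multiplications and no recursion. Note that a recursive version must reuse the half-power result, or it does O(exp) work.

```cpp
#include <cstdint>

// Iterative exponentiation by squaring: process the exponent bit by bit.
std::uint64_t fast_expo(std::uint64_t base, unsigned exp) {
    std::uint64_t result = 1;
    while (exp > 0) {
        if (exp & 1)        // low bit set: fold this power into the result
            result *= base;
        base *= base;       // square the base for the next exponent bit
        exp >>= 1;
    }
    return result;
}
```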
A good way to check the speed of an operation is to measure it. If you run the calculation a billion or so times and time how long it takes to execute, you have your answer.
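A minimal timing harness along those lines might look like this (function and variable names are my own; a volatile sink is one simple way to stop the compiler from deleting the loop as dead code):

```cpp
#include <chrono>
#include <cstdint>

// Run n iterations of the operation under test and return the average
// nanoseconds per iteration.
double ns_per_multiply(std::uint64_t n) {
    volatile std::uint64_t sink = 1;
    auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 1; i <= n; ++i)
        sink = sink * 3 + i;              // the multiply being measured
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count() / n;
}
```

Start with a smaller iteration count than a billion while debugging the harness, and be aware that loop overhead and the dependency chain are included in the measurement.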
One thing to note: I'm led to believe that % is fairly expensive. There is a cheaper way to check whether something is divisible by 2:

bool check_div_two(int number)
{
    return (number & 0x01) == 0;   // even iff the lowest bit is clear
}

This way you've just masked off the low bit and compared, instead of doing a division. I'd assume it's a less expensive op.
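For non-negative values the two forms agree, and it's worth knowing that modern compilers usually emit the same bit test for n % 2 == 0 anyway, so measure before assuming a win. A quick equivalence sketch (helper names are mine):

```cpp
// Two ways to test evenness; for non-negative ints they are equivalent,
// and optimizing compilers typically lower both to the same bit test.
bool even_by_mod(int n)  { return n % 2 == 0; }
bool even_by_mask(int n) { return (n & 1) == 0; }
```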
The * operator for built-in types will almost certainly be implemented as a single CPU multiplication instruction, so ultimately this is a hardware question, not a language question. Longer code sequences, perhaps function calls, might be generated in cases where there's no direct hardware support.
It's safe to assume that chip manufacturers (Intel, AMD, et al) expend a great deal of effort making arithmetic operations as efficient as possible.