I'm using gcc 4.6.3 and creating a large array of random shorts. I'm generating them with the following statements:
val = SHRT_MAX; //as defined by limits.h
while(array<end) {
*array++ = rand() % val;
}
This is a considerably fast operation, and even for arrays as large as 5,000,000 elements is completed almost instantly. I was curious about my sorts efficiency with a smaller variation in numbers and changed that to:
val = 3;
This caused a considerable speed difference, it ran much slower than the original statements. What is it that is causing such a considerable speed difference?
SHRT_MAX
is most likely greater than or equal to RAND_MAX
. The statement:
*array++ = rand() % val;
can be optimized into:
int rand_value= rand();
if (rand_value==RAND_MAX) rand_value= 0;
*array++= rand_value;
which is faster because it replaces a modulus with a branch. The second version, where val
is 3, cannot be optimized into a simpler version that runs without modulus.
% SHRT_MAX
cannot be simplified into a bitwise operation. But combined with knowledge of how rand()
is specified, the compiler can certainly optimize statements dealing with rand()
and values greater than or equal to RAND_MAX
.
Compilers can optimize calculation of modulo (a%B), where B is a constant. It replaces actual modulo with simpler arithmetic operations. The details are explained in topics like Most optimized way to calculate modulus in C . However such optimizations are faster for some values of B than for others.
Even CPU division/modulo instruction can take different number of cycles complete (at least on some CPUs). See numbers for x86 here: http://gmplib.org/~tege/x86-timing.pdf .
SHRT_MAX is a 2^n-1
value, which can be optimised for divide. Dividing by 3 is much tougher, so the compiler may well decide to divide by 3 (or do some other magic operation that is slower than the 2^n-1
variant.
The fastest modulo you can use is for 2^n
, which can be replaced with a single and-instruciton, for positive values: x % 256
is the same as x & 255
. Unfortunately, when the value may be negative, it's not quite so easy...
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.