
Why is dividing slower than bitshifting in C++?

I wrote two pieces of code, one that divides a random number by two, and one that bitshifts the same random number right once. As I understand it, this should produce the same result. However, when I time both pieces of code, I consistently get data saying that the shifting is faster. Why is that?

Shifting code:

double iterations = atoi(argv[1]) * 1000;
int result = 0;
cout << "Doing " << iterations << " iterations." << endl;
srand(31459);
for(int i=0;i<iterations;i++){
    if(i % 2 == 0){
        result = result + (rand()>>1);
    }else{
        result = result - (rand()>>1);
    }
}

Dividing code:

double iterations = atoi(argv[1]) * 1000;
int result = 0;
cout << "Doing " << iterations << " iterations." << endl;
srand(31459);
for(int i=0;i<iterations;i++){
    if(i % 2 == 0){
        result = result + (rand() / 2);
    }else{
        result = result - (rand() / 2);
    }
}

Timing and results:

$ time ./divide 1000000; time ./shift 1000000
Doing 1e+09 iterations.

real    0m12.291s
user    0m12.260s
sys     0m0.021s
Doing 1e+09 iterations.

real    0m12.091s
user    0m12.056s
sys     0m0.019s

$ time ./shift 1000000; time ./divide 1000000
Doing 1e+09 iterations.

real    0m12.083s
user    0m12.028s
sys     0m0.035s
Doing 1e+09 iterations.

real    0m12.198s
user    0m12.158s
sys     0m0.028s

Additional information:

  • I am not using any optimizations when compiling
  • I am running this on a virtualized install of Fedora 20, kernel: 3.12.10-300.fc20.x86_64

It's not slower in C++ as such; it's slower on the architecture you're running on. Division is almost always slower because the hardware behind bit shifting is trivial, while division is a bit of a nightmare. In base 10, which is easier for you: 78358582354 >> 3 or 78358582354 / 85? Instructions generally take the same time to execute regardless of input, and in your case it's the compiler's job to convert /2 to >>1; the CPU just does as it's told.
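To make the /2 versus >>1 point concrete, here is a minimal sketch that checks the two expressions produce the same value for non-negative inputs, which is the only kind rand() returns. The names half_div, half_shift and halves_agree_below are made up for this illustration. Compiling it with g++ -O2 -S and comparing the assembly of the two small functions shows what the compiler actually emits; the exact instructions depend on the compiler and target.

```cpp
// Hypothetical helper names, for illustration only.
int half_div(int x)   { return x / 2; }   // signed division by a constant
int half_shift(int x) { return x >> 1; }  // right shift by one bit

// Returns true if both expressions agree for every x in [0, limit).
bool halves_agree_below(int limit) {
    for (int x = 0; x < limit; ++x) {
        if (half_div(x) != half_shift(x)) {
            return false;
        }
    }
    return true;
}
```

Calling halves_agree_below(1000000) from main is enough to see that, for non-negative values, the choice between the two spellings does not change the result.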

It isn't actually slower. I've run your benchmark using nonius like so:

#define NONIUS_RUNNER
#include "Nonius.h++"

#include <type_traits>
#include <random>
#include <vector>

NONIUS_BENCHMARK("Divide", [](nonius::chronometer meter)
{
    std::random_device rd;
    std::uniform_int_distribution<int> dist(0, 9);

    std::vector<int> storage(meter.runs());
    meter.measure([&](int i) { storage[i] = storage[i] % 2 == 0 ? storage[i] - (dist(rd) >> 1) : storage[i] + (dist(rd) >> 1); });
})

NONIUS_BENCHMARK("std::string destruction", [](nonius::chronometer meter)
{
    std::random_device rd;
    std::uniform_int_distribution<int> dist(0, 9);

    std::vector<int> storage(meter.runs());
    meter.measure([&](int i) { storage[i] = storage[i] % 2 == 0 ? storage[i] - (dist(rd) / 2) : storage[i] + (dist(rd) / 2); });
})

And these are the results: [image]

As you can see, both of them are neck and neck.

(You can find the HTML output here.)

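PS: It seems I forgot to rename the second test ("std::string destruction" is a leftover name; that benchmark is the / 2 version). Note also that the benchmark labeled "Divide" is the one that shifts. My bad.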

It seems that the difference in the results is below the spread of the results, so you can't really tell whether there is a difference. In general, though, division can't be done in a single operation while a bit shift can, so a bit shift should usually be faster.

But since you have a literal 2 in your code, I would guess that the compiler produces identical code even without optimizations.
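One way to check whether a difference is larger than the run-to-run spread is to time each variant several times and compare the ranges. The following is only a sketch under assumptions: the names Spread, measure_spread and overlaps are invented here, it uses std::chrono::steady_clock for wall-clock time, and it expects repeats to be at least 1.

```cpp
#include <algorithm>
#include <chrono>
#include <vector>

// Fastest and slowest observed run, in milliseconds.
struct Spread {
    double min_ms;
    double max_ms;
};

// Runs `work` `repeats` times (repeats >= 1) and records the range of timings.
template <typename F>
Spread measure_spread(F work, int repeats) {
    std::vector<double> samples;
    for (int r = 0; r < repeats; ++r) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    auto mm = std::minmax_element(samples.begin(), samples.end());
    return Spread{*mm.first, *mm.second};
}

// Two ranges overlap when neither lies entirely above the other.
bool overlaps(const Spread& a, const Spread& b) {
    return a.min_ms <= b.max_ms && b.min_ms <= a.max_ms;
}
```

If the ranges measured for the shifting loop and the dividing loop overlap, the timings alone do not show that one is faster. A proper benchmark library does a more careful statistical analysis than this min/max comparison.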

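Note that rand() returns int, and dividing an int (signed by default) by 2 is not the same as shifting it right by 1: division truncates toward zero, whereas an arithmetic right shift rounds toward negative infinity, so the two disagree for negative odd values and the compiler has to emit extra fix-up code for the division. You can easily check the generated asm and see the difference, or simply compare the sizes of the resulting object files: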

> g++ -O3 boo.cpp -c -o boo # divide
> g++ -O3 foo.cpp -c -o foo # shift
> ls -la foo boo
... 4016 ... boo # divide
... 3984 ... foo # shift

Now apply a static_cast patch to the divide version:

if (i % 2 == 0) {
  result = result + (static_cast<unsigned>(rand())/2);
}
else {
  result = result - (static_cast<unsigned>(rand())/2);
}

and check the size again:

> g++ -O3 boo.cpp -c -o boo # divide
> g++ -O3 foo.cpp -c -o foo # shift
> ls -la foo boo
... 3984 ... boo # divide
... 3984 ... foo # shift

To be sure, you can verify that the generated asm in both object files is the same.
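The signed versus unsigned difference can be sketched directly in code. The function names below are invented for this illustration, and div2_via_shift only shows the general idea of the adjustment a compiler may make for signed division by 2; it is not a copy of what g++ emits. Right-shifting a negative int is an arithmetic shift since C++20 and on common compilers such as GCC and Clang before that, where it was implementation-defined.

```cpp
// Signed division truncates toward zero: -7 / 2 is -3.
int div2_signed(int x) { return x / 2; }

// An arithmetic right shift rounds toward negative infinity: -7 >> 1 is -4.
int shift1_signed(int x) { return x >> 1; }

// Sketch of a fix-up that lets a shift stand in for signed / 2:
// add 1 to negative inputs first so the shift rounds toward zero.
int div2_via_shift(int x) { return (x + (x < 0 ? 1 : 0)) >> 1; }

// With unsigned operands no fix-up is needed; the two are the same operation.
unsigned div2_unsigned(unsigned x) { return x / 2; }
unsigned shift1_unsigned(unsigned x) { return x >> 1; }
```

This is why the static_cast<unsigned> patch makes the two object files the same size: once the operand is unsigned, the division needs no extra adjustment.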
