Why is there a speed difference between uint32_t and uint64_t?

Question

I'm trying to understand how g++ and the CPU process integers at runtime.

I'm measuring how long the following function takes to run:

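#include <cstdint>  // uint32_t, uint64_t

template<class T>
void speedTest() {
    // Empty loop body: only the counter d of type T does any work.
    for(T d=0;d<4294967295u;d++)int number;
}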

This simple function runs an empty loop 4294967295 times (the maximum value of uint32_t).

When I call:

speedTest<uint32_t>();

the program takes an average of 8.15 seconds, but when I call:

speedTest<uint64_t>();

the program takes an average of 10.35 seconds.

Why is this happening?

Answer 1

Some possible reasons:

Larger data types generally need more memory bandwidth, although that probably matters little here if the loop counter stays in a register.
Even if the loop counter is kept in a register, arithmetic on a wider type can take longer when the value does not fit in a single register (e.g. a uint64_t when your CPU or build target has only 32-bit registers), because each operation then has to work on two registers.
The compiler needs to emit extra machine instructions to emulate any type the CPU does not support directly.
It also depends on the optimization level. A loop without side effects like this one can be optimized out completely, with or without the int number; declaration (the loop could just as well be for(T d=0;d<4294967295u;d++); ).
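
To make the optimization point concrete, here is a minimal sketch of a timing helper whose loop cannot be removed by the optimizer. The name timeLoop, the iterations parameter, and the volatile sink variable are assumptions made for illustration; they are not from the question. It measures wall-clock time with std::chrono::steady_clock, and the numbers it reports will depend on the compiler flags and the machine.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper (not from the question): runs `iterations` loop steps
// with a counter of type T and returns the elapsed wall-clock time in seconds.
template <class T>
double timeLoop(std::uint32_t iterations) {
    // Writing to a volatile object is an observable side effect, so the
    // compiler has to keep the loop even at higher optimization levels.
    volatile T sink = 0;

    const auto start = std::chrono::steady_clock::now();
    for (T d = 0; d < iterations; d++) {
        sink = d;
    }
    const auto stop = std::chrono::steady_clock::now();

    // duration<double> with the default period expresses the time in seconds.
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling timeLoop<std::uint32_t>(n) and timeLoop<std::uint64_t>(n) with the same n, once built with -O0 and once with -O2, should show how much of the measured difference comes from the build settings rather than from the type itself.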

You could continue the investigation by posting the assembly the compiler generates for both instantiations.
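
One way to get comparable assembly is a small file with both instantiations spelled out explicitly. This is only a sketch: the function name countLoop, the limit parameter, and the file name loop.cpp are assumptions, and the loop returns its iteration count so that it has an observable result. The g++ options in the trailing comment (-S, -O0, -O2, -m32) are standard, but -m32 only works if 32-bit support is installed.

```cpp
#include <cstdint>

// Hypothetical variant of the question's loop: counts iterations with a
// loop counter of type T and returns the count.
template <class T>
std::uint64_t countLoop(std::uint32_t limit) {
    std::uint64_t count = 0;
    for (T d = 0; d < limit; d++) {
        count++;
    }
    return count;
}

// Explicit instantiations so that both versions are emitted into the
// assembly output even though nothing in this file calls them.
template std::uint64_t countLoop<std::uint32_t>(std::uint32_t);
template std::uint64_t countLoop<std::uint64_t>(std::uint32_t);

// Assuming the file is saved as loop.cpp, the outputs can be compared with:
//   g++ -O0 -S loop.cpp -o loop_O0.s
//   g++ -O2 -S loop.cpp -o loop_O2.s
//   g++ -m32 -O0 -S loop.cpp -o loop_m32.s
```

Comparing the two instantiations within each .s file shows whether the 64-bit counter really costs extra instructions on your target, and whether the optimizer keeps the loop at all.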