I have the following nested for-loop:
    for(k = 0; k < n; ++k) {
        for(m = 0; m < n; ++m) {
            /* other logic altering a */
            if(a[index] != 0) count++;
        }
    }
where a contains uint32_t. Since n can be quite large (but constant), and this is the only branch (besides comparing k and m with n), I would like to optimize this away. The distribution of zero and non-zero in a can be considered uniformly random.
My first approach was
    count += a[index] & 1;
but then count would only be incremented for odd numbers.
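To make the undercount concrete, here is a minimal sketch comparing the two. The helper names count_low_bit and count_nonzero and the flat pointer-plus-length interface are assumptions for illustration, not part of the original loop.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: adds only the lowest bit of each element,
// so even non-zero values such as 2 or 4 are not counted.
std::size_t count_low_bit(const std::uint32_t* a, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        count += a[i] & 1;
    }
    return count;
}

// Hypothetical helper: reference count using the original condition.
std::size_t count_nonzero(const std::uint32_t* a, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != 0) count++;
    }
    return count;
}
```

For an input such as {0, 1, 2, 3, 4, 0}, the first function counts only the values 1 and 3, while the second counts all four non-zero values.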
In addition, I also have a case where a contains bool, but according to C++ Conditionals, true and false are defined as non-zero and zero, so this is basically equivalent to the above problem.
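For the bool case, a minimal sketch might look like the following. It relies on the fact that a bool converts to 1 for true and 0 for false when used in arithmetic. The name count_true and the plain array interface are assumptions for illustration.

```cpp
#include <cstddef>

// Hypothetical helper: each bool is converted to 0 or 1 when added,
// so no explicit comparison or if statement is needed.
std::size_t count_true(const bool* flags, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        count += flags[i];
    }
    return count;
}
```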
As stated in the comments on the question, the statement
    if(a[index] != 0) count++;
does not produce a branch (in this case), which was somewhat verified in the assembly.
For the sake of completeness, an equivalent of that statement is
    count += a[index] != 0;
because the result of the comparison is a bool, which converts to 0 or 1 (according to standard §4.7 [conv.integral]).
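As a minimal sketch of that equivalence, the two forms can be put side by side. The function names and the use of std::vector are assumptions for illustration. Whether either form compiles to branch-free code depends on the compiler and flags, so it is worth checking the generated assembly for the actual build.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: the original form with an explicit if.
std::size_t count_nonzero_if(const std::vector<std::uint32_t>& a) {
    std::size_t count = 0;
    for (std::uint32_t v : a) {
        if (v != 0) count++;
    }
    return count;
}

// Hypothetical helper: the arithmetic form. The comparison yields a bool,
// which is converted to 0 or 1 before being added to count.
std::size_t count_nonzero_add(const std::vector<std::uint32_t>& a) {
    std::size_t count = 0;
    for (std::uint32_t v : a) {
        count += v != 0;
    }
    return count;
}
```

Both functions return the same count for any input; the second simply states the increment as arithmetic rather than control flow.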