
Conversion to branchless of consecutive if statements

I'm stuck trying to figure out how to convert the last two "if" statements of the following code to branchless form.

int u, x, y;
x = rand() % 100 - 50;
y = rand() % 100 - 50;

u = rand() % 4;
if ( y > x) u = 5;
if (-y > x) u = 4;

Or, in case the above turns out to be too difficult, you can consider them as:

if (x > 0) u = 5;
if (y > 0) u = 4;

I think what trips me up is that these statements have no else clause. If they did, I could probably have adapted a variation of a branchless abs (or max / min) function.

The rand() functions you see aren't part of the real code. I added them like this just to hint at the expected ranges that the variables x , y and u can possibly have at the time the two branches happen.

Assembly machine code is allowed for the purpose.

EDIT:

After a bit of braingrinding I managed to put together a working branchless version:

int u, x, y;
x = rand() % 100 - 50;
y = rand() % 100 - 50;

u = rand() % 4;
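// Assumes 32-bit int, so >> 31 isolates the sign bit.
u += (5-u)*((unsigned int)(x-y) >> 31);   //  y > x  ->  u = 5
u += (4-u)*((unsigned int)(x+y) >> 31);   // -y > x  ->  u = 4 (applied last, so it wins as in the if version)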

Unfortunately, because of the integer arithmetic involved, the original version with if statements turns out to be about 30% faster.

Compiler knows where the party is at.
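For comparison with the multiply-based version above, here is a minimal sketch that blends the values with bit masks instead of multiplies. The helper names (select_branchy, select_masked, count_mismatches) are made up for illustration, and the exhaustive check only covers the ranges hinted at in the question. Whether this beats what the compiler generates for the plain if statements is something to measure, not assume.

```c
/* Reference: the two if statements from the question. */
static int select_branchy(int u, int x, int y)
{
    if ( y > x) u = 5;
    if (-y > x) u = 4;
    return u;
}

/* Branchless sketch: turn each comparison (0 or 1) into an all-zeros or
   all-ones mask and blend. The u = 4 blend is applied last so that it
   wins when both conditions hold, matching the if version. */
static int select_masked(int u, int x, int y)
{
    unsigned int r  = (unsigned int)u;
    unsigned int m5 = 0u - (unsigned int)( y > x);  /* all ones if  y > x */
    unsigned int m4 = 0u - (unsigned int)(-y > x);  /* all ones if -y > x */
    r = (r & ~m5) | (5u & m5);
    r = (r & ~m4) | (4u & m4);
    return (int)r;
}

/* Compare both versions over every input in the question's ranges. */
static int count_mismatches(void)
{
    int bad = 0;
    for (int u = 0; u < 4; u++)
        for (int x = -50; x < 50; x++)
            for (int y = -50; y < 50; y++)
                bad += select_branchy(u, x, y) != select_masked(u, x, y);
    return bad;
}
```

Calling count_mismatches() from a test and checking that it returns 0 is a cheap way to confirm that the order of the two blends was not swapped by accident.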

[All: this answer was written on the assumption that the calls to rand() were part of the problem, and the improvements below rest on that assumption. The OP has since clarified that rand was only used to tell us the ranges (and presumably the distribution) of x and y; it is unclear whether that applies to u, too. Anyway, enjoy my improved answer to a problem he didn't really pose.]

I think you'd be better off recoding this as:

int u, x, y;
x = rand() % 100 - 50;
y = rand() % 100 - 50;

if (-y > x) u = 4;        // tested first: in the original it is assigned last, so it wins when both conditions hold
else if ( y > x) u = 5;
else u = rand() % 4;

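This calls the last rand only about 1/4 as often as the OP's original code, because neither condition holds only when x >= |y|, roughly a quarter of the (x, y) range. Since I assume rand (and the divides) are much more expensive than compare-and-branch, this would be a significant saving.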

If your rand generator produces plenty of truly random bits (e.g. 16) on each call, as it should, you can call it just once (I've assumed rand is more expensive than a divide, YMMV):

int u, x, y, t;
t = rand() ;
u = t % 4;
t = t >> 2;
x = t % 100 - 50;
y = ( t / 100 ) %100 - 50;

if (-y > x) u = 4;        // tested first so it wins when both conditions hold, as in the original
else if ( y > x) u = 5;

I think that the rand function in the MS C library is not good enough for this if you want really random values. I had to code my own; turned out faster anyway.
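A sketch of the single-call idea wrapped in a function, so the field extraction can be checked without depending on any particular rand. The names split_random and ranges_ok are hypothetical, and the input t is assumed to be a non-negative random word with about 16 usable bits.

```c
/* Split one random word t into u in [0,3] and x, y in [-50,49]. */
static void split_random(unsigned int t, int *u, int *x, int *y)
{
    *u = (int)(t % 4);                   /* low two bits */
    t >>= 2;                             /* drop the bits already used */
    *x = (int)(t % 100) - 50;            /* low two decimal digits */
    *y = (int)((t / 100) % 100) - 50;    /* next two decimal digits */
}

/* Check that every 16-bit input yields values inside the expected ranges. */
static int ranges_ok(void)
{
    for (unsigned int t = 0; t < 65536u; t++) {
        int u, x, y;
        split_random(t, &u, &x, &y);
        if (u < 0 || u > 3 || x < -50 || x > 49 || y < -50 || y > 49)
            return 0;
    }
    return 1;
}
```

The ranges hold for any t, but x and y are only uniformly distributed if the shifted value is uniform over a range whose size is a multiple of 10000, so a generator with few bits will skew y.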

You might also get rid of the divide, by using multiplication by a reciprocal (untested):

int u, x, y;
unsigned int t;
unsigned long long t2;   // must hold 64 bits; unsigned long may be only 32
t = rand() ;
u = t % 4;

{ // Compute (t / 400) * 2^32 in a 64-bit value by multiplying by a reciprocal.
  // The constant below is 2^32 / (4 * 100), rounded up, and is folded at compile time.
  // The remaining multiply can be done by one machine instruction
  // (typically 32bits * 32bits --> 64bits) widely found in processors.
  // The "4" has the same effect as the t = t >> 2 in the previous version
  t2 = (unsigned long long)t * (((1ULL << 32) + 399) / 400);
}
x = (int)(t2 >> 32) - 50; // take the upper word (if compiler won't, do this in assembler)
                          // note: x stays within [-50, 49] only if t < 40000
{ // compute y from the fractional remainder of the above multiply,
  // which is sitting in the lower 32 bits of the t2 product
  y = (int)(((t2 & 0xFFFFFFFFULL) * 100) >> 32) - 50;
}

if (-y > x) u = 4;        // tested first so it wins when both conditions hold, as in the original
else if ( y > x) u = 5;

If your compiler won't produce the "right" instructions, it should be straightforward to write assembly code to do this.
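Since the reciprocal trick is easy to get subtly wrong, here is a small self-checking sketch of just that step. The names RECIP_400, div400, frac100 and count_recip_errors are invented for illustration, and the check only covers inputs below 65536 (my assumption about how many bits the generator delivers). The constant is rounded up; a truncated reciprocal would give 0 instead of 1 for t = 400.

```c
#include <stdint.h>

/* 2^32 / 400, rounded up so exact multiples of 400 do not fall
   just below the integer boundary. */
#define RECIP_400 ((uint32_t)(((UINT64_C(1) << 32) + 399) / 400))

/* t / 400, read from the upper word of a 32x32 -> 64 bit multiply. */
static uint32_t div400(uint32_t t)
{
    uint64_t p = (uint64_t)t * RECIP_400;
    return (uint32_t)(p >> 32);
}

/* (t % 400) / 4, read from the lower word: the fractional part of
   t / 400, scaled by 100. */
static uint32_t frac100(uint32_t t)
{
    uint64_t p = (uint64_t)t * RECIP_400;
    return (uint32_t)(((p & 0xFFFFFFFFu) * 100) >> 32);
}

/* Compare against ordinary division for every 16-bit input. */
static int count_recip_errors(void)
{
    int bad = 0;
    for (uint32_t t = 0; t < 65536u; t++) {
        bad += div400(t) != t / 400;
        bad += frac100(t) != (t % 400) / 4;
    }
    return bad;
}
```

If count_recip_errors() returns 0 on your platform, the multiply can stand in for the two divides over that input range; for wider inputs the rounding error grows and the check should be rerun.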

Here are some tricks using array indices. They may be quite fast if the compiler/CPU has one-step instructions to convert comparison results to 0-1 values (e.g. x86's "sete" and similar).

int ycpx[3];

/* ... */
ycpx[0] = 4;
ycpx[1] = u;
ycpx[2] = 5;
u = ycpx[(-y <= x) * (1 + (y > x))];   /* index 0 if -y > x, else 2 if y > x, else 1 */

Alternate form

int v1[2];
int v2[2];

/* ... */
v1[0] = u;
v1[1] = 5;
v2[1] = 4;
v2[0] = v1[y > x];
u = v2[-y > x];

Almost unreadable...

NOTE: In both cases the initialization of array elements containing 4 and 5 may be included in declaration and arrays may be made static if reentrancy is not a problem for you.
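To make the alternate form easier to trust, here is a sketch that wraps it in a function and compares it with the original if statements over the ranges from the question. The function names (select_if, select_lookup, count_lookup_mismatches) are hypothetical; the arrays are kept local rather than static so the function stays reentrant.

```c
/* Reference: the two if statements from the question. */
static int select_if(int u, int x, int y)
{
    if ( y > x) u = 5;
    if (-y > x) u = 4;
    return u;
}

/* Table lookup following the alternate form: each comparison result
   (0 or 1) is used directly as an array index. */
static int select_lookup(int u, int x, int y)
{
    int v1[2] = { u, 5 };   /* indexed by ( y > x) */
    int v2[2];
    v2[0] = v1[y > x];
    v2[1] = 4;              /* indexed by (-y > x), applied last */
    return v2[-y > x];
}

/* Exhaustive comparison over the question's ranges. */
static int count_lookup_mismatches(void)
{
    int bad = 0;
    for (int u = 0; u < 4; u++)
        for (int x = -50; x < 50; x++)
            for (int y = -50; y < 50; y++)
                bad += select_if(u, x, y) != select_lookup(u, x, y);
    return bad;
}
```

The lookup costs two stores and two dependent loads per call, so it is worth timing against the plain if statements before adopting it.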
