
How to convert 0/1 to sign efficiently?

Look at the following function, where a is an unsigned byte 0-255, and b is a float:

def convert(a, b):
    if a & 0x80:
        return -b
    return b

It negates b if the high bit of a (0x80) is set, and returns b unchanged otherwise. One might worry that the conditional statement hurts branch prediction in the CPU, and so try to replace it with a pure computation.

But I only found this solution, which doesn't look that efficient:

def convert(a, b):
    return (-1)**(a & 0x80) * b

Which one is more efficient? Does the compiler simplify the second one? Is there a better way?

This is Python. There is no compiler in the sense you're probably thinking of. Assuming you're using CPython (the reference interpreter), everything runs through a loop over a gigantic switch statement that reads and interprets each bytecode instruction as it goes. Your worries about branch prediction are irrelevant here; there will be half a dozen CPU-level branches in every operation you perform, between the switch, the type checks, the dynamic function pointer lookup and call, etc. A distant jump might hurt the data cache a little when it ends up reading a bytecode instruction a few hundred bytes away instead of the next one, but branch prediction (or the lack thereof) is not the issue (a 100% predictable jump would have the same cache effect).
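To see what the interpreter loop actually dispatches on, the standard dis module can list the bytecode of the first convert. This is only an illustrative sketch: the exact instruction names and counts vary between CPython versions, so no particular listing is shown here.

```python
import dis


def convert(a, b):
    if a & 0x80:
        return -b
    return b


# Each entry is one bytecode instruction that the interpreter loop
# reads and dispatches on; the conditional is just one of them.
instructions = list(dis.get_instructions(convert))
for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)
```

Every one of these instructions goes through the interpreter's own dispatch and type checks, which is why the single Python-level branch is not what dominates the cost.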

Basically, whatever you do here that might work in C and get optimized to ideal code by the compiler's optimizer will not work in CPython. So don't bother. Write your complete code, profile it if it's too slow, and then work to optimize the "hottest" parts (the ones where the most time is actually spent). You're engaged in premature optimization here, and really should stop.

If I were you, I'd go with option #1, as it's straightforward and unlikely to be terrible. You could possibly replace if a & 0x80: with if a >= 0x80: (equivalent for a in 0-255), since the former produces an int which must then be truth-tested more expensively, while the latter produces a bool directly, which is the cheapest thing to truth-test. Only investigate other options if your program is too slow, and profiling says this particular bit of code is the hot spot.
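If you do want to measure rather than guess, the standard timeit module is enough for a rough comparison of the two conditions. The following is a minimal sketch; the names convert_and, convert_ge and time_variants are made up for illustration, and the timings depend entirely on your machine and CPython version, so none are quoted here.

```python
import timeit


def convert_and(a, b):
    if a & 0x80:
        return -b
    return b


def convert_ge(a, b):
    # Equivalent to the version above only when a is in 0..255.
    if a >= 0x80:
        return -b
    return b


def time_variants(number=200_000):
    """Return the seconds taken by `number` calls of each variant."""
    results = {}
    for func in (convert_and, convert_ge):
        # The lambda adds the same call overhead to both variants.
        timer = timeit.Timer(lambda: func(0x93, 1.5))
        results[func.__name__] = timer.timeit(number=number)
    return results


for name, seconds in time_variants().items():
    print(f"{name}: {seconds:.4f} s")
```

A micro-benchmark like this only compares the two functions in isolation; a profiler run over the whole program is still what tells you whether this function matters at all.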

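(-1)**(a & 0x80) computes a power, so it's very inefficient. It is also wrong as written: a & 0x80 is either 0 or 128, both even, so the result is always 1 and b is never negated. The exponent has to be the bit itself, a >> 7. You can then replace the power with 1 if exponent == 0 else -1, with exponent being a >> 7. But it's even easier to get 1 and -1 from a bit x that is 0 or 1 by doing 1 - 2*x

Some non-branching versions (assuming a is in 0-255, so a >> 7 is 0 or 1)

return (1 - ((a >> 7) << 1)) * b
return (1 - ((a >> 6) & 0x02)) * b
# copysign sets the sign of b instead of flipping it, so these two match only when b >= 0
return math.copysign(b, -(a >> 7))
return math.copysign(b, -(a & 0x80))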
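Since sign tricks like these are easy to get backwards, it is worth checking any branch-free rewrite against the branching reference for every byte value. The sketch below does that for two formulations that flip the sign; the names convert_shift and convert_copysign are hypothetical, and the copysign variant multiplies by a +/-1.0 factor (a swapped-in variation, not the one-liner above) so that it also behaves for negative b.

```python
import math


def convert(a, b):
    """Reference: negate b when the high bit (0x80) of a is set."""
    if a & 0x80:
        return -b
    return b


def convert_shift(a, b):
    # a >> 7 is 0 or 1 for a in 0..255, so the factor is +1 or -1.
    return (1 - ((a >> 7) << 1)) * b


def convert_copysign(a, b):
    # Build a +/-1.0 factor so the sign of b is flipped, not overwritten.
    return b * math.copysign(1.0, -(a >> 7))


samples = (0.0, 1.5, -2.25, 1e300)
mismatches = [
    (func.__name__, a, b)
    for func in (convert_shift, convert_copysign)
    for a in range(256)
    for b in samples
    if func(a, b) != convert(a, b)
]
print("mismatches:", mismatches)

# Plain copysign overwrites the sign, so it disagrees for negative b.
plain = math.copysign(-2.25, -(0x80 >> 7))
print(plain, convert(0x80, -2.25))
```

The exhaustive loop is cheap here because a has only 256 possible values; the sample values for b are arbitrary and can be extended.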
