How can I make branchless code?

Question

Related to this answer: https://stackoverflow.com/a/11227902/4714970

The answer above mentions that you can avoid branch prediction failures by avoiding branches.

The user demonstrates this by replacing:

if (data[c] >= 128)
{
    sum += data[c];
}

With:

int t = (data[c] - 128) >> 31;
sum += ~t & data[c];

How are these two equivalent (for the specific data set, not strictly equivalent)?

What are some general ways I can do similar things in similar situations? Would it always be by using >> and ~?

Answer 1

int t = (data[c] - 128) >> 31;

The trick here is that if data[c] >= 128, then data[c] - 128 is nonnegative; otherwise it is negative. The highest bit in an int, the sign bit, is 1 if and only if the number is negative. >> is a shift that extends the sign bit, so shifting right by 31 makes the whole result 0 if the value was nonnegative, and all 1 bits (which represents -1) if it was negative. So t is 0 if data[c] >= 128, and -1 otherwise. ~t flips these possibilities, so ~t is -1 if data[c] >= 128, and 0 otherwise.

x & (-1) is always equal to x, and x & 0 is always equal to 0. So sum += ~t & data[c] increases sum by 0 if data[c] < 128, and by data[c] otherwise.
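To make the equivalence concrete, here is a minimal sketch that runs the masked version next to the ordinary branch. The class and method names are made up for illustration, and it assumes, as in the question's data set, that data[c] - 128 does not overflow.

```java
final class SignMaskSum {
    // Branchless version of: if (v >= 128) sum += v;
    // Assumes v - 128 does not overflow (true for values in 0..255).
    static long sumAtLeast128(int[] data) {
        long sum = 0;
        for (int v : data) {
            int t = (v - 128) >> 31; // 0 if v >= 128, -1 (all ones) otherwise
            sum += ~t & v;           // adds v if v >= 128, else adds 0
        }
        return sum;
    }

    // Reference version with an ordinary branch, for comparison.
    static long sumAtLeast128Branchy(int[] data) {
        long sum = 0;
        for (int v : data) {
            if (v >= 128) sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {0, 127, 128, 200, 255};
        System.out.println(sumAtLeast128(data));        // 128 + 200 + 255 = 583
        System.out.println(sumAtLeast128Branchy(data)); // 583
    }
}
```

Both methods agree only while the subtraction stays in range, which is the "not strictly equivalent" caveat from the question.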

Many of these tricks can be applied elsewhere. This one can be used generally to produce a number that is 0 if and only if one value is greater than or equal to another, and -1 otherwise, and you can adapt it further to get <=, <, and so on. Bit twiddling like this is a common approach to making mathematical operations branch-free, though it will not always be built out of the same operations; ^ (xor) and | (or) also come into play sometimes.
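As a rough sketch of how the same idea extends to the other comparisons and to xor, the helpers below (hypothetical names, not from any library) each return -1 when the condition holds and 0 otherwise. They all assume that a - b does not overflow an int.

```java
final class CompareMasks {
    // Each mask is -1 (all ones) when the condition holds, 0 otherwise.
    // Assumption: the subtraction does not overflow int.
    static int ltMask(int a, int b) { return (a - b) >> 31; }  // a < b
    static int geMask(int a, int b) { return ~ltMask(a, b); }  // a >= b
    static int gtMask(int a, int b) { return ltMask(b, a); }   // a > b
    static int leMask(int a, int b) { return ~ltMask(b, a); }  // a <= b

    // Select with xor: yields a when a < b, otherwise b.
    static int min(int a, int b) { return b ^ ((a ^ b) & ltMask(a, b)); }

    // Absolute value with xor: m is -1 for negative x, 0 otherwise.
    static int abs(int x) {
        int m = x >> 31;
        return (x ^ m) - m;
    }

    public static void main(String[] args) {
        System.out.println(ltMask(1, 2)); // -1
        System.out.println(min(7, 3));    // 3
        System.out.println(abs(-5));      // 5
    }
}
```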

Answer 2

While Louis Wasserman's answer is correct, I want to show you a more general (and much clearer) way to write branchless code. You can just use the ? : operator:

    int t = data[c];
    sum += (t >= 128 ? t : 0);

The JIT compiler sees from the execution profile that the condition is poorly predicted here. In such cases the compiler is smart enough to replace the conditional branch with a conditional move instruction:

    mov    0x10(%r14,%rbp,4),%r9d  ; load R9d from array
    cmp    $0x80,%r9d              ; compare with 128
    cmovl  %r8d,%r9d               ; if less, move R8d (which is 0) to R9d

You can verify for yourself that this version runs equally fast on both sorted and unsorted arrays.
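One way to try this is a crude timing harness like the sketch below. The class name, array size, and repetition counts are arbitrary choices for illustration, and the numbers it prints depend on the JVM and hardware; a proper harness such as JMH would give more trustworthy measurements.

```java
import java.util.Arrays;
import java.util.Random;

final class TernarySumCheck {
    static long sumAtLeast128(int[] data) {
        long sum = 0;
        for (int c = 0; c < data.length; c++) {
            int t = data[c];
            sum += (t >= 128 ? t : 0);
        }
        return sum;
    }

    // Repeats the sum and returns elapsed nanoseconds; crude, not a real benchmark.
    static long timeNanos(int[] data, int reps) {
        long start = System.nanoTime();
        long sink = 0;
        for (int i = 0; i < reps; i++) sink += sumAtLeast128(data);
        long elapsed = System.nanoTime() - start;
        if (sink == 42) System.out.println(sink); // keeps the result from being optimized away
        return elapsed;
    }

    public static void main(String[] args) {
        int[] unsorted = new Random(0).ints(32768, 0, 256).toArray();
        int[] sorted = unsorted.clone();
        Arrays.sort(sorted);
        timeNanos(unsorted, 2000); // warm-up so the JIT has a chance to compile the loop
        timeNanos(sorted, 2000);
        System.out.println("unsorted ns: " + timeNanos(unsorted, 10000));
        System.out.println("sorted   ns: " + timeNanos(sorted, 10000));
    }
}
```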

Answer 3

Branchless code typically means evaluating all possible outcomes E_i of a conditional statement, each with a weight w_i from the set {0, 1}, such that Sum{ w_i } = 1. Most of the calculations are essentially discarded. Some optimization is possible because E_i doesn't have to be correct when the corresponding weight w_i (or mask m_i) is zero.

  result = (w_0 * E_0) + (w_1 * E_1) + ... + (w_n * E_n)    ;; or
  result = (m_0 & E_0) | (m_1 & E_1) | ... | (m_n & E_n)

where m_i stands for a bitmask.
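As an illustration of the mask form with three outcomes, here is a sketch of a branchless clamp. The function is a made-up example rather than something from the answers, and it assumes lo <= hi and that the subtractions do not overflow an int.

```java
final class MaskSelect {
    // Clamp x to [lo, hi] without branches.
    // Assumptions: lo <= hi and the subtractions do not overflow int.
    static int clamp(int x, int lo, int hi) {
        int mLow  = (x - lo) >> 31;  // -1 if x < lo, else 0
        int mHigh = (hi - x) >> 31;  // -1 if x > hi, else 0
        int mMid  = ~(mLow | mHigh); // -1 if lo <= x <= hi, else 0
        // Exactly one mask is all ones, so the OR picks exactly one candidate.
        return (mLow & lo) | (mHigh & hi) | (mMid & x);
    }

    public static void main(String[] args) {
        System.out.println(clamp(-3, 0, 10)); // 0
        System.out.println(clamp(5, 0, 10));  // 5
        System.out.println(clamp(42, 0, 10)); // 10
    }
}
```

All three candidates (lo, hi, x) are computed every time; the masks decide which one survives.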

Speed can also be gained by processing the E_i in parallel and then collapsing them horizontally.

This contradicts the semantics of if (a) b; else c; or its ternary shorthand a ? b : c, where only one expression out of [b, c] is evaluated.

Thus the ternary operation is no magic bullet for branchless code. A decent compiler produces branchless code equally well for

t = data[n];
if (t >= 128) sum+=t;

The compiler output uses a conditional move instead of a branch:

    movl    -4(%rdi,%rdx), %ecx
    leal    (%rax,%rcx), %esi
    addl    $-128, %ecx
    cmovge  %esi, %eax

Variations of branchless code include expressing the problem through other branchless non-linear functions, such as ABS, if the target machine provides them.

e.g.

 2 * min(a,b) = a + b - ABS(a - b),
 2 * max(a,b) = a + b + ABS(a - b)

or even:

 ABS(x) = sqrt(x*x)      ;; caveat -- this is "probably" not efficient
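The min/max identities above can be sketched in Java as follows. The class name is hypothetical, and the sketch widens to long so that a + b and a - b cannot overflow for int inputs. Whether Math.abs itself compiles to branch-free machine code depends on the JIT and the target, so treat this as an illustration of the identity rather than a guaranteed speedup.

```java
final class AbsMinMax {
    // 2 * min(a,b) = a + b - |a - b|
    static int min(int a, int b) {
        long s = (long) a + b;
        long d = Math.abs((long) a - b);
        return (int) ((s - d) / 2); // s - d is always even, so the division is exact
    }

    // 2 * max(a,b) = a + b + |a - b|
    static int max(int a, int b) {
        long s = (long) a + b;
        long d = Math.abs((long) a - b);
        return (int) ((s + d) / 2); // s + d is always even, so the division is exact
    }

    public static void main(String[] args) {
        System.out.println(min(3, 7));   // 3
        System.out.println(max(-4, -9)); // -4
    }
}
```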

In addition to >> and ~, it may be equally beneficial to use bool and !bool instead of (int >> 31), whose result for negative values is implementation-defined in C. Likewise, if the condition evaluates to a value in [0, 1], one can generate a working mask with negation:

-[0, 1] = [0, 0xffffffff]  in 2's complement representation
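A small sketch of the negation trick in Java follows. Java booleans cannot be used as integers, so this version takes the 0/1 value from the sign bit with an unsigned shift (>>>), which is an adaptation and not what the answer literally describes. The names are hypothetical, and it assumes a - b does not overflow an int.

```java
final class NegateToMask {
    // 1 if a < b, else 0. Assumption: a - b does not overflow int.
    static int lessThanBit(int a, int b) { return (a - b) >>> 31; }

    // Two's complement negation: 0 -> 0, 1 -> 0xffffffff (-1).
    static int maskFromBit(int bit) { return -bit; }

    // Adds v to sum only when v >= threshold.
    static int addIfAtLeast(int sum, int v, int threshold) {
        int keepBit = lessThanBit(v, threshold) ^ 1; // 1 when v >= threshold
        return sum + (maskFromBit(keepBit) & v);
    }

    public static void main(String[] args) {
        System.out.println(maskFromBit(1));            // -1
        System.out.println(addIfAtLeast(10, 200, 128)); // 210
        System.out.println(addIfAtLeast(10, 100, 128)); // 10
    }
}
```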


Answer 1
27 (accepted) 2015-08-19 23:21:15

Answer 2
16 2015-08-19 23:52:25

Answer 3
9 2015-08-20 08:16:44
