Why does this use of the conditional operator give better performance than a normal if / else?

Question

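There are three pieces of code that do the same thing, but their performance differs in an x64 release build.

I guess this is because of branch prediction. Can anyone elaborate further?

Conditional: takes 41 ms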

for (int j = 0; j < 10000; j++)
{
    ret = (j * 11 / 3 % 5) + (ret % 11 == 4 ? 2 : 1);
}

Normal: takes 51 ms

for (int j = 0; j < 10000; j++)
{
    if (ret % 11 == 4)
    {
        ret = 2 + (j * 11 / 3 % 5);
    }
    else
    {
        ret = 1 + (j * 11 / 3 % 5);
    }
}

Cached: takes 44 ms

for (int j = 0; j < 10000; j++)
{
    var tmp = j * 11 / 3 % 5;
    if (ret % 11 == 4)
    {
        ret = 2 + tmp;
    }
    else
    {
        ret = 1 + tmp;
    }
}

Answer 1

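Edit 3

If I go back to the original test with the timing error fixed, I get output similar to this.

Conditional took 67ms

Normal took 83ms

Cached took 73ms

This shows that the ternary/conditional operator can be marginally faster inside a for loop. Given the earlier finding that the if block beats the ternary/conditional operator once the logical branch is abstracted out of the loop, we can infer that the compiler is able to make additional optimizations when the conditional/ternary operator is used iteratively, at least in some cases.

It is not clear to me why these optimizations do not apply, or are not applied, to a standard if block. In my opinion the actual difference is fairly small and the point is moot.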
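Differences of a few milliseconds can be hard to tell apart from run-to-run noise. One way to check is to repeat each measurement several times after a warm-up call and compare the fastest runs. The following is only a sketch under that assumption; `RepeatedTiming` and `BestOfMs` are hypothetical names that do not appear in the original test code.

```csharp
using System;
using System.Diagnostics;

static class RepeatedTiming
{
    // Same loop as the "Conditional" variant in the question.
    public static int Conditional(int iterations)
    {
        var ret = 0;
        for (int j = 0; j < iterations; j++)
        {
            ret = (j * 11 / 3 % 5) + (ret % 11 == 4 ? 2 : 1);
        }
        return ret;
    }

    // Same loop as the "Normal" variant in the question.
    public static int Normal(int iterations)
    {
        var ret = 0;
        for (int j = 0; j < iterations; j++)
        {
            if (ret % 11 == 4)
            {
                ret = 2 + (j * 11 / 3 % 5);
            }
            else
            {
                ret = 1 + (j * 11 / 3 % 5);
            }
        }
        return ret;
    }

    // Hypothetical helper: calls work once as a warm-up (so JIT compilation
    // is not timed), then returns the fastest of `runs` timed calls in ms.
    public static double BestOfMs(int runs, Func<int> work)
    {
        work();
        double best = double.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            work();
            stopwatch.Stop();
            best = Math.Min(best, stopwatch.Elapsed.TotalMilliseconds);
        }
        return best;
    }

    static void Main()
    {
        const int iterations = 10000000;
        Console.WriteLine("Conditional best of 5: {0:F1}ms",
            BestOfMs(5, () => Conditional(iterations)));
        Console.WriteLine("Normal best of 5: {0:F1}ms",
            BestOfMs(5, () => Normal(iterations)));
    }
}
```

If the ordering of the two variants changes between runs, the difference is probably not meaningful on that machine.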

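Edit 2

There was an obvious error in my test code (the lines commented as incorrect in the listing below).

The Stopwatch was not being reset between calls. When I use Stopwatch.Restart instead of Stopwatch.Start and raise the iteration count to 1000000000, I get these results: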
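To make the timing error concrete: `Stopwatch.Start` resumes a stopped stopwatch and keeps accumulating elapsed time, whereas `Stopwatch.Restart` resets the elapsed time to zero before starting. The sketch below times two sleeps each way; the class and method names are illustrative only and are not part of the original test.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class StopwatchDemo
{
    // Times two sleeps, resuming with Start() for the second one.
    // The returned reading covers BOTH intervals.
    public static long SecondReadingWithStart(int sleepMs)
    {
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        Thread.Sleep(sleepMs);
        stopwatch.Stop();

        stopwatch.Start();      // resumes; earlier elapsed time is kept
        Thread.Sleep(sleepMs);
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    // Times two sleeps, using Restart() for the second one.
    // The returned reading covers only the second interval.
    public static long SecondReadingWithRestart(int sleepMs)
    {
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        Thread.Sleep(sleepMs);
        stopwatch.Stop();

        stopwatch.Restart();    // resets elapsed time to zero, then starts
        Thread.Sleep(sleepMs);
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    static void Main()
    {
        Console.WriteLine("Second reading with Start:   {0}ms",
            SecondReadingWithStart(100));
        Console.WriteLine("Second reading with Restart: {0}ms",
            SecondReadingWithRestart(100));
    }
}
```

With `Start`, every later measurement silently includes the time of the earlier ones, which is why the struck-through figures further down grow from one method to the next.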

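Conditional took 22404ms

Normal took 21403ms

This is more like the result I expected and would predict from the extracted CIL. So the "normal" if is in fact marginally faster than the ternary/conditional operator when isolated from the surrounding code.

Edit

After my investigation, outlined below, I would suggest that when a logical condition is used to choose between two constants or literals, the conditional/ternary operator can be significantly faster than a standard if block. ~~In my tests it was about twice as fast.~~

~~However, I can't quite work out why.~~ The CIL produced by the normal if is longer, but for both functions the average execution path seems to be six lines, including 3 loads and 1 or 2 jumps~~, any ideas?~~

Using this code,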

using System;
using System.Diagnostics;
using System.Linq;

class Program
{
    static void Main()
    {
        var stopwatch = new Stopwatch();

        var conditional = Conditional(10);
        var normal = Normal(10);
        var cached = Cached(10);

        if (new[] { conditional, normal }.Any(x => x != cached))
        {
            throw new Exception();
        }

        stopwatch.Start();
        conditional = Conditional(10000000);
        stopwatch.Stop();
        Console.WriteLine(
            "Conditional took {0}ms", 
            stopwatch.ElapsedMilliseconds);

        ////stopwatch.Start(); incorrect
        stopwatch.Restart();
        normal = Normal(10000000);
        stopwatch.Stop();
        Console.WriteLine(
            "Normal took {0}ms", 
            stopwatch.ElapsedMilliseconds);

        ////stopwatch.Start(); incorrect
        stopwatch.Restart();
        cached = Cached(10000000);
        stopwatch.Stop();
        Console.WriteLine(
            "Cached took {0}ms", 
            stopwatch.ElapsedMilliseconds);

        if (new[] { conditional, normal }.Any(x => x != cached))
        {
            throw new Exception();
        }

        Console.ReadKey();
    }

    static int Conditional(int iterations)
    {
        var ret = 0;
        for (int j = 0; j < iterations; j++)
        {
            ret = (j * 11 / 3 % 5) + (ret % 11 == 4 ? 2 : 1);
        }

        return ret;
    }

    static int Normal(int iterations)
    {
        var ret = 0;
        for (int j = 0; j < iterations; j++)
        {
            if (ret % 11 == 4)
            {
                ret = 2 + (j * 11 / 3 % 5);
            }
            else
            {
                ret = 1 + (j * 11 / 3 % 5);
            }
        }

        return ret;
    }

    static int Cached(int iterations)
    {
        var ret = 0;
        for (int j = 0; j < iterations; j++)
        {
            var tmp = j * 11 / 3 % 5;
            if (ret % 11 == 4)
            {
                ret = 2 + tmp;
            }
            else
            {
                ret = 1 + tmp;
            }
        }

        return ret;
    }
}

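~~Compiled in x64 Release mode, with optimizations, and run without a debugger attached.~~ ~~I get this output,~~

~~Conditional took 65ms~~

~~Normal took 148ms~~

~~Cached took 217ms~~

~~and no exception is thrown.~~

Using ILDASM to disassemble the code, I can confirm that the CIL of the three methods differs; the code for the Conditional method is somewhat shorter.

To really answer the "why" question, I would need to understand the compiler's code. I would probably also need to know why the compiler was written that way.

You can break this down even further, so that you are actually comparing only the logical functions and ignoring all other activity.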

static int Conditional(bool condition, int value)
{
    return value + (condition ? 2 : 1);
}

static int Normal(bool condition, int value)
{
    if (condition)
    {
        return 2 + value;
    }

    return 1 + value;
}

which you can iterate with

static int Looper(int iterations, Func<bool, int, int> operation)
{
    var ret = 0;
    for (var j = 0; j < iterations; j++)
    {
        var condition = ret % 11 == 4;
        var value = ((j * 11) / 3) % 5;
        ret = operation(condition, value);
    }

    return ret;
}
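For completeness, here is one way the isolated functions could be driven and timed through `Looper`. This is a sketch rather than the exact harness used for the figures in this answer; `TimeMs` and the class name are hypothetical. Note that each call now goes through a delegate, so call overhead may dominate the measurement.

```csharp
using System;
using System.Diagnostics;

static class IsolatedBranchTest
{
    public static int Conditional(bool condition, int value)
    {
        return value + (condition ? 2 : 1);
    }

    public static int Normal(bool condition, int value)
    {
        if (condition)
        {
            return 2 + value;
        }

        return 1 + value;
    }

    // Feeds the same sequence of conditions and values to either function.
    public static int Looper(int iterations, Func<bool, int, int> operation)
    {
        var ret = 0;
        for (var j = 0; j < iterations; j++)
        {
            var condition = ret % 11 == 4;
            var value = ((j * 11) / 3) % 5;
            ret = operation(condition, value);
        }

        return ret;
    }

    // Hypothetical helper: times one Looper run with a fresh Stopwatch.
    public static long TimeMs(int iterations, Func<bool, int, int> operation, out int result)
    {
        var stopwatch = Stopwatch.StartNew();
        result = Looper(iterations, operation);
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    static void Main()
    {
        const int iterations = 10000000;

        // Warm-up so that JIT compilation is not included in the timings.
        Looper(10, Conditional);
        Looper(10, Normal);

        int conditionalResult;
        int normalResult;
        long conditionalMs = TimeMs(iterations, Conditional, out conditionalResult);
        long normalMs = TimeMs(iterations, Normal, out normalResult);

        Console.WriteLine("Conditional took {0}ms", conditionalMs);
        Console.WriteLine("Normal took {0}ms", normalMs);

        if (conditionalResult != normalResult)
        {
            throw new InvalidOperationException("Results differ.");
        }
    }
}
```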

This test still shows a performance difference, but now the other way round. The simplified IL is below.

... Conditional ...
{
     : ldarg.1      // push second arg
     : ldarg.0      // push first arg
     : brtrue.s T   // if first arg is true jump to T
     : ldc.i4.1     // push int32(1)
     : br.s F       // jump to F
    T: ldc.i4.2     // push int32(2)
    F: add          // add either 1 or 2 to second arg
     : ret          // return result
}

... Normal ...
{
     : ldarg.0      // push first arg
     : brfalse.s F  // if first arg is false jump to F
     : ldc.i4.2     // push int32(2)
     : ldarg.1      // push second arg
     : add          // add second arg to 2
     : ret          // return result
    F: ldc.i4.1     // push int32(1)
     : ldarg.1      // push second arg
     : add          // add second arg to 1
     : ret          // return result
}

Answer 2

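"There are 3 pieces of code that do the same thing, but their performance differs"

That is not so surprising, is it? Write things a little differently and you get different timings.

"I guess this is because of branch prediction."

That could explain why the first snippet is faster. But note that ?: is still a branch.
Another thing to note is that it is just one big expression, which is ideal territory for the optimizer.

The problem is that you cannot look at code like this and conclude that a certain operator is faster or slower. The surrounding code is at least as important.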

2 solutions

Solution 1
2 2012-09-17 11:37:56

Solution 2
1 2012-09-17 11:55:42
