Is Interlocked.Exchange<T> slower than Interlocked.CompareExchange<T>?

Question

I came across some odd performance results when optimizing a program, which are shown in the following BenchmarkDotNet benchmark:

string _s, _y = "yo";

[Benchmark]
public void Exchange() => Interlocked.Exchange(ref _s, null);

[Benchmark]
public void CompareExchange() => Interlocked.CompareExchange(ref _s, _y, null);

The results are as follows:

BenchmarkDotNet=v0.10.10, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.192)
Processor=Intel Core i7-6700HQ CPU 2.60GHz (Skylake), ProcessorCount=8
Frequency=2531248 Hz, Resolution=395.0620 ns, Timer=TSC
.NET Core SDK=2.1.4
  [Host]     : .NET Core 2.0.5 (Framework 4.6.26020.03), 64bit RyuJIT
  DefaultJob : .NET Core 2.0.5 (Framework 4.6.26020.03), 64bit RyuJIT

          Method |      Mean |     Error |    StdDev |
---------------- |----------:|----------:|----------:|
        Exchange | 20.525 ns | 0.4357 ns | 0.4662 ns |
 CompareExchange |  7.017 ns | 0.1070 ns | 0.1001 ns |

It would seem that Interlocked.Exchange is more than twice as slow as Interlocked.CompareExchange (roughly 20.5 ns versus 7.0 ns), which is confusing because it is supposed to be doing less work. Unless I'm mistaken, both are supposed to be CPU ops.

Does anyone have a good explanation for why this could be happening? Is this an actual performance difference in the CPU ops, or some issue in the way .NET Core wraps them?

If this is the case, would it be best to simply avoid Interlocked.Exchange() and use Interlocked.CompareExchange() whenever possible?
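For context on what that substitution would involve: Exchange stores unconditionally, while CompareExchange stores only when the current value matches the comparand, so emulating the former with the latter needs a retry loop. A minimal sketch, where ExchangeViaCas and Swap are hypothetical names that are not part of .NET:

```csharp
using System.Threading;

public static class ExchangeViaCas
{
    // Hypothetical helper: an unconditional swap emulated with a
    // compare-and-swap retry loop. Returns the value that was replaced.
    public static T Swap<T>(ref T location, T value) where T : class
    {
        T seen = Volatile.Read(ref location);
        while (true)
        {
            // Store 'value' only if the location still holds 'seen'.
            T prior = Interlocked.CompareExchange(ref location, value, seen);
            if (ReferenceEquals(prior, seen))
            {
                return prior; // the swap took effect
            }
            seen = prior; // another thread changed it; retry with the new value
        }
    }
}
```

The loop makes it clear that the two operations are not drop-in replacements for each other, independent of how fast either one is.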

EDIT: Another odd thing: when I run the same benchmarks with int or long rather than string, I get more or less the same running time for both benchmarks. Also, I used BenchmarkDotNet's disassembler diagnoser to look at the actual assembly being generated, and found something interesting: with the int/long version I can clearly see xchg and cmpxchg instructions, but with strings I see calls into the Interlocked.Exchange/Interlocked.CompareExchange methods...!
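The int variant of the two benchmark bodies might look like the sketch below. IntDemo and its member names are illustrative only. For int, the calls bind to the dedicated non-generic overloads, which fits with the xchg and cmpxchg instructions seen in the disassembly.

```csharp
using System.Threading;

public static class IntDemo
{
    private static int _i;

    // Binds to Interlocked.Exchange(ref int, int).
    public static int ExchangeInt() => Interlocked.Exchange(ref _i, 0);

    // Binds to Interlocked.CompareExchange(ref int, int, int):
    // stores 1 only if _i is currently 0, and returns the previous value.
    public static int CompareExchangeInt() => Interlocked.CompareExchange(ref _i, 1, 0);
}
```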

EDIT2: Opened an issue in coreclr: https://github.com/dotnet/coreclr/issues/16051

Answer 1

Following up on my comments, this seems to be an issue with the generic overload of Exchange.

If you avoid the generic overload altogether (by changing the type of _s and _y to object), the performance difference disappears.
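A minimal sketch of that change, with ObjectDemo and its member names chosen purely for illustration. Because the fields are typed as object, overload resolution picks the non-generic Exchange(ref object, object) and CompareExchange(ref object, object, object) instead of the generic versions:

```csharp
using System.Threading;

public static class ObjectDemo
{
    // Typed as object rather than string, so a "ref _s" argument matches
    // the non-generic object overloads exactly and those are preferred.
    private static object _s;
    private static object _y = "yo";

    public static object ExchangeObject() => Interlocked.Exchange(ref _s, null);

    public static object CompareExchangeObject() => Interlocked.CompareExchange(ref _s, _y, null);
}
```

With string fields, a ref string cannot be passed where a ref object is expected, so the compiler has to fall back to the generic overloads with T inferred as string.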

The question remains, though, as to why resolving to the generic overloads only slows down Exchange. Reading through the Interlocked source code, it seems that a hack was implemented in CompareExchange<T> to make it faster. The source code comments on CompareExchange<T> follow:

 * CompareExchange<T>
 * 
 * Notice how CompareExchange<T>() uses the __makeref keyword
 * to create two TypedReferences before calling _CompareExchange().
 * This is horribly slow. Ideally we would like CompareExchange<T>()
 * to simply call CompareExchange(ref Object, Object, Object); 
 * however, this would require casting a "ref T" into a "ref Object", 
 * which is not legal in C#.
 * 
 * Thus we opted to cheat, and hacked to JIT so that when it reads
 * the method body for CompareExchange<T>() it gets back the
 * following IL:
 *
 *     ldarg.0 
 *     ldarg.1
 *     ldarg.2
 *     call System.Threading.Interlocked::CompareExchange(ref Object, Object, Object)
 *     ret
 *
 * See getILIntrinsicImplementationForInterlocked() in VM\JitInterface.cpp
 * for details.

Nothing similar is commented on Exchange<T>, and it also makes use of the "horribly slow" __makeref, so this could be the reason you are seeing this unexpected behavior.
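To illustrate what the comment means by wanting to treat a "ref T" as a "ref Object", here is a hedged sketch using System.Runtime.CompilerServices.Unsafe, assuming a runtime where that type is available. RefCastSketch and ExchangeViaObject are hypothetical names, and this is not presented as the runtime's actual implementation:

```csharp
using System.Runtime.CompilerServices;
using System.Threading;

public static class RefCastSketch
{
    // Hypothetical helper: reinterprets "ref T" as "ref object" so that the
    // non-generic Interlocked.Exchange(ref object, object) can be called.
    // This is only safe because T is constrained to reference types.
    public static T ExchangeViaObject<T>(ref T location, T value) where T : class
    {
        ref object asObject = ref Unsafe.As<T, object>(ref location);
        object previous = Interlocked.Exchange(ref asObject, value);
        return Unsafe.As<T>(previous); // unchecked cast back to T
    }
}
```

The reinterpretation avoids building TypedReferences with __makeref, which is the cost the quoted comment describes.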

All this is, of course, my interpretation; you would need someone from the .NET team to confirm my suspicions.

Answer 2

This has now been fixed in newer versions of .NET Core:

https://github.com/dotnet/coreclr/issues/16051

Answer 1: 7, 2018-01-26 23:08:15
Answer 2: 0, 2020-01-19 07:29:40