C# Compiler Optimizations
I'm wondering if someone can explain what exactly the compiler might be doing for me to observe such extreme differences in performance for a simple method.
public static uint CalculateCheckSum(string str) {
char[] charArray = str.ToCharArray();
uint checkSum = 0;
foreach (char c in charArray) {
checkSum += c;
}
return checkSum % 256;
}
I'm working with a colleague on benchmarking and optimizing a message processing application. Doing 10 million iterations of this function with the same input string took about 25 seconds in Visual Studio 2012; however, when the project was built with the "Optimize Code" option turned on, the same code executed in 7 seconds for the same 10 million iterations.
I'm very interested to understand what the compiler is doing behind the scenes for us to see a greater than 3x performance increase for a seemingly innocent block of code such as this.
As requested, here is a complete Console application that demonstrates what I am seeing.
using System;
using System.Diagnostics;

class Program
{
public static uint CalculateCheckSum(string str)
{
char[] charArray = str.ToCharArray();
uint checkSum = 0;
foreach (char c in charArray)
{
checkSum += c;
}
return checkSum % 256;
}
static void Main(string[] args)
{
string stringToCount = "8=FIX.4.29=15135=D49=SFS56=TOMW34=11752=20101201-03:03:03.2321=DEMO=DG00121=155=IBM54=138=10040=160=20101201-03:03:03.23244=10.059=0100=ARCA10=246";
Stopwatch stopwatch = Stopwatch.StartNew();
for (int i = 0; i < 10000000; i++)
{
CalculateCheckSum(stringToCount);
}
stopwatch.Stop();
Console.WriteLine(stopwatch.Elapsed);
}
}
Running in Debug with optimization off I see 13 seconds; with it on I get 2 seconds.
Running in Release with optimization off I see 3.1 seconds, and with it on 2.3 seconds.
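When comparing timings like these, it can help to keep JIT compilation out of the measured region and to make sure the results of the calls are actually used. The following is only a sketch of one way to do that: the class name `ChecksumBenchmark`, the `Measure` helper, the short input string, and the iteration count are assumptions made for illustration and are not part of the original program.

```csharp
using System;
using System.Diagnostics;

public static class ChecksumBenchmark
{
    // Same method as in the question.
    public static uint CalculateCheckSum(string str)
    {
        char[] charArray = str.ToCharArray();
        uint checkSum = 0;
        foreach (char c in charArray)
        {
            checkSum += c;
        }
        return checkSum % 256;
    }

    // Hypothetical helper: times `iterations` calls after one untimed
    // warm-up call. The results are accumulated into `sink` so the calls
    // cannot be treated as dead code.
    public static TimeSpan Measure(string input, int iterations, out uint sink)
    {
        sink = CalculateCheckSum(input); // warm-up: the method is JIT-compiled here
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink += CalculateCheckSum(input);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        string input = "8=FIX.4.2"; // illustrative input, shorter than the one in the question
        uint sink;
        TimeSpan elapsed = Measure(input, 1000000, out sink);

        // A debugger attached at launch changes the code the JIT produces,
        // so it is worth recording alongside the timing.
        Console.WriteLine("Debugger attached: " + Debugger.IsAttached);
        Console.WriteLine(elapsed + " (sink=" + sink + ")");
    }
}
```

Running the same executable with and without the debugger attached, and built with and without "Optimize Code", gives four numbers that are easier to compare than timings taken inside Visual Studio alone.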
To look at what the C# compiler does for you, you need to look at the IL. If you want to see how that affects the JITted code, you'll need to look at the native code as described by Scott Chamberlain. Be aware that the JITted code will vary based on processor architecture, CLR version, how the process was launched, and possibly other things.
I would usually start with the IL, and then potentially look at the JITted code.
Comparing the IL using ildasm can be slightly tricky, as it includes a label for each instruction. Here are two versions of your method compiled with and without optimization (using the C# 5 compiler), with extraneous labels (and nop instructions) removed to make them as easy to compare as possible:
Optimized
.method public hidebysig static uint32
CalculateCheckSum(string str) cil managed
{
// Code size 46 (0x2e)
.maxstack 2
.locals init (char[] V_0,
uint32 V_1,
char V_2,
char[] V_3,
int32 V_4)
ldarg.0
callvirt instance char[] [mscorlib]System.String::ToCharArray()
stloc.0
ldc.i4.0
stloc.1
ldloc.0
stloc.3
ldc.i4.0
stloc.s V_4
br.s loopcheck
loopstart:
ldloc.3
ldloc.s V_4
ldelem.u2
stloc.2
ldloc.1
ldloc.2
add
stloc.1
ldloc.s V_4
ldc.i4.1
add
stloc.s V_4
loopcheck:
ldloc.s V_4
ldloc.3
ldlen
conv.i4
blt.s loopstart
ldloc.1
ldc.i4 0x100
rem.un
ret
} // end of method Program::CalculateCheckSum
Unoptimized
.method public hidebysig static uint32
CalculateCheckSum(string str) cil managed
{
// Code size 63 (0x3f)
.maxstack 2
.locals init (char[] V_0,
uint32 V_1,
char V_2,
uint32 V_3,
char[] V_4,
int32 V_5,
bool V_6)
ldarg.0
callvirt instance char[] [mscorlib]System.String::ToCharArray()
stloc.0
ldc.i4.0
stloc.1
ldloc.0
stloc.s V_4
ldc.i4.0
stloc.s V_5
br.s loopcheck
loopstart:
ldloc.s V_4
ldloc.s V_5
ldelem.u2
stloc.2
ldloc.1
ldloc.2
add
stloc.1
ldloc.s V_5
ldc.i4.1
add
stloc.s V_5
loopcheck:
ldloc.s V_5
ldloc.s V_4
ldlen
conv.i4
clt
stloc.s V_6
ldloc.s V_6
brtrue.s loopstart
ldloc.1
ldc.i4 0x100
rem.un
stloc.3
br.s methodend
methodend:
ldloc.3
ret
}
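Read back into C#, the two listings above differ mainly in how the loop test and the return value are handled. The sketch below is a hand-written approximation of each listing, not decompiler output; the class name `LoweredChecksum` and the local names are invented for illustration, with comments mapping them to the V_n slots in the IL.

```csharp
using System;

public static class LoweredChecksum
{
    // Approximates the optimized listing: the loop test branches directly
    // (blt.s) and the result is returned straight from the stack.
    public static uint Optimized(string str)
    {
        char[] charArray = str.ToCharArray(); // V_0
        uint checkSum = 0;                    // V_1
        char c;                               // V_2
        char[] array = charArray;             // V_3
        int index = 0;                        // V_4
        goto loopcheck;
    loopstart:
        c = array[index];
        checkSum += c;
        index = index + 1;
    loopcheck:
        if (index < array.Length) goto loopstart;
        return checkSum % 256;
    }

    // Approximates the unoptimized listing: the comparison result is stored
    // in a bool local (clt, stloc, ldloc, brtrue.s) and the return value
    // goes through an extra uint local before ret.
    public static uint Unoptimized(string str)
    {
        char[] charArray = str.ToCharArray(); // V_0
        uint checkSum = 0;                    // V_1
        char c;                               // V_2
        uint result;                          // V_3
        char[] array = charArray;             // V_4
        int index = 0;                        // V_5
        bool more;                            // V_6
        goto loopcheck;
    loopstart:
        c = array[index];
        checkSum += c;
        index = index + 1;
    loopcheck:
        more = index < array.Length;
        if (more) goto loopstart;
        result = checkSum % 256;
        return result;
    }

    public static void Main()
    {
        string input = "8=FIX.4.2"; // illustrative input
        Console.WriteLine(Optimized(input) == Unoptimized(input));
    }
}
```

Both methods compute the same value; the only difference is the extra stores and loads, which is the kind of work the JIT may or may not remove depending on whether its own optimizations are enabled.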
Points to note:
The optimized version uses blt.s rather than clt followed by brtrue.s when checking whether or not to go round the loop again (this is the reason for one of the extra locals).
To get a good understanding, you should look at the IL code generated.
Compile the assembly, then make a copy of it and compile again with optimizations enabled. Then open both assemblies in .NET Reflector and compare the compiled IL.
Update: .NET Reflector is available at http://www.red-gate.com/products/dotnet-development/reflector/
Update 2: ILSpy seems like a good open source alternative. http://ilspy.net/
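For a quick first comparison before opening a decompiler, the size of a method's IL and the number of locals can also be read through reflection. This is a minimal sketch under the assumption that the method lives in the same assembly as the program; `IlStats` and `Describe` are names made up for this example. The exact numbers it prints depend on the compiler version and build settings.

```csharp
using System;
using System.Reflection;

public static class IlStats
{
    // Same method as in the question.
    public static uint CalculateCheckSum(string str)
    {
        char[] charArray = str.ToCharArray();
        uint checkSum = 0;
        foreach (char c in charArray)
        {
            checkSum += c;
        }
        return checkSum % 256;
    }

    // Hypothetical helper: summarizes the IL body of a method.
    // GetMethodBody returns null for methods without IL (for example
    // abstract ones); that case is not handled in this sketch.
    public static string Describe(MethodInfo method)
    {
        MethodBody body = method.GetMethodBody();
        byte[] il = body.GetILAsByteArray();
        return method.Name + ": IL size " + il.Length + " bytes, "
            + body.LocalVariables.Count + " locals, max stack " + body.MaxStackSize;
    }

    public static void Main()
    {
        MethodInfo method = typeof(IlStats).GetMethod("CalculateCheckSum");
        Console.WriteLine(Describe(method));
    }
}
```

Building this once with and once without optimization and comparing the two printed lines shows whether the compiler emitted different IL at all; the instruction-level differences still need ildasm, .NET Reflector, or ILSpy.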
I don't know what optimizations it is doing, but I can show you how you can find out for yourself.
First, build your code optimized and start it without the debugger attached (the JIT compiler will generate different code if the debugger is attached). Run your code until you know the section in question has been entered at least once, so the JIT compiler has had a chance to process it, and in Visual Studio go to Debug->Attach To Process... From the new menu choose your running application.
Put a breakpoint in the spot you are wondering about and let the program stop; once stopped, go to Debug->Windows->Disassembly. That will show you the compiled code the JIT created, and you will be able to inspect what it is doing.
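Before attaching, it can be useful to confirm from inside the process that the build really asks for JIT optimization and that no debugger was attached at launch. The sketch below reads the assembly's `DebuggableAttribute` for that purpose; the class name `JitSettings` and the helper `IsJitOptimizerDisabled` are assumptions for this example, and treating a missing attribute as "optimizable" is a simplification.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class JitSettings
{
    // Hypothetical helper: true when the assembly's DebuggableAttribute
    // asks the JIT not to optimize. An assembly without the attribute
    // is treated as optimizable here.
    public static bool IsJitOptimizerDisabled(Assembly assembly)
    {
        DebuggableAttribute attribute = (DebuggableAttribute)
            Attribute.GetCustomAttribute(assembly, typeof(DebuggableAttribute));
        return attribute != null && attribute.IsJITOptimizerDisabled;
    }

    public static void Main()
    {
        Assembly assembly = typeof(JitSettings).Assembly;
        Console.WriteLine("JIT optimizer disabled: " + IsJitOptimizerDisabled(assembly));
        Console.WriteLine("Debugger attached: " + Debugger.IsAttached);
    }
}
```

If the first line reports that the optimizer is disabled, or the second reports an attached debugger at startup, the disassembly is unlikely to reflect the optimized code path.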