
Why would our software run so much slower under virtualization?

I'm trying to figure out why our software runs so much slower when run under virtualization. Most of the stats I've seen say it should be only a 10% performance penalty in the worst case, but on a Windows virtual server the performance penalty can be 100-400%. I've been trying to profile the differences, but the profile results don't make a lot of sense to me. Here's what I see when I profile on my Vista 32-bit box with no virtualization:

(profiler screenshot)

And here's one run on a Windows 2008 64-bit server with virtualization:

(profiler screenshot)

The slow one is spending a very large amount of its time in RtlInitializeExceptionChain, which shows as 0.0s on the fast one. Any idea what that does? Also, when I attach to the process on my machine, there is only a single thread, PulseEvent; however, when I connect on the server, there are two threads, GetDurationFormatEx and RtlInitializeExceptionChain. As far as I know, the code as we've written it uses only a single thread. Also, for what it's worth, this is a console-only application written in pure C with no UI at all.
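For reference, here is a minimal sketch of the check I could run to confirm the thread count from inside the process itself, rather than trusting what the debugger shows when attaching (this uses the Toolhelp snapshot API; it's my own verification idea, not part of our product code):

    /* Count the threads belonging to the current process via a
       Toolhelp snapshot; a sanity check, not production code. */
    #include <windows.h>
    #include <tlhelp32.h>
    #include <stdio.h>

    int main(void)
    {
        DWORD pid = GetCurrentProcessId();
        HANDLE snap = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);
        THREADENTRY32 te;
        int count = 0;

        if (snap == INVALID_HANDLE_VALUE)
            return 1;

        te.dwSize = sizeof(te);
        if (Thread32First(snap, &te)) {
            do {
                if (te.th32OwnerProcessID == pid)
                    count++;
            } while (Thread32Next(snap, &te));
        }
        CloseHandle(snap);
        printf("threads in this process: %d\n", count);
        return 0;
    }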

Can anybody shed any light on any of this for me? Even just information on what some of these ntdll and kernel32 calls are doing? I'm also unsure how much of the difference is 64/32-bit related and how much is virtual/not-virtual related. Unfortunately, I don't have easy access to other configurations to determine the difference.

I suppose we could divide reasons for slower performance on a virtual machine into two classes:

1. Configuration Skew

This category is for all the things that have nothing to do with virtualization per se but where the configured virtual machine is not as good as the real one. A really easy thing to do is to give the virtual machine just one CPU core and then compare it to an application running on a 2-CPU 8-core 16-hyperthread Intel Core i7 monster. In your case, at a minimum you did not run the same OS. Most likely there is other skew as well.
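As a quick way to make that kind of skew visible, here is a minimal sketch (my own illustration, not something from the original question) that prints the logical CPU count and physical memory on each box before comparing any profiles:

    /* Print logical CPU count and physical RAM so configuration skew
       between the VM and the physical box is obvious up front.
       Illustrative sketch only. */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        SYSTEM_INFO si;
        MEMORYSTATUSEX ms;

        GetSystemInfo(&si);
        ms.dwLength = sizeof(ms);
        GlobalMemoryStatusEx(&ms);

        printf("logical processors: %lu\n", si.dwNumberOfProcessors);
        printf("physical memory:    %llu MB\n",
               (unsigned long long)(ms.ullTotalPhys / (1024 * 1024)));
        return 0;
    }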

2. Bad Virtualization Fit

Things like databases that do a lot of locking do not virtualize well, so the typical overhead may not apply to the test case. It's not your exact case, but I've been told the penalty is 30-40% for MySQL. I notice an entry point called ...semaphore in your list. That's a sign of something that will virtualize slowly.
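To put a number on that on your own hardware, a hedged sketch (my illustration; the iteration count is arbitrary): time a tight acquire/release loop on a Win32 semaphore on both the physical box and the VM, and compare the results.

    /* Time ITERATIONS acquire/release pairs on a kernel semaphore.
       A much larger elapsed time inside the VM suggests the workload
       falls into the "virtualizes slowly" class. Illustrative sketch. */
    #include <windows.h>
    #include <stdio.h>

    #define ITERATIONS 1000000

    int main(void)
    {
        HANDLE sem = CreateSemaphore(NULL, 1, 1, NULL);
        DWORD start, elapsed;
        int i;

        if (sem == NULL)
            return 1;

        start = GetTickCount();
        for (i = 0; i < ITERATIONS; i++) {
            WaitForSingleObject(sem, INFINITE);  /* acquire */
            ReleaseSemaphore(sem, 1, NULL);      /* release */
        }
        elapsed = GetTickCount() - start;

        printf("%d acquire/release pairs: %lu ms\n", ITERATIONS, elapsed);
        CloseHandle(sem);
        return 0;
    }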

The basic problem is that constructs that can't be executed natively in user mode will require traps (slow, all by themselves) and then further overhead in hypervisor emulation code.
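You can get a rough feel for that per-trap cost yourself. CPUID is one instruction that causes an unconditional exit to the hypervisor under hardware virtualization, so timing it with RDTSC shows the gap between bare metal and a VM (a sketch using the MSVC intrinsics; my own illustration, not from the profiles above):

    /* Average the cycle cost of CPUID, which traps to the hypervisor
       under hardware virtualization. Compare bare metal vs. the VM.
       Illustrative sketch using MSVC intrinsics. */
    #include <intrin.h>
    #include <stdio.h>

    #define ROUNDS 100000

    int main(void)
    {
        int regs[4];
        unsigned __int64 start, end;
        int i;

        __cpuid(regs, 0);             /* warm-up */
        start = __rdtsc();
        for (i = 0; i < ROUNDS; i++)
            __cpuid(regs, 0);         /* trapping, serializing instruction */
        end = __rdtsc();

        printf("average cycles per CPUID: %llu\n",
               (unsigned long long)((end - start) / ROUNDS));
        return 0;
    }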

I'm assuming that you're providing enough resources for your virtual machines. The benefit of virtualization is consolidating five machines that each run at only 10-15% CPU/memory onto a single machine that runs at 50-75% CPU/memory, which still leaves you 25-50% headroom for those "bursty" times.

Personal anecdote: 20 machines were virtualized, but each was using as much CPU as it could. This caused problems when a single machine was trying to use more power than a single core could provide. Therefore the hypervisor was virtualizing a single core over multiple cores, killing performance. Once we throttled the CPU usage of each VM to the maximum available from any single core, performance skyrocketed.
