
Experiencing strange rdtsc behavior when comparing physical hardware and KVM-based VMs

I have the following problem. I run several stress tests on a Linux machine:

$ uname -a
Linux debian 3.14-2-686-pae #1 SMP Debian 3.14.15-2 (2014-08-09) i686 GNU/Linux

The machine has an Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz, 8 GB of RAM, and a 300 GB HDD.

These tests are not I/O intensive; I mostly perform double-precision arithmetic, timed as follows:

start = rdtsc();
do_arithmetic();
stop = rdtsc();
diff = stop - start;
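
The rdtsc() helper itself is not shown in the post; note that out-of-order execution can move the timestamp read relative to the measured code, which by itself adds some spread. Below is a minimal sketch of a serialized helper, assuming GCC inline assembly on x86 (the lfence fences are my addition, not part of the original code):

#include <stdint.h>

/* Sketch of a serialized rdtsc() helper (not the poster's actual code).
 * The lfence instructions keep the timestamp read from being reordered
 * around the code being measured. */
static inline uint64_t rdtsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("lfence; rdtsc; lfence"
                         : "=a"(lo), "=d"(hi)
                         :
                         : "memory");
    return ((uint64_t)hi << 32) | lo;
}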

I repeat these tests many times, running my benchmarking application on the physical machine or on a KVM-based VM started with:

qemu-system-i386 disk.img -m 2000 -device virtio-net-pci,netdev=net1,mac=52:54:00:12:34:03 -netdev type=tap,id=net1,ifname=tap0,script=no,downscript=no -cpu host,+vmx -enable-kvm -nographic

I collect statistics (i.e., the diffs) over many trials. On the physical machine (unloaded), the distribution of processing delays looks like a very narrow lognormal.

When I repeat the experiment on the virtual machine (neither the physical nor the virtual machine is loaded), the lognormal distribution is still there (a bit wider), but I also get a few points with completion times much shorter (about two times) than the absolute minimum ever observed on the physical machine! (Note that the completion-time distribution on the physical machine is very narrow and lies close to that minimum.) There are also some points with completion times much longer than the average completion time on the hardware machine.

I guess that my rdtsc benchmarking method is not very accurate in the VM environment. Can you suggest a way to improve my benchmarking setup so that it provides reliable (comparable) statistics between the physical and the KVM-based virtual environment? At least something that won't show the VM as 2x faster than a physical PC in a small number of cases.
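
One thing I am considering (this is my own idea, not something from the original post) is to time each sample with both the TSC and CLOCK_MONOTONIC_RAW, pin the process to one core, and flag samples where the two disagree (frequency change, preemption, VM exit, ...). A self-contained sketch, with a placeholder workload standing in for do_arithmetic():

#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <x86intrin.h>   /* __rdtsc() */

/* Placeholder workload standing in for do_arithmetic() from the question. */
static void do_arithmetic(void)
{
    volatile double x = 1.0;
    for (int i = 0; i < 1000; i++)
        x = x * 1.000001 + 0.5;
}

static uint64_t monotonic_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

int main(void)
{
    /* Pin to one core so every TSC read comes from the same CPU. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    sched_setaffinity(0, sizeof(set), &set);

    for (int i = 0; i < 10000; i++) {
        uint64_t t0 = monotonic_ns();
        uint64_t c0 = __rdtsc();
        do_arithmetic();
        uint64_t c1 = __rdtsc();
        uint64_t t1 = monotonic_ns();

        /* Samples where cycle count and wall time disagree wildly are
         * suspect and can be discarded or examined separately. */
        printf("%d cycles=%llu wall_ns=%llu\n", i,
               (unsigned long long)(c1 - c0),
               (unsigned long long)(t1 - t0));
    }
    return 0;
}

Compile with something like gcc -O2 bench.c -o bench (older glibc versions may need -lrt for clock_gettime).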

Thanks in advance for any suggestions or comments on this subject.

Best regards

Maybe you could try clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts); see man clock_gettime for more information.
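
A minimal sketch of that suggestion (per-thread CPU time, so time spent descheduled is not counted; the workload call is the one from the question):

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec start, stop;

    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &start);
    /* do_arithmetic();  -- workload from the question */
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &stop);

    long long diff_ns = (stop.tv_sec - start.tv_sec) * 1000000000ll
                      + (stop.tv_nsec - start.tv_nsec);
    printf("thread CPU time: %lld ns\n", diff_ns);
    return 0;
}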

It seems it's not a problem with rdtsc at all. I am using my Intel i5 with the frequency capped through the acpi_cpufreq driver and the userspace governor. Even though the CPU speed is fixed at, say, 2.4 GHz (out of 3.3 GHz), some calculations are still performed at the maximum speed of 3.3 GHz. Roughly speaking, I also encountered a very small number of such cases on the physical machine, about 1 in 10,000. On KVM, this behavior occurs much more often, on the order of a few percent. I will investigate this further.
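
To check whether an outlier coincides with the CPU briefly running at a different speed, one could read the frequency reported by the cpufreq driver around each measurement. A rough sketch, assuming the usual sysfs path exposed by acpi_cpufreq (adjust the CPU index to the core the benchmark is pinned to):

#include <stdio.h>

/* Read the current frequency (in kHz) that cpufreq reports for one CPU.
 * The path below is the standard sysfs location; returns -1 on error. */
static long read_cur_khz(int cpu)
{
    char path[128];
    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_cur_freq", cpu);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    long khz = -1;
    if (fscanf(f, "%ld", &khz) != 1)
        khz = -1;
    fclose(f);
    return khz;
}

int main(void)
{
    printf("cpu0 current frequency: %ld kHz\n", read_cur_khz(0));
    return 0;
}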
