log10() performance on Visual Studio 2015 a lot slower than Visual Studio 2013 for x86

We have ported a VS2013 C++/MFC application to VS2015 and are having some rather disturbing issues with the performance and code generated by the VS2015 compiler.

Note this is for x86.

log10() calls are orders of magnitude slower. When profiling a Release build using CPU sampling, we see that these calls take up far more time than they did before, going from e.g. 49 samples in VS2013 to 7545 samples for the same run in VS2015. This means the function goes from 0.6% of the CPU load to 50% for the application in question.

In VS2013 profiler shows:

Function Name       Inclusive Samples   Exclusive Samples   Inclusive Samples %   Exclusive Samples %
__libm_sse2_log10   49                  49                  0.61                  0.61

In VS2015 profiler shows:

Function Name       Inclusive Samples   Exclusive Samples   Inclusive Samples %   Exclusive Samples %
___sse2_log102      7,545               7,545               50.43                 50.43

Why a different function name?

We have looked briefly at the generated assembly for log10. On VS2013 this forwards to disp_pentium4.inc and log10_pentium4.asm. On VS2015 this is different. In Debug builds, VS2015 seems to go back to __libm_sse2_log10.

Could the __sse2_log102 be the cause of this performance difference alone? We have checked that results output from functions calling these are within expected floating point differences.
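One way to check whether log10 alone accounts for the difference is to time it in isolation, building the same file with both toolsets and the same options. The following is only a minimal sketch: the names `Log10Timing` and `time_log10` are hypothetical, and the input values are arbitrary positive numbers chosen for illustration, not data from the application.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type: the checksum is returned so the compiler
// cannot discard the log10 calls as dead code.
struct Log10Timing {
    double checksum;  // sum of all log10 results
    double millis;    // elapsed wall-clock time in milliseconds
};

// Hypothetical helper: calls std::log10 on `count` values, `repeats` times,
// and measures only the loop that performs the calls.
Log10Timing time_log10(std::size_t count, int repeats) {
    std::vector<double> input(count);
    for (std::size_t i = 0; i < count; ++i) {
        // Arbitrary positive inputs starting at 1.0 (assumption for illustration).
        input[i] = 1.0 + static_cast<double>(i) * 0.001;
    }

    double checksum = 0.0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        for (double v : input) {
            checksum += std::log10(v);
        }
    }
    const auto stop = std::chrono::steady_clock::now();

    const double millis =
        std::chrono::duration<double, std::milli>(stop - start).count();
    return {checksum, millis};
}
```

Calling, say, `time_log10(1000000, 10)` in a Release build under each toolset and comparing the `millis` values should show whether the slowdown reproduces outside the application.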

We are compiling with target v140_xp and have the following compile options:

/Yu"stdafx.h" /MP /GS- /GL /analyze- /W4 /wd"4510" /wd"4610" /Zc:wchar_t /Z7 /Gm- /Ox /Ob2 /Zc:inline /fp:fast /D "WINVER=0x0501" /D "WIN32" /D "_WINDOWS" /D "NDEBUG" /D "_CRT_SECURE_NO_WARNINGS" /D "_CRT_SECURE_NO_DEPRECATE" /D "_SCL_SECURE_NO_WARNINGS" /D "_USING_V110_SDK71_" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /GR /arch:SSE2 /Gd /Oy /Oi /MT 

Also shown here when viewing properties:

[Image: Optimization properties]

[Image: Code Generation properties]

All project settings are the same for both VS2013 and VS2015. Note we are using SSE2 and have floating point model set to fast.

Has anyone encountered the same issue and knows how to fix it?

Here is my comment as an answer.

It appears that VS2015 has changed the implementation of log10 in Release builds: it calls the new __sse2_log102 function instead of the old __libm_sse2_log10, and this new implementation is the cause of the huge performance difference.

The fix for us in this case was to call an implementation available in Intel's Integrated Performance Primitives (IPP) library. For example, instead of calling:

return log10(v);

Call this instead:

double result;
ippsLog10_64f_A53(&v, &result, 1);
return result;

This made the performance issue disappear; in fact, it was slightly faster, even using an old IPP 7.0 release. Not everyone can use and pay for IPP, though, so we hope Microsoft fixes this.
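For those without IPP, one untested possibility is to compute log10 from the natural logarithm, using the identity log10(x) = ln(x) / ln(10). This is only a sketch: the name `log10_via_log` is hypothetical, and nothing above establishes that `std::log` avoids the same slowdown in VS2015, so it would need to be profiled before being relied on. Results can also differ from `log10` in the last bits.

```cpp
#include <cmath>

// Hypothetical replacement for log10 (not part of the original fix):
// uses log10(x) = ln(x) * (1 / ln(10)).
inline double log10_via_log(double v) {
    // 1 / ln(10), which is the same value as log10(e).
    static const double kInvLn10 = 0.43429448190325182765;
    return std::log(v) * kInvLn10;
}
```

The call site would then use `return log10_via_log(v);` in place of `return log10(v);`.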

Below is the version of VS2015 that has shown this issue.

[Image: screenshot of the VS2015 version information, not preserved]
