
Ensure the compiler always uses the SSE sqrt instruction

I'm trying to get GCC (or clang) to consistently use the SSE instruction for sqrt instead of the math library function in a computationally intensive scientific application. I've tried a variety of GCC versions on various 32-bit and 64-bit OS X and Linux systems. I'm making sure to enable SSE with -mfpmath=sse (and -march=core2, to satisfy GCC's requirement for using -mfpmath=sse on 32-bit). I'm also using -O3. Depending on the GCC or clang version, the generated assembly doesn't consistently use SSE's sqrtss. In some versions of GCC, every sqrt call becomes the instruction. In others, there is a mix of sqrtss and calls to the math library function. Is there a way to give a hint or force the compiler to use only the SSE instruction?

Use the sqrtss intrinsic, __builtin_ia32_sqrtss.
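As a minimal sketch of what calling the instruction directly might look like, the helper below goes through `_mm_sqrt_ss` from `<xmmintrin.h>`, the portable intrinsic that GCC and clang both provide, rather than the raw `__builtin_ia32_sqrtss` builtin named above. The name `sse_sqrtf` is hypothetical, and the sketch assumes an x86 target with SSE enabled (the default on x86-64; on 32-bit something like -msse or the -march=core2 from the question is needed).

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_set_ss, _mm_sqrt_ss, _mm_cvtss_f32 */

/* Hypothetical helper (not from the thread): single-precision square root
   computed with the SSE sqrtss instruction instead of a call to sqrtf(). */
static inline float sse_sqrtf(float x)
{
    __m128 v = _mm_set_ss(x);   /* x in the low lane, upper lanes zeroed */
    v = _mm_sqrt_ss(v);         /* sqrtss on the low lane only */
    return _mm_cvtss_f32(v);    /* extract the low lane as a float */
}
```

Replacing the sqrtf calls in the hot loops with a helper like this takes the choice away from the compiler. Note that, unlike the library function, it never sets errno.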

You should be careful when using that; as you probably know, it has less precision. That would be the reason GCC doesn't use it systematically.

There is a trick that is even mentioned in Intel's SSE manual (I hope I remember correctly): the result of sqrtss is only one Heron iteration away from the target. Maybe some versions of GCC are able to inline that short surrounding iteration, and others are not.
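For readers unfamiliar with the term, here is a rough sketch of a single Heron (Newton) step applied to the output of sqrtss. The function names are hypothetical, and the premise that sqrtss needs such a refinement is taken from this answer as an assumption; the optimization guides I am aware of describe this kind of step for the approximate reciprocal instructions such as rsqrtss, so check the manual before relying on it.

```c
#include <float.h>      /* FLT_MAX */
#include <xmmintrin.h>  /* _mm_set_ss, _mm_sqrt_ss, _mm_cvtss_f32 */

/* One Heron step for sqrt(a): average the estimate y with a / y. */
static inline double heron_step(double a, double y)
{
    return 0.5 * (y + a / y);
}

/* Hypothetical wrapper: take the sqrtss result as the starting estimate,
   then refine it once in double precision. */
static inline double refined_sqrt(float x)
{
    float y0 = _mm_cvtss_f32(_mm_sqrt_ss(_mm_set_ss(x)));

    /* Zero, NaN, and infinity cannot be refined (the step would divide
       by zero or compute inf / inf), so return the estimate as is. */
    if (!(y0 > 0.0f) || y0 > FLT_MAX)
        return y0;

    return heron_step((double)x, (double)y0);
}
```

When the estimate is already exact, as for a perfect square, the step leaves it unchanged: heron_step(4.0, 2.0) is 0.5 * (2.0 + 2.0) = 2.0.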

You could use the builtin as MSN says, but you should definitely look up the specs on Intel's web site to know what you are trading off.

