
Built-in type efficiency

Under the "The most efficient types" section here:

...and when defining an object to store a floating point number, use the double type, ... The double type is two to three times less efficient than the float type...

Seems like it's contradicting itself?

And I read elsewhere (can't remember where) that computations involving ints are faster than shorts on many machines because shorts are converted to ints to perform the operations. Is this true? Any links on this?

One can always argue about the quality of the contents on the site you link to. But the two quotes you refer to:

...and when defining an object to store a floating point number, use the double type, ...

and

... The double type is two to three times less efficient than the float type...

refer to two different things: the first hints that using doubles will give far fewer problems due to the increased precision, while the other talks about performance. But honestly, I wouldn't pay too much attention to that; chances are that if your code performs suboptimally, it is due to an incorrect choice of algorithm rather than a wrong choice of primitive data type.

Here is a quote about the performance comparison of single and double precision floats from one of my old teachers, Agner Fog, who has a lot of interesting reads about software optimization over at his website: http://www.agner.org. If you are really interested in micro-optimizations, go take a look at it:

In most cases, double precision calculations take no more time than single precision. When the floating point registers are used, there is simply no difference in speed between single and double precision. Long double precision takes only slightly more time. Single precision division, square root and mathematical functions are calculated faster than double precision when the XMM registers are used, while the speed of addition, subtraction, multiplication, etc. is still the same regardless of precision on most processors (when vector operations are not used).

Source: http://agner.org/optimize/optimizing_cpp.pdf

While there might be variations for different compilers and different processors, the lesson one should learn from it is that you most likely do not need to worry about optimizations at this level; look at the choice of algorithm, even the data container, not the primitive data type.

These optimizations are negligible unless you are writing software for space shuttle launches (which recently have not been doing too well). Correct code is far more important than fast code. If you require the precision, using doubles will barely affect the run time.

Things that affect execution time way more than type definitions:

  1. Complexity - The more work there is to do, the more slowly the code will run. Reduce the amount of work needed, or break it up into smaller, faster tasks.

  2. Repetition - Repetition can often be avoided and will inevitably ruin code performance. It comes in many guises, for example failing to cache the results of expensive calculations or of remote procedure calls. Every time you recompute, you waste efficiency. Repeated code also extends the executable size. (A small caching sketch follows below the list.)

  3. Bad Design - Self-explanatory. Think before you code!

  4. I/O - A program whose execution is blocked waiting for input or output (to and from the user, the disk, or a network connection) is bound to perform badly.

There are many more reasons, but these are the biggest. Personally, bad design is where I've seen most of it happen. State machines that could have been stateless, dynamic allocation where static would have been fine, etc. are the real problems.
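To make point 2 concrete, here is a minimal sketch of the caching idea, with a made-up expensive_calculation standing in for whatever costly work or remote call you would otherwise repeat:

```cpp
#include <cmath>
#include <cstdio>
#include <unordered_map>

// Stand-in for real work (a heavy numeric routine, a remote call, ...).
double expensive_calculation(double x)
{
    return std::sqrt(std::exp(x));
}

// Memoized wrapper: repeated calls with the same input reuse the stored
// result instead of recomputing it.
double cached_calculation(double x)
{
    static std::unordered_map<double, double> cache;
    auto it = cache.find(x);
    if (it != cache.end())
        return it->second;
    double result = expensive_calculation(x);
    cache.emplace(x, result);
    return result;
}

int main()
{
    std::printf("%f\n", cached_calculation(2.0)); // computed
    std::printf("%f\n", cached_calculation(2.0)); // served from the cache
}
```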

Depending on the hardware, the actual CPU (or FPU, if you like) performance of double is somewhere between half the speed and the same speed on modern CPUs [for example, add or subtract is probably the same speed; multiply or divide may be different for the larger type], when compared to float.

On top of that, there are "fewer per cache line", so when there is a large number of them, it gets slower still because memory speed is slower. Per cache line, there are half as many double values, so roughly half the performance if the application is fully memory bound. It will be much less of a factor in a CPU-bound application.
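As a quick sanity check of the cache-line argument, assuming a typical 64-byte cache line, twice as many float values fit in a line as double values:

```cpp
#include <cstddef>
#include <cstdio>

int main()
{
    // Assuming a 64-byte cache line, which is typical for current x86 CPUs.
    constexpr std::size_t cache_line = 64;
    std::printf("floats per cache line:  %zu\n", cache_line / sizeof(float));  // usually 16
    std::printf("doubles per cache line: %zu\n", cache_line / sizeof(double)); // usually 8
}
```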

Similarly, if you use SSE or similar SIMD technologies, a double will take up twice as much space, so the number of actual calculations will be half as many "per instruction", and typically the CPU will allow the same number of instructions per cycle for both float and double - except for some operations that take longer for double. Again, this leads to about half the performance.
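For example, with the SSE2 intrinsics on x86 (just a sketch, assuming an SSE2-capable target), one 128-bit register holds four floats but only two doubles, so a single packed add does half as many operations in double precision:

```cpp
#include <immintrin.h>

// One 128-bit XMM register holds four floats but only two doubles,
// so one packed add does half as many operations in double precision.
__m128  add_four_floats(__m128 a, __m128 b)   { return _mm_add_ps(a, b); } // 4 float adds
__m128d add_two_doubles(__m128d a, __m128d b) { return _mm_add_pd(a, b); } // 2 double adds
```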

So, yes, I think the page in the link is confusing and mixes up the ideal performance setup between double and float - that is, from a pure performance perspective. It is often much easier to get noticeable calculation errors when using float - which can be a pain to track down - so start with double and switch to float only if it's deemed necessary because you have identified it as a performance issue (either from experience or measurements).
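As a small illustration of the kind of error that is much easier to hit with float, here is a naive accumulation (the exact figures depend on the platform, but the float result drifts visibly):

```cpp
#include <cstdio>

int main()
{
    float  f = 0.0f;
    double d = 0.0;
    for (int i = 0; i < 10000000; ++i) {
        f += 0.1f;  // float rounding error accumulates noticeably
        d += 0.1;   // double accumulates error too, just far less of it
    }
    std::printf("float sum:  %f\n", f); // visibly off from 1000000
    std::printf("double sum: %f\n", d); // very close to 1000000
}
```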

And yes, there are several architectures where only one integer size exists - or only two sizes, such as 8-bit char and 32-bit int - and 16-bit short would be simulated by performing the 32-bit math and then dropping the top part of the value. For example, MIPS only has 32-bit operations, but can store and load 16-bit values to and from memory. It doesn't necessarily make it slower, but it certainly means that it's "not faster".
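And that is essentially what you read about short versus int: in C and C++ the integral promotions mean arithmetic on short operands is carried out in int anyway, which you can check at compile time:

```cpp
#include <type_traits>

int main()
{
    short a = 1, b = 2;
    // Both short operands are promoted to int before the addition,
    // so the type of a + b is int, not short.
    static_assert(std::is_same<decltype(a + b), int>::value,
                  "short + short is performed in int");
    return a + b;
}
```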
