
Why don't the authors of the C99 standard specify a standard for the size of floating point types?

Original: 2010-08-09 20:53:07 9 4 c/ floating-point/ sizeof/ c99/ standardization

I noticed that on Windows and Linux x86, float is a 4-byte type and double is 8 bytes, but long double is 12 and 16 bytes on x86 and x86_64 respectively. C99 is supposed to be breaking such barriers with the specific integral sizes.

The initial technological limitation appears to be that the x86 processor cannot handle floating-point operations wider than 80 bits (plus 2 bytes of padding to round the size up), but why is the standard inconsistent here compared to the int types? Why don't they at least standardize on 80 bits?

4 answers

The C language doesn't specify the implementation of its various types, so that it can be implemented efficiently on as wide a variety of hardware as possible.

This extends to the integer types too: the C standard integer types have minimum ranges (e.g. signed char is -127 to 127, short and int are both -32,767 to 32,767, long is -2,147,483,647 to 2,147,483,647, and long long is -9,223,372,036,854,775,807 to 9,223,372,036,854,775,807). For almost all purposes, this is all that the programmer needs to know.

C99 does provide "fixed-width" integer types, like int32_t, but these are optional: if the implementation can't provide such a type efficiently, it doesn't have to provide it.

For floating-point types, there are equivalent limits (e.g. double must have at least 10 decimal digits' worth of precision).

They were trying to (mostly) accommodate pre-existing C implementations, some of which don't even use IEEE floating-point formats.

ints can be used to represent abstract things like ids, colors, error codes, requests, etc. In this case ints are not really used as integer numbers but as sets of bits (= a container). Most of the time a programmer knows exactly how many bits he needs, so he wants to be able to use just as many bits as needed.

floats, on the other hand, are designed for a very specific usage (floating-point arithmetic). You are very unlikely to be able to determine precisely how many bits you need for your float. Actually, most of the time, the more bits you have the better.

"C99 is supposed to be breaking such barriers with the specific integral sizes."

No, those fixed-width (u)intN_t types are completely optional, because not all processors use type sizes that are a power of 2. C99 only requires (u)int_fastN_t and (u)int_leastN_t to be defined (for N = 8, 16, 32 and 64). That means the premise of "why the inconsistency in the standard compared to int types" is just plain wrong, because there's no consistency in the size of int types either.

Lots of modern DSPs use a 24-bit word for 24-bit audio. There are even 20-bit DSPs like the Zoran ZR3800x family, or 28-bit DSPs like the ADAU1701, which allows transformation of 16/24-bit audio without clipping. Many 32- or 64-bit architectures also have some odd-sized registers to allow accumulation of values without overflow, for example the TI C5500/C6000 with a 40-bit long and the SHARC with an 80-bit accumulator. The Motorola DSP5600x/3xx series also has odd sizes: 2-byte short, 3-byte int, 6-byte long. In the past there were lots of architectures with other word sizes like 12, 18, 36 or 60 bits, and lots of CPUs that used one's complement or sign-magnitude. See "Exotic architectures the standards committees care about".

C was designed to be flexible enough to support all kinds of such platforms. Specifying a fixed size, whether for integer or floating-point types, defeats that purpose. Floating-point support in hardware varies wildly, just like integer support. There are different formats that use decimal, hexadecimal or possibly other bases. Each format has different exponent/mantissa sizes, different positions of the sign/exponent/mantissa fields, and even a different signed representation. For example, some use two's complement for the mantissa, while others use two's complement for the exponent or for the whole floating-point value. Many such formats are documented, but that's obviously not every format that ever existed. For example, the SHARC above has a special 40-bit floating-point format. Some platforms also use double-double arithmetic for long double.

That means you can't standardize a single floating-point format for all platforms, because there's no one-size-fits-all solution. If you're designing a DSP, then obviously you need a format that's best for your purpose so that you can churn through as much data as possible. There's no reason to use IEEE-754 binary64 when a 40-bit format has enough precision for your application, fits better in cache and needs far less die area. Or if you're on a small embedded system, an 80-bit long double is usually useless, as you don't even have enough ROM for the 80-bit long double library. That's why some platforms limit long double to 64 bits, like double.