
ISO/IEC 9899:1990 Programming Language C: definition of short int, int, long

Original post: 2020-06-30 02:56:21 (tag: c)

"ISO/IEC 9899:1990, Programming Languages - C (ISO C) left the definition of the short int, the int, the long int, and the pointer deliberately vague to avoid artificially constraining hardware architectures that might benefit from defining these data types independent from the other. The only constraints were that ints must be no smaller than shorts, and longs must be no smaller than ints, and size_t must represent the largest unsigned type supported by an implementation. It is possible, for instance, to define a short as 16 bits, an int as 32 bits, a long as 64 bits and a pointer as 128 bits. The relationship between the fundamental data types can be expressed as: sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) = sizeof(size_t)"

Source: http://www.unix.org/whitepapers/64bit.html

Why do we need to define these data types so vaguely?

Is it because different computer architectures exist, so that we can't fix int at 32 bits?

And what's the difference between long and int64? Is it that the size of long is determined by the system, while int64 is guaranteed to be 64 bits?

Thanks for the help.

2 Answers

"Why do we need to define these data types so vaguely?"

According to your excerpt, the reason was

"to avoid artificially constraining hardware architectures that might benefit from defining these data types independent from the other"

I find that wording a little awkward, though. The basic idea is that the standard allows C implementations for different hardware architectures to choose sizes for the various types that are naturally suited to the hardware. This is not just about 32-bit vs. 64-bit, by the way. I have personally used 8-bit, 16-bit, 32-bit, and 64-bit computers, and I have worked on software that was originally written for computers with 36-bit and other native word sizes as well. And that's just what I can claim any kind of personal connection to. The past, present, and likely future diversity of computing hardware is much greater than I suspect you appreciate, but C can be implemented efficiently on a very wide variety of it.

"And what's the difference between long and int64? Is it that the size of long is determined by the system, while int64 is guaranteed to be 64 bits?"

The C language does not define any type named int64. C90 in particular, the version referenced by your excerpt, does not provide one. More recent versions of C define a type int64_t, which implementations are not required to provide. Where it is available, it is an integer type with exactly one sign bit, 63 value bits, and no padding bits, represented in two's-complement form. On some systems, long and int64_t are the same type, whereas on others, they are different types. On yet others, there is no int64_t. In Microsoft's C implementation, for example, long is a 32-bit type even on 64-bit hardware.

First of all, it must be noted that C was invented during a very early computer era, based on B and BCPL languages from the 1960s. Lots of different experimental computers existed back then - nobody quite knew which ones would survive or become industry standard.

Because of this, the C language even supports three different signed number formats: 1's complement, 2's complement and signed magnitude, where 1's complement and signed magnitude are allowed to come with exotic behavior such as trap representations or padding bits. But some 99.999% of all modern real-world computers use 2's complement, so all of this is very unhelpful.

"Why do we need to define these data types so vaguely?"

We don't. Not giving the integer types a fixed size and signedness was arguably a naive design mistake. The rationale back in the day was to allow C to run on as many different computers as possible. Which is, as it turns out, not at all the same thing as porting C code between different computers.

Lazy programmers might find it handy to sloppily spam int everywhere without thinking about integer limits, then get a "suitable, large enough integer of the local signedness". But that's not in the slightest helpful when we, for example, need to use exactly 16 bits in 2's complement. Or when we need to optimize for size. Or when we are using an 8-bit CPU and want to avoid anything larger than 8 bits whenever possible.

So int & friends are not quite portable: the size and signedness format are unknown and inconsistent across platforms, making these so-called "primitive data types" potentially dangerous and/or inefficient.

To make things worse, the unpredictable behavior of int collides with other language flaws like implicit int type promotion (see Implicit type promotion rules), or the fact that integer constants like 1 are always int. These rules were meant to turn every expression into int, to save incompetent programmers from themselves, in case they did arithmetic with overflow on small, signed integer types.

For example, int8_t i8=0; ... i8 = i8 + 256; doesn't actually cause signed overflow in C, because the operation is carried out on type int, and the result is then converted back to the small integer type int8_t (although in an implementation-defined manner).

However, the implicit promotion rules have always caused more harm than good. Your unsigned short may suddenly and silently turn into a signed int when ported from a 16-bit system to a 32-bit system. This in turn can create all manner of subtle bugs, particularly when using bitwise operators or writing hardware-related code. And the rules create an inconsistency between how small integer types and large integer types work inside expressions.

To solve some of these problems, stdint.h was introduced in the language back in 1999. It contains types like uint8_t that are guaranteed to have a fixed size no matter the system. The signed ones, such as int8_t, are guaranteed to be 2's complement. In addition, we may use types like uint_fast8_t to let the compiler pick the fastest suitable type for a given system, portably. Most professional C software nowadays - embedded systems in particular - only ever uses the stdint.h types and never the native types.

stdint.h makes it easier to port code, but it doesn't really solve the implicit promotion problems. To solve those, the language would have to be rewritten with a stronger type system that enforces that all integer conversions be made explicit with casts. Since there is no hope of C ever getting fixed, safe subsets of the language were developed, such as MISRA-C and CERT-C. A significant portion of these documents is dedicated to solving implicit conversion bugs.

A note about size_t specifically: it is guaranteed to be unsigned and "large enough", but that's about it. They didn't really give enough thought to defining what it's supposed to represent. The maximum size of an object? An array? Or just the type returned by sizeof? There's an unexpected dependency between it and ptrdiff_t - another language flaw - see this exotic problem I ran into when using size_t to represent the maximum allowed size of an array.