unsigned int vs. size_t

Original question posted 2008-09-25 07:00:03. Tags: c++, c, size-t

I notice that modern C and C++ code seems to use size_t instead of int / unsigned int pretty much everywhere - from parameters for C string functions to the STL. I am curious as to the reason for this and the benefits it brings.

8 Answers

The size_t type is the unsigned integer type that is the result of the sizeof operator (and of the offsetof macro), so it is guaranteed to be big enough to contain the size of the biggest object your system can handle (e.g., a static array of 8 GB).
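As a minimal sketch of the point above, the following shows sizeof and offsetof both producing values of type size_t. The struct record type and the function names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>  /* size_t, offsetof */

/* Hypothetical record type, used only for illustration. */
struct record {
    char tag;
    double value;
    int ids[16];
};

/* sizeof yields a size_t, so a size_t can always hold the result. */
size_t record_size(void)
{
    return sizeof(struct record);
}

/* offsetof also yields a size_t: the byte offset of a member. */
size_t ids_offset(void)
{
    size_t off = offsetof(struct record, ids);
    assert(off < sizeof(struct record));  /* a member lies inside its struct */
    return off;
}
```

Calling record_size() and ids_offset() from a main function and printing the results with the %zu format specifier shows the values for a given platform; the exact numbers depend on padding and alignment.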

The size_t type may be bigger than, equal to, or smaller than an unsigned int, and your compiler might make assumptions about it for optimization.
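One way to see which of the three cases applies on a given implementation is to compare the limit macros. This is only a sketch; the function name compare_size_t_to_uint is an assumption made for the example.

```c
#include <assert.h>
#include <limits.h>  /* UINT_MAX */
#include <stdint.h>  /* SIZE_MAX */

/* Returns -1, 0, or 1 when size_t's range is smaller than, equal to,
   or larger than unsigned int's range on this implementation. */
int compare_size_t_to_uint(void)
{
    assert(SIZE_MAX >= 65535u);  /* minimum the C standard requires */
#if SIZE_MAX < UINT_MAX
    return -1;
#elif SIZE_MAX == UINT_MAX
    return 0;
#else
    return 1;
#endif
}
```

On a typical 64-bit Linux or macOS build one would expect the "larger" case, and on a typical 32-bit build the "equal" case, but neither is guaranteed by the language.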

You may find more precise information in the C99 standard, section 7.17, a draft of which is available on the Internet in PDF format, or in the C11 standard, section 7.19, also available as a PDF draft.

Classic C (the early dialect of C described by Brian Kernighan and Dennis Ritchie in The C Programming Language, Prentice-Hall, 1978) didn't provide size_t. The C standards committee introduced size_t to eliminate a portability problem.

Explained in detail at embedded.com (with a very good example).

In short, size_t is never negative, and it maximizes performance because it's typedef'd to be the unsigned integer type that's big enough -- but not too big -- to represent the size of the largest possible object on the target platform.

Sizes should never be negative, and indeed size_t is an unsigned type. Also, because size_t is unsigned, you can store numbers that are roughly twice as big as in the corresponding signed type, because the sign bit can be used to represent magnitude, like all the other bits in the unsigned integer. When we gain one more bit, we multiply the range of numbers we can represent by a factor of about two.
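The "factor of about two" can be checked with the limit macros. The sketch below uses unsigned int and int rather than size_t, and assumes the usual layout in which UINT_MAX equals 2 * INT_MAX + 1; the function name is hypothetical.

```c
#include <assert.h>
#include <limits.h>  /* INT_MAX, UINT_MAX */

/* How many times larger the top of unsigned int's range is than int's.
   Integer division discards the remainder of 1 on typical platforms. */
unsigned int unsigned_to_signed_ratio(void)
{
    assert(INT_MAX >= 32767);  /* minimum the C standard requires */
    return UINT_MAX / (unsigned int)INT_MAX;
}
```

With a 32-bit int this divides 4294967295 by 2147483647, which gives 2 with a remainder of 1.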

So, you ask, why not just use an unsigned int? It may not be able to hold big enough numbers. In an implementation where unsigned int is 32 bits, the biggest number it can represent is 4294967295. Some processors, such as the IP16L32, can copy objects larger than 4294967295 bytes.
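To illustrate how a size can fail to fit in an unsigned int, here is a small sketch. Both function names are hypothetical; the behavior relies only on the standard rule that conversion to an unsigned type wraps modulo one more than the type's maximum.

```c
#include <assert.h>
#include <limits.h>  /* UINT_MAX */
#include <stddef.h>  /* size_t */

/* Returns 1 if converting n to unsigned int and back preserves it, else 0. */
int fits_in_unsigned_int(size_t n)
{
    unsigned int narrowed = (unsigned int)n;  /* wraps modulo UINT_MAX + 1 */
    return (size_t)narrowed == n;
}

/* Narrows n to unsigned int, aborting (in debug builds) if it would wrap. */
unsigned int checked_narrow(size_t n)
{
    assert(n <= UINT_MAX);
    return (unsigned int)n;
}
```

On a platform where size_t is 64 bits and unsigned int is 32 bits, fits_in_unsigned_int((size_t)-1) returns 0, because the largest size_t value is silently truncated when stored in an unsigned int.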

So, you ask, why not use an unsigned long int? It exacts a performance toll on some platforms. Standard C requires that a long occupy at least 32 bits. An IP16L32 platform implements each 32-bit long as a pair of 16-bit words. Almost all 32-bit operations on these platforms require two instructions, if not more, because they work with the 32 bits in two 16-bit chunks. For example, moving a 32-bit long usually requires two machine instructions -- one to move each 16-bit chunk.

Using size_t avoids this performance toll. According to this fantastic article, "Type size_t is a typedef that's an alias for some unsigned integer type, typically unsigned int or unsigned long, but possibly even unsigned long long. Each Standard C implementation is supposed to choose the unsigned integer that's big enough--but no bigger than needed--to represent the size of the largest possible object on the target platform."

The size_t type is the type returned by the sizeof operator. It is an unsigned integer type capable of expressing the size in bytes of any memory range supported on the host machine. It is (typically) related to ptrdiff_t in that ptrdiff_t is a signed integer type such that sizeof(ptrdiff_t) and sizeof(size_t) are equal.
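A short sketch of how the two types divide the work: ptrdiff_t holds a signed distance between two pointers into the same array, while size_t holds an unsigned count. The function names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>  /* size_t, ptrdiff_t */

/* Signed distance, in elements, from `from` to `to` (same array). */
ptrdiff_t element_distance(const int *from, const int *to)
{
    return to - from;  /* pointer subtraction yields a ptrdiff_t */
}

/* Unsigned element count of the range [first, last); requires first <= last. */
size_t element_count(const int *first, const int *last)
{
    assert(first <= last);
    return (size_t)(last - first);
}
```

For an array a of ten ints, element_distance(a + 7, a + 2) is -5, whereas element_count(a + 2, a + 7) is 5. The negative result is why the distance needs a signed type.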

When writing C code you should always use size_t whenever dealing with memory ranges.
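As an illustration of that advice, here is a memcpy-style helper whose length parameter and loop index are both size_t. It is a sketch only; copy_bytes is a hypothetical name and the standard memcpy should be preferred in real code.

```c
#include <assert.h>
#include <stddef.h>  /* size_t, NULL */

/* Copies n bytes from src to dst (the ranges must not overlap). */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (d != NULL && s != NULL));
    for (size_t i = 0; i < n; ++i) {  /* size_t index can reach any valid offset */
        d[i] = s[i];
    }
    return dst;
}
```

Because sizeof already yields a size_t, a call such as copy_bytes(dst, src, sizeof src) needs no conversion and cannot truncate the length.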

The int type, on the other hand, is basically defined as the (signed) integer type that the host machine can use to perform integer arithmetic most efficiently. For example, on many older PC-type computers the value sizeof(size_t) would be 4 (bytes) but sizeof(int) would be 2 (bytes). 16-bit arithmetic was faster than 32-bit arithmetic, though the CPU could handle a (logical) memory space of up to 4 GiB.

Use the int type only when you care about efficiency, as its actual precision depends strongly on both compiler options and machine architecture. In particular, the C standard specifies the following invariants: sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long). Beyond a minimum range for each type, it places no other limitations on the actual precision available to the programmer for each of these primitive types.
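The ordering quoted above, together with the minimum ranges that limits.h guarantees, can be checked at run time. This is a sketch with hypothetical function names; on any conforming implementation both functions are expected to return 1.

```c
#include <assert.h>
#include <limits.h>  /* INT_MAX, LONG_MAX */

/* Returns 1 if the size ordering of the basic integer types holds here. */
int integer_sizes_are_ordered(void)
{
    assert(sizeof(char) == 1);  /* true by definition of sizeof */
    return sizeof(char) <= sizeof(short)
        && sizeof(short) <= sizeof(int)
        && sizeof(int) <= sizeof(long);
}

/* The standard's minimum ranges imply at least 16 bits for int
   and at least 32 bits for long. */
int minimum_ranges_hold(void)
{
    return INT_MAX >= 32767 && LONG_MAX >= 2147483647L;
}
```

Neither check says anything about the exact widths, which is the point of the paragraph above: only the ordering and the minimums are portable.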

Note: This is NOT the same as in Java (which actually specifies the bit precision for each of the types 'char', 'byte', 'short', 'int' and 'long').

Type size_t must be big enough to store the size of any possible object. The unsigned int type doesn't have to satisfy that condition.

For example, on 64-bit systems int and unsigned int may be 32 bits wide, but size_t must be big enough to store numbers bigger than 4G.

This excerpt from the glibc manual 0.02 may also be relevant when researching the topic:

There is a potential problem with the size_t type and versions of GCC prior to release 2.4. ANSI C requires that size_t always be an unsigned type. For compatibility with existing systems' header files, GCC defines size_t in `stddef.h' to be whatever type the system's `sys/types.h' defines it to be. Most Unix systems that define size_t in `sys/types.h', define it to be a signed type. Some code in the library depends on size_t being an unsigned type, and will not work correctly if it is signed.

The GNU C library code which expects size_t to be unsigned is correct. The definition of size_t as a signed type is incorrect. We plan that in version 2.4, GCC will always define size_t as an unsigned type, and the `fixincludes' script will massage the system's `sys/types.h' so as not to conflict with this.

In the meantime, we work around this problem by telling GCC explicitly to use an unsigned type for size_t when compiling the GNU C library. `configure' will automatically detect what type GCC uses for size_t and arrange to override it if necessary.
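The excerpt stresses that size_t must be unsigned. A quick way to confirm this on a given compiler, and to see why code can come to depend on it, is sketched below. The function names are hypothetical, and this is not how the glibc configure script performs its check.

```c
#include <assert.h>
#include <stddef.h>  /* size_t */

/* Returns 1 if size_t is unsigned: converting -1 to an unsigned type
   yields its maximum value, which is greater than zero. */
int size_t_is_unsigned(void)
{
    return (size_t)-1 > 0;
}

/* Index of the last element of a non-empty array of length len.
   Without the guard, len == 0 would wrap around to the maximum size_t. */
size_t last_index(size_t len)
{
    assert(len > 0);
    return len - 1;
}
```

The wraparound in last_index is the flip side of unsignedness: arithmetic on size_t never goes negative, so "one less than zero" becomes the largest representable value.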

If my compiler is set to 32 bit, size_t is nothing other than a typedef for unsigned int. If my compiler is set to 64 bit, size_t is nothing other than a typedef for unsigned long long.

size_t is typically the same size as a pointer.

So in 32 bits, or the common ILP32 (integer, long, pointer) model, size_t is 32 bits, and in 64 bits, or the common LP64 (long, pointer) model, size_t is 64 bits (integers are still 32 bits).

There are other models, but these are the ones that g++ uses (at least by default).
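A rough way to tell which data model a build uses is to look at the widths of int, long, and pointers. The sketch below recognises ILP32 and LP64 from the answer above, plus LLP64 (32-bit long with 64-bit pointers, as used by 64-bit Windows), which is an addition not mentioned above. The function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */

/* Names the data model from the bit widths of int, long, and pointers.
   Only three models are recognised; anything else is reported as "other". */
const char *data_model(void)
{
    const size_t int_bits  = sizeof(int)    * CHAR_BIT;
    const size_t long_bits = sizeof(long)   * CHAR_BIT;
    const size_t ptr_bits  = sizeof(void *) * CHAR_BIT;

    assert(CHAR_BIT >= 8);  /* guaranteed by the C standard */
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 32) return "ILP32";
    if (int_bits == 32 && long_bits == 64 && ptr_bits == 64) return "LP64";
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 64) return "LLP64";
    return "other";
}

/* Returns 1 if size_t and object pointers have the same width here. */
int size_t_matches_pointer_width(void)
{
    return sizeof(size_t) == sizeof(void *);
}
```

Building the same file with gcc -m32 and gcc -m64 (where both targets are installed) and printing data_model() is a simple way to observe the two models described above.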