Regarding integer memory allocation

Original post: 2012-10-10 07:04:29. Tags: c, unix, integer, 32bit-64bit, biginteger

If the memory allocated to an integer has a fixed limit in any language (say 2, 4, or 8 bytes in C), how does it matter whether the code is compiled on a 32-bit or 64-bit machine with a 32-bit or 64-bit compiler? Please forgive me if this is a really trivial question, but please leave an answer.

4 answers

If you are using fixed-size integer types (like int8_t or int16_t), then whether you target a 32-bit or 64-bit platform doesn't matter much for the integers themselves.
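As a minimal sketch of that point, the compile-time checks below hold on any platform that provides the exact-width types, regardless of the target's word size. The `BITS_OF` macro and the `int_width_bits` helper are names made up for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Storage width of a type in bits (illustrative macro). */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Exact-width types have the same width on 32-bit and 64-bit targets. */
static_assert(BITS_OF(int8_t) == 8, "int8_t is 8 bits");
static_assert(BITS_OF(int16_t) == 16, "int16_t is 16 bits");
static_assert(BITS_OF(int32_t) == 32, "int32_t is 32 bits");
static_assert(BITS_OF(int64_t) == 64, "int64_t is 64 bits");

/* Storage width of plain int, which is allowed to differ between platforms. */
size_t int_width_bits(void)
{
    return BITS_OF(int);
}
```

Calling `int_width_bits()` on different targets is one way to see that plain `int` is the type that varies, not the fixed-width ones.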

One thing that does matter is the size of pointers. Pointers are typically 32 bits wide when targeting a 32-bit architecture and 64 bits wide when targeting a 64-bit architecture.

It used to be rather common to store pointer values in an int, though this practice is now strongly discouraged for portability reasons, and the 32/64-bit case is a great example. If you store a pointer in an int, your code invokes undefined behavior on 64-bit platforms where int is 32 bits, because the pointer value no longer fits. When you later convert the value back and dereference it, the program will likely crash or (worse) proceed with invalid data.
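One portable alternative, not mentioned in the answer itself, is the optional `uintptr_t` type from `<stdint.h>`, which is wide enough to hold a `void *` converted to it. The sketch below assumes the platform provides `uintptr_t`; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round-trips a pointer through uintptr_t and reports whether it survived. */
int roundtrip_ok(void *p)
{
    uintptr_t bits = (uintptr_t)p; /* pointer -> integer */
    void *back = (void *)bits;     /* integer -> pointer */
    return back == p;
}

/* Nonzero when plain int has less storage than a pointer on this target. */
int int_too_small_for_pointer(void)
{
    return sizeof(int) < sizeof(void *);
}

/* Small demonstration using a local variable's address. */
void demo_roundtrip(void)
{
    int value = 42;
    assert(roundtrip_ok(&value));
}
```

On a typical 64-bit target `int_too_small_for_pointer()` returns nonzero, which is exactly the situation where storing a pointer in an `int` goes wrong.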

There are several reasons why you have to compile different executables for 32-bit and 64-bit machines. The size of an int may or may not be one of them, since the C standard only defines minimum ranges and relative sizes. As far as I know there is no maximum size for an int, provided its range does not exceed that of a long.

The size of a pointer is a major difference. The compiler and linker produce a different executable file layout for 32-bit and 64-bit process address spaces. The runtime libraries are different, and dynamically linked libraries (shared objects on UNIX) have to use the same pointer size as the process that loads them, otherwise they cannot interact with the rest of the process.

Why use 64-bit? What is the advantage of 64-bit over 32-bit? The main advantage is the maximum size of a pointer, and hence of the process address space. With 32-bit pointers this is 4 GB; with 64-bit pointers it is theoretically 16 EB (about 16 million terabytes).
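The arithmetic behind those limits is just powers of two: n-bit flat pointers can address 2^n bytes. A rough sketch, using binary units (GiB, TiB) and hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Gibibytes (2^30 bytes) addressable with flat pointers of the given width.
   Valid for widths from 30 to 64 bits. */
uint64_t addressable_gib(unsigned pointer_bits)
{
    assert(pointer_bits >= 30 && pointer_bits <= 64);
    return UINT64_C(1) << (pointer_bits - 30);
}

/* Tebibytes (2^40 bytes) addressable with flat pointers of the given width.
   Valid for widths from 40 to 64 bits. */
uint64_t addressable_tib(unsigned pointer_bits)
{
    assert(pointer_bits >= 40 && pointer_bits <= 64);
    return UINT64_C(1) << (pointer_bits - 40);
}
```

Here `addressable_gib(32)` is 4 and `addressable_tib(64)` is 16,777,216, i.e. 16 EiB. Real hardware and operating systems usually expose far less than the full 64-bit range.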

If you have a look at this page, you can see that the basic types in C have certain guaranteed minimum sizes. So you will not find a compliant C implementation where int is 2 bits; it has to be at least 16.
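Those guaranteed minimums can be written down as compile-time checks against `<limits.h>`. This is only a sketch of the idea; a conforming C11 compiler should accept all of the assertions, and the `int_meets_minimum` helper is an illustrative name.

```c
#include <assert.h>
#include <limits.h>

/* Minimum ranges that the C standard requires of every implementation. */
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(SHRT_MAX >= 32767, "short covers at least 16 bits");
static_assert(INT_MAX >= 32767, "int covers at least 16 bits");
static_assert(LONG_MAX >= 2147483647L, "long covers at least 32 bits");
static_assert(LLONG_MAX >= 9223372036854775807LL, "long long covers at least 64 bits");

/* The range of int never exceeds the range of long. */
static_assert(INT_MAX <= LONG_MAX, "int is no wider than long");

/* Runtime view of the same guarantee for int. */
int int_meets_minimum(void)
{
    return INT_MAX >= 32767;
}
```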

Differences between platforms make porting software the interesting challenge it often is.

If you have code that assumes things about basic data types that are not guaranteed to be true (for instance, code that does something like int x = 0xfeedf00d;), then that code is not portable. It will break, in various and often hard-to-predict ways, when compiled on a platform that doesn't match the assumptions. For instance, on a platform where int is 16 bits, the above code would leave x set to a different value from what the programmer intended.
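One way to avoid the assumption in that example is to state the required width explicitly. The sketch below assumes `uint32_t` is available; `magic_value` and `fits_in_int` are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t is exactly 32 bits");

/* Portable: the constant always fits in a 32-bit unsigned type. */
uint32_t magic_value(void)
{
    return UINT32_C(0xfeedf00d);
}

/* Nonzero when 0xfeedf00d is representable in plain int on this platform.
   It is not when int is 16 bits, and also not when int is 32 bits,
   because INT_MAX is then 0x7fffffff. */
int fits_in_int(void)
{
    return (uintmax_t)INT_MAX >= UINT32_C(0xfeedf00d);
}
```

Where `fits_in_int()` returns zero, the result of `int x = 0xfeedf00d;` is implementation-defined rather than the value written in the source.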

For the most part, the word size affects performance and stack usage. It is generally fastest to work with values of the architecture's native word size. If you define two 32-bit integers, one right after the other, on a 64-bit machine, one will start on a 64-bit word boundary and the other won't. It can take less processing to load the one on the word boundary into a register than the one that is not. On the other hand, if you have a 64-bit integer on a 32-bit machine, it takes two fetches to get the two words, plus some funky register manipulation to perform integer operations on it.

The last part has to do with the stack. The stack is made up of words: if the machine's word size is 32 bits, the stack is a stack of 32-bit values, and on a 64-bit machine they are 64-bit values. Not that you really care that much, but on a 64-bit machine a 32-bit integer pushed on the stack occupies a full 64-bit word, and on a 32-bit machine a 64-bit value takes up two words unless you push a pointer to it instead.

If you select word alignment in your compiler, it will automatically put all variables on a word boundary. This is faster, but takes up a bit more space. Unless you're really pressed for space, you should go for the performance. Not that it makes that big a difference on a CISC architecture; on RISC it made a huge difference.
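The space cost of alignment can be observed directly in a struct layout. This is a small sketch using C11 `_Alignof` and `offsetof`; the struct and function names are made up, and the exact amount of padding depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The compiler may insert padding after `tag` so that `value` is aligned. */
struct padded {
    uint8_t tag;
    uint32_t value;
};

/* Bytes of padding between `tag` and `value` on this platform. */
size_t padding_after_tag(void)
{
    return offsetof(struct padded, value) - sizeof(uint8_t);
}

/* Alignment the platform requires for a 32-bit integer. */
size_t value_alignment(void)
{
    return _Alignof(uint32_t);
}

/* A member's offset is always a multiple of its required alignment. */
void check_layout(void)
{
    assert(offsetof(struct padded, value) % _Alignof(uint32_t) == 0);
}
```

On many common targets `padding_after_tag()` reports 3, which is the "bit more space" traded for aligned access.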