Question about the number of bits in C programming
If I do
int a = 3, then 3 will be represented in binary with 32 bits.
If I do
char a = 3, then 3 will be represented in binary with 8 bits.
My question is: before the initialization with the value happens, how many bits is 3 represented with?
(In other words, how many bits does the "3" on the right-hand side of the equals sign have?)
It's very common for int to have 32 bits, but it's not guaranteed. It can be 16 or 64 bits too, or wider.
A single 3 is an integer literal of type int.
You can check it using the sizeof operator. It gives you the size of its argument in bytes. Just try getting the size of int, a and 3.
#include <stdio.h>

int main(void)
{
    int a = 3;
    /* sizeof yields a size_t, so the matching printf format is %zu, not %ld */
    printf("%zu\n", sizeof(a));   // gives 4 bytes (32 bits) on my PC
    printf("%zu\n", sizeof(int)); // gives 4 bytes (32 bits) on my PC
    printf("%zu\n", sizeof(3));   // gives 4 bytes (32 bits) on my PC
    return 0;
}
Also, 3 has type int, so its size is equal to the size of int.
The size of an object of type int is implementation-defined. The standard only requires that INT_MAX shall not be less than +32767, that is, 2^15 - 1.
If on your system the size of an object of type int is equal to 4, then an integer constant like 3 also has a size of 4 bytes.
Note that integer character constants, for example, also have type int.
So in both of these declarations
char a = 3;
and
char a = '\3';
the constants 3 and '\3' have type int and occupy 4 bytes if sizeof( int ) is equal to 4.
The 3 is called an integer constant, and it has a type, much like any named variable. It is always of type int if the number typed fits inside an int. Otherwise, if it doesn't fit, the compiler will try to fit it inside a long, then a long long.
There are various rather intricate rules for how this is done. I won't go through all the dirty details here; those who are interested can check the tables in the C standard, section 6.4.4.1. For the average programmer it is probably enough to know that we can also force the integer constant to be unsigned by adding a U suffix, or force it to be long by adding an L suffix. That is, 3U or 3L, or the combination 3UL. (Lower-case u and l work too.)
On real-world computers, int is in practice 2 or 4 bytes large, and long is either 4 or 8 bytes large. Example from a 64-bit Linux computer with a 4-byte int and an 8-byte long:
#include <stdio.h>

int main (void)
{
    printf("%zu\n", sizeof(int));        // 4
    printf("%zu\n", sizeof(3));          // 4
    printf("%zu\n", sizeof(3L));         // 8
    printf("%zu\n", sizeof(2147483647)); // 4, fits int
    printf("%zu\n", sizeof(2147483648)); // 8, doesn't fit
}
The question "how many bits does 3 get represented with?" is actually tricky. If we can find the 3, then we can answer it. So the question is: where is the 3?
What really happens is that:
int a = 3;
is the same as:
int a;
a = 3;
The compiler makes sure there will be 4 bytes of space for the variable a (it does this at compile time), and then it also puts an instruction in the program which stores the number 3 in that space when you run the program.
We can use this useful online tool to compile a program and see what assembly/machine code the compiler actually outputs: https://godbolt.org/z/a9Pohn
In this case, I entered the program:
int main() {
int a;
a = 3;
}
and compiled it with "x86-64 gcc 10.2", no optimizations. Here is the compiled code (both assembly and machine code):
main:
55 push rbp
48 89 e5 mov rbp,rsp
c7 45 fc 03 00 00 00 mov DWORD PTR [rbp-0x4],0x3
b8 00 00 00 00 mov eax,0x0
5d pop rbp
c3 ret
If we can read assembly, we can see that the instruction the compiler chose to insert into the program to initialize the variable a was mov DWORD PTR [rbp-0x4],0x3. In machine code it is written c7 45 fc 03 00 00 00. The instruction is where the number 3 comes from.
The instruction is 7 bytes long.
c7 45 tells the CPU what kind of instruction this is ("put a specific number at a specific position in the stack frame"). fc is the position in the stack frame (the offset -4 from rbp). And 03 00 00 00 is the specific number which it puts there (in little-endian format). This is the number 3 from the source code. So in this case, it takes up 4 bytes.
Note that it's not always the same. If we compile for an ARM CPU instead of x86-64, then these are the relevant instructions:
mov r3, #3
str r3, [fp, #-8]
Unfortunately godbolt won't show us machine code, but we can look up the MOV # instruction in the ARM manual, which tells us that the instruction is 4 bytes long and that the number being moved only takes up 2 of those bytes. The other bits are automatically zeroes. If you use a number that doesn't fit in 2 bytes, it obviously uses a different instruction.
Usually we don't talk about the sizes of instructions, since they vary a lot more than data sizes.
int a; always reserves 4 bytes (if your system's int is 4 bytes), but the instruction which puts the specific bits into that space can have varying sizes.
Even on x86-64, numbers can take up different amounts of space. If I write return 0; in an optimized build, the compiler typically translates that to xor eax, eax (31 c0). There is no number 0 in that instruction at all! (31 c0 is all instruction type, no data.)