
Memory addressing and pointers in C

This question comes from C and is based on it. Let's imagine we have a 32-bit pointer:

char* charPointer;

It points to some place in memory that contains some data. It knows that it is incremented in steps of 1 byte, and so on. On the other hand,

int* intPointer;

also points to some place in memory, and if we add 1 to it, it knows that it should go up by 4 bytes.

The question is: how are we able to address the full 32 bits of addressable space (2^32 bytes, i.e. 4 gigabytes) with those pointers if they obviously contain some information that allows them to be told apart, for example char* or int*? That would leave us with not 32 bits for the address, but fewer.

While typing this question I started thinking: maybe it is all syntactic sugar that exists only for the compiler? Maybe a raw pointer is just 32 bits and doesn't care about the type? Is that the case?

You might be confusing compile time with run time.

During compilation, gcc (or any C compiler) knows the type of a pointer, and in particular the type of the data pointed to by that pointer variable. So gcc can emit the right machine code: an increment of an int * variable (on a 32-bit machine with a 32-bit int) is translated to an increment of 4 (bytes), while an increment of a char* variable is translated to an increment of 1.

At run time, the compiled executable (which neither knows about nor needs gcc) deals only with machine pointers, usually addresses of bytes (or of the start of some word).

Types (in C programs) are not known at run time.

Some other languages (Lisp, Python, JavaScript, ...) require the types to be known at run time. In C++ (but not C), some objects (those having virtual functions) may have RTTI.

It is indeed syntactic sugar. Consider the following code fragment:

int t[2] = {10, 20};
int a = t[1];

The second line is equivalent to:

int a = *(t + 1); // pointer addition

which itself is equivalent to:

int a = *(int*)((char*)t + 1 * sizeof(int)); // integer addition

After the compiler has checked the types, it drops the casts and works only with addresses, lengths and integer addition.

Yes. A raw pointer is 32 bits of data (or 16 or 64 bits, depending on the architecture) and does not contain anything else. Whether it is an int *, a char * or a struct sockaddr_in * is just information for the compiler: it tells the compiler what number to actually add when incrementing, and what type you get when you dereference the pointer.

Your hypothesis is correct. To see how different kinds of pointers are handled, try running this program:

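#include <stdio.h>

int main(void)
{
    char c[2];
    int i[2];
    char * pc = c;   /* point at real objects: arithmetic on a null pointer is undefined */
    int * pi = i;

    /* %p expects a void *, so cast explicitly */
    printf("%p -> %p\n", (void *)pc, (void *)(pc + 1));
    printf("%p -> %p\n", (void *)pi, (void *)(pi + 1));

    return 0;
}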

You will note that adding one to a char* increased its numeric value by 1, while doing the same to an int* increased it by 4 (which is the size of an int on my machine).

It's exactly as you say at the end: types in C are just a compile-time concept that tells the compiler how to generate code for the various operations you can perform on variables.

In the end, pointers just boil down to the address they point to; the semantic information no longer exists once the code is compiled.

Incrementing an int* pointer differs from incrementing a char* solely because the pointer variable is declared as int*. You can cast an int* to char*, and then it will increment by 1 byte.

So, yes, it is all just syntactic sugar. It makes some kinds of array processing easier and confuses void* users.
