C++ how are variables accessed in memory?

When I create a new variable in a C++ program, e.g. a char:

char c = 'a';

how does C++ then have access to this variable in memory? I would imagine that it would need to store the memory location of the variable, but then that would require a pointer variable, and this pointer would again need to be accessed.

See the docs:

When a variable is declared, the memory needed to store its value is assigned a specific location in memory (its memory address). Generally, C++ programs do not actively decide the exact memory addresses where their variables are stored. Fortunately, that task is left to the environment where the program is run - generally, an operating system that decides the particular memory locations at runtime. However, it may be useful for a program to be able to obtain the address of a variable during runtime in order to access data cells that are at a certain position relative to it.

You can also refer to this article on Variables and Memory:

The Stack

The stack is where local variables and function parameters reside. It is called a stack because it follows the last-in, first-out principle. As data is added (pushed) to the stack, it grows, and when data is removed (popped), it shrinks. In reality, data is not physically moved around every time something is pushed or popped; instead the stack pointer, which as the name implies points to the memory address at the top of the stack, moves up and down. Everything on one side of this address is considered to be on the stack and usable, whereas everything on the other side is off the stack and invalid. (On many platforms, x86 among them, the stack grows toward lower addresses.) This bookkeeping is done automatically by code the compiler generates, inside a region the operating system sets aside, and as a result this kind of memory is sometimes also called automatic memory. Before C++11, the keyword auto could be used on the extremely rare occasions that one needed to request this kind of storage explicitly; since C++11, auto means type deduction instead and can no longer be used this way. Normally, one declares variables on the stack like this:

void func()
{
    int i;
    float x[100];
    ...
}

Variables that are declared on the stack are only valid within the scope of their declaration. That means when the function func() listed above returns, i and x will no longer be accessible or valid.

There is another limitation to variables that are placed on the stack: the operating system only allocates a certain amount of space to the stack. As each part of a program comes into scope, the space needed to hold its local variables is taken from the stack. If the total exceeds the amount of memory that the OS has allowed for the stack, the program will typically crash (a stack overflow). While the maximum size of the stack can sometimes be changed through linker or operating system settings, it is usually fairly small, and nowhere near the total amount of RAM available on a machine.

C++ itself (or rather, the compiler) has access to this variable in terms of the program structure, represented as a data structure. Perhaps you're asking how other parts of the program would have access to it at run time.

The answer is that it varies. It can be stored in a register, on the stack, on the heap, or in the data/bss sections (global/static variables), depending on its context and the platform it was compiled for. If you need to pass it around by reference (or pointer) to other functions, then it will likely be stored on the stack. If you only need it in the context of your function, it will probably be handled in a register. If it's a member variable of an object on the heap, then it's on the heap, and you reference it by an offset into the object. If it's a global/static variable, then its address is determined once the program is fully loaded into memory.

C++ eventually compiles down to machine language, and often runs within the context of an operating system, so you might want to brush up a bit on assembly basics, or even some OS principles, to better understand what's going on under the hood.

Assuming this is a local variable, it is typically allocated on the stack, i.e. in RAM. The compiler keeps track of the variable's offset on the stack. In the basic scenario, when a computation is performed with the variable, its value is loaded into one of the processor's registers and the CPU performs the computation. Afterwards the result is written back to RAM. In practice, optimizing compilers keep many variables in registers for their whole lifetime, and modern processors put several levels of cache between the registers and RAM, so it can get quite complex.

Please note that the name "c" is no longer mentioned in the binary (unless you have debugging symbols). The binary only works with memory locations. E.g. it would look like this (simple addition):

a = b + c

take the value at memory location 2 (b) and put it in register 1
take the value at memory location 3 (c) and put it in register 2
sum registers 1 and 2 and store the result in register 3
copy register 3 to memory location 1 (a)

The binary doesn't know "a", "b" or "c". The compiler just said "a is in memory 1, b is in memory 2, c is in memory 3". And the CPU just blindly executes the commands the compiler has generated.

Let's say our program starts with a stack address of 4000000.

When you call a function, depending on how much stack it uses, it will "allocate" it like this.

Let's say we have 2 ints (8 bytes):

int function()
{
    int a = 0;
    int b = 0;
    return a + b; // a non-void function must return a value
}

then what's going to happen in assembly is:

MOV EBP, ESP // here we store the original value of the stack address (4000000) in EBP, so that ESP can be restored to 4000000 at the end of the function

SUB ESP, 8 // here we "allocate" 8 bytes on the stack, which basically just decreases the ESP address by 8

So our ESP address was changed from 4000000 to 3999992.

That's how the program knows the stack addresses: the first int occupies 3999992 to 3999995, and the second int occupies 3999996 to 3999999.

Even though this pretty much has nothing to do with the compiler, it's really important to know, because when you know how the stack is "allocated", you realize how cheap it is to do things like

char my_array[20000];

since all it's doing is sub esp, 20000, which is a single assembly instruction.

But if you actually use all those bytes, as in memset(my_array, 0, 20000), that's a different story.

how does C++ then have access to this variable in memory?

It doesn't!

Your computer does, and it is instructed on how to do that by loading the location of the variable in memory into a register. This is all handled at the level of assembly language. I shan't go into the details here of how such languages work (you can look it up!) but this is rather the purpose of a C++ compiler: to turn an abstract, high-level set of "instructions" into actual technical instructions that a computer can understand and execute. You could sort of say that assembly programs contain a lot of pointers, though most of them are literals rather than "variables".
