
Stack-based virtual machine function call/return implementation issues

Today I decided to create a little stack-based virtual machine in C++11 for fun. Everything was going pretty well until I got to function calls and returns.

I've been trying to follow calling guidelines similar to x86 assembly, but I'm getting really confused.

I have trouble dealing with stack base pointer offsets and with return values.

It seems very hard to keep track of the registers used for return values and of the arguments (for function calls) on the stack.

I've created a simple assembly-like language and compiler. Here's a commented example (that my virtual machine compiles and executes). I tried to explain what's happening and to share my thoughts in the comments.

//!ssvasm

$require_registers(3);

// C++ style preprocessor define directives to refer to registers
$define(R0, 0);
$define(R1, 1);
$define(R2, 2); 

// Load the 2.f float constant value into register R0
loadFloatCVToR(R0, 2.f);

// I want to pass 2.f as an argument to my next function call:
// I have to push it on top of the stack (as with x86 assembly)
pushRVToS(R0);

// I call the FN_QUAD function here: calling a function pushes both
// the current `stack base offset` and the `return instruction index`
// on the stack
callPI(FN_QUAD); 

// And get rid of the now-useless argument that still lies on top of the stack
// by dumping it into the unused R2 register 
popSVToR(R2);

halt(); // Halt virtual machine execution



$label(FN_DUP); // Function FN_DUP - returns its argument, duplicated

// I need the arg, but since it's beneath `old offset` and `return instruction`
// it has to be copied into a register - I choose R0 - ...

// To avoid losing other data in R0, I "save" it by pushing it on the stack
// (Is this the correct way of saving a register's contents?)
pushRVToS(R0);

// To put the arg in R0, I need to copy the value under the top two stack values
// (Read as: "move stack value offset by 2 from base to R0")
// (Is this how I should deal with arguments? Or is there a better way?)
moveSBOVToR(R0, 2);

// Function logic: I duplicate the value by pushing it twice and adding
pushRVToS(R0); pushRVToS(R0); addFloat2SVs();

// The result is on top of the stack - I store it in R1, to get it from the caller
// (Is this how I should deal with return values? Or is there a better way?)
popSVToR(R1);

popSVToR(R0); // Restore R0 with its old value (it's now at the top of the stack)

// Return to the caller: this pops twice - it uses `old stack base offset` and
// unconditionally jumps to `return instruction index`
returnPI();



$label(FN_QUAD); // Function FN_QUAD

pushRVToS(R0);
moveSBOVToR(R0, 2);

// Call duplicate twice (using the first call's return value as the second
// call's argument)
pushRVToS(R0); callPI(FN_DUP); popSVToR(R2);
pushRVToS(R1); callPI(FN_DUP); popSVToR(R2);

popSVToR(R0);
returnPI();

I've never programmed in assembly before, so I'm not sure the techniques I'm using are correct (or efficient).

Is the way I'm handling arguments/return values/registers correct?

Should the caller of a function push the arguments, then call, then pop the arguments? It seems that using a register would be easier, but I've read that x86 uses the stack to pass arguments. I'm confident that the method I'm using here is incorrect.

Should I push both `old stack offset` and `return instruction index` on a function call? Or should I store the `old stack offset` in a register? (Or avoid storing it at all?)

I solved this problem in the stack machine I've been working on in the following way:

A void function call (with no parameters) instruction does something like this:

There is _stack[] (the main stack) and _cstack[] (the call stack, containing information about calls, such as the return size).

When a function is called (that is, when a VCALL (void function call) instruction is encountered), the following is done:

        u64& _next = _peeknext; // refers to the next bytecode (which will be the function address)
        // assuming u64 is unsigned, a `_next > -1` test could never pass, so only the upper bound is checked
        AssertAbort(_next < _PROGRAM_SIZE, "Can't call function. Invalid address");
        cstack_push(ip + 2); // address to return to (current address + 2, to account for the function parameters next to the function call)
        cstack_push(fp); // current frame pointer
        cstack_push(_STACK_SIZE); // current stack size
        cstack_push(0); // size of the return value (would be 4 for int, 8 for long, etc.); 0 here because the call is void
        ip = (_next) - 1; // address to jump to (-1 to counter the increment of the program counter (ip) on each iteration)

Then, when a RET (return) instruction is encountered, the following is done:

        AssertAbort(cstackhas(3), "Can't return. No address to return to.");
        u64 return_size = cstack_pop(); // pop the size of the return value from the call stack
        _STACK_SIZE = cstack_pop(); // set the stack size to what it was before the function call, not accounting for the return value size
        fp = cstack_pop(); // reset the frame pointer to where it was before the function call
        ip = cstack_pop() - 1; // set the program counter to the address stored on the call stack by the last function call

        _stack.resize(_STACK_SIZE + return_size); // main stack: keep room for the return value (its size in bytes), but disregard the rest
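
For reference, the two-stack scheme above can be sketched as a small self-contained C++ program. This is only an illustration under some assumptions: the names `Frame`, `Machine`, `call`, `ret`, and `demo_call` are hypothetical and not taken from the VM above, the return size is counted in stack slots rather than bytes, and the result is copied down on top of the caller's data before the stack is truncated (a step the snippets above do not show).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; this is not the actual VM from the answer.
struct Frame {
    std::size_t return_ip;    // instruction to resume at after the call
    std::size_t frame_ptr;    // caller's frame pointer
    std::size_t stack_size;   // data stack size at the moment of the call
    std::size_t return_slots; // number of result slots the callee leaves on top
};

struct Machine {
    std::vector<std::int64_t> stack; // data stack
    std::vector<Frame> cstack;       // call stack
    std::size_t ip = 0;
    std::size_t fp = 0;

    // Record the caller's state, then jump to `target`.
    void call(std::size_t target, std::size_t return_ip, std::size_t return_slots) {
        cstack.push_back({return_ip, fp, stack.size(), return_slots});
        fp = stack.size();
        ip = target;
    }

    // Drop the callee's temporaries but keep its top `return_slots` values.
    void ret() {
        assert(!cstack.empty() && "no frame to return to");
        const Frame f = cstack.back();
        cstack.pop_back();
        assert(stack.size() >= f.stack_size + f.return_slots);
        // Assumption: results are moved down so they sit directly above the caller's data.
        const std::size_t first_result = stack.size() - f.return_slots;
        for (std::size_t i = 0; i < f.return_slots; ++i) {
            stack[f.stack_size + i] = stack[first_result + i];
        }
        stack.resize(f.stack_size + f.return_slots);
        fp = f.frame_ptr;
        ip = f.return_ip;
    }
};

// Simulates one call whose callee pushes two temporaries and one result.
inline std::int64_t demo_call() {
    Machine m;
    m.stack = {10, 20};    // caller's data
    m.call(/*target=*/100, /*return_ip=*/7, /*return_slots=*/1);
    m.stack.push_back(1);  // callee temporary
    m.stack.push_back(2);  // callee temporary
    m.stack.push_back(42); // callee result, on top
    m.ret();
    assert(m.ip == 7 && m.fp == 0 && m.stack.size() == 3);
    return m.stack.back();
}
```

Here `demo_call()` simulates a callee that leaves two temporaries and one result; after `ret()` the data stack holds the caller's two values plus the result, and `ip` and `fp` are back to the caller's values.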

This is probably useless to you now, as this question is quite old, but you can ask any questions if you wish :)

What you're describing is called a calling convention. In other words, it defines who builds the stack (caller or callee), how, and what the stack should look like.

There are many ways to do it and none is better than the others; you just have to keep it consistent.

As it would take too long to describe the different calling conventions, you should just check the Wikipedia article, which is really complete.

But quickly: the x86 C calling convention (cdecl) specifies that the caller builds the stack by pushing the arguments and removes them after the call. The caller also saves any scratch registers (EAX, ECX, EDX) it still needs, which leaves the callee free to use them to do its work and to return a value, which comes back in EAX.

For the specific questions at the end of your post, the best approach is to use the same stack layout as C does, storing the previous EIP and EBP in it, and to leave the registers free to use. Stack space is not as limited as the number of registers you have.
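
To make the C-style layout concrete, here is a minimal sketch of a single-stack VM where `Call` pushes the return address and the old base pointer, arguments are read at fixed offsets below the base pointer, the result comes back in a register, and the caller drops its own arguments. Everything in it (the `Op` set, `VM`, `rv`, `run_quad_demo`) is invented for illustration and uses unsigned integers instead of floats; it is not the instruction set from the question.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Word = std::uint64_t; // one stack slot; also holds addresses in this sketch

enum class Op { PushConst, PushArg, PushRet, Add, Drop, Call, Ret, Halt };
struct Instr { Op op; Word operand; };

struct VM {
    std::vector<Word> stack;
    Word ip = 0; // index of the next instruction
    Word bp = 0; // base pointer: first slot above the saved bp
    Word rv = 0; // return-value register (plays the role of EAX)

    Word pop() { Word v = stack.back(); stack.pop_back(); return v; }

    Word run(const std::vector<Instr>& code) {
        for (;;) {
            const Instr in = code[ip++];
            switch (in.op) {
            case Op::PushConst: stack.push_back(in.operand); break;
            // Frame layout below bp: [..., arg1, arg0, return ip, saved bp]
            case Op::PushArg: stack.push_back(stack[bp - 3 - in.operand]); break;
            case Op::PushRet: stack.push_back(rv); break;
            case Op::Add: { Word b = pop(); Word a = pop(); stack.push_back(a + b); break; }
            case Op::Drop: stack.resize(stack.size() - in.operand); break; // caller cleanup
            case Op::Call:
                stack.push_back(ip); // return address (ip was already advanced)
                stack.push_back(bp); // caller's base pointer
                bp = stack.size();
                ip = in.operand;
                break;
            case Op::Ret:
                rv = pop();       // the result leaves in the register
                stack.resize(bp); // discard any callee locals
                bp = pop();       // restore the caller's frame
                ip = pop();       // resume after the call
                break;
            case Op::Halt: return rv;
            }
        }
    }
};

// Equivalent of the question's program: QUAD(2), where QUAD(x) = DUP(DUP(x)).
inline Word run_quad_demo() {
    constexpr Word DUP = 4, QUAD = 8; // instruction indices of the two functions
    const std::vector<Instr> code = {
        {Op::PushConst, 2}, {Op::Call, QUAD}, {Op::Drop, 1}, {Op::Halt, 0},
        // DUP(x): returns x + x
        {Op::PushArg, 0}, {Op::PushArg, 0}, {Op::Add, 0}, {Op::Ret, 0},
        // QUAD(x): the first call's result becomes the second call's argument
        {Op::PushArg, 0}, {Op::Call, DUP}, {Op::Drop, 1}, {Op::PushRet, 0},
        {Op::Call, DUP}, {Op::Drop, 1}, {Op::PushRet, 0}, {Op::Ret, 0},
    };
    VM vm;
    const Word result = vm.run(code);
    assert(vm.stack.empty()); // every frame and argument was cleaned up
    return result;
}
```

`run_quad_demo()` yields 8 and ends with an empty stack. Because the callee finds its argument at `bp - 3`, it never has to save a register just to reach it, and the return value never competes with the saved frame data for a stack slot.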

The best solution depends on the machine.

If pushing to and popping from the stack is as fast as using registers (an on-chip stack, or a stack backed by on-chip L1 cache), and at the same time you are very limited in the number of registers, it makes sense to use the stack.

If you have plenty of registers, you can use some of them to store counters (pointers) or variables.
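
As a rough sketch of the register-heavy alternative, assume a convention where R0 carries the argument in and the result out, so the stack only holds return addresses. The instruction set (`ROp`, `RInstr`) and `run_register_demo` are hypothetical names made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ROp { LoadConst, AddSelf, Call, Ret, Halt };
struct RInstr { ROp op; std::size_t reg; std::int64_t value; };

// Assumed convention: R0 carries the first argument in and the result out.
inline std::int64_t run_register_demo() {
    constexpr std::int64_t DUP = 3, QUAD = 5; // instruction indices
    const std::vector<RInstr> code = {
        {ROp::LoadConst, 0, 2}, {ROp::Call, 0, QUAD}, {ROp::Halt, 0, 0},
        // DUP: R0 = R0 + R0
        {ROp::AddSelf, 0, 0}, {ROp::Ret, 0, 0},
        // QUAD: DUP(DUP(R0)); the first result is already in R0 for the second call
        {ROp::Call, 0, DUP}, {ROp::Call, 0, DUP}, {ROp::Ret, 0, 0},
    };
    std::array<std::int64_t, 4> regs{};    // general-purpose registers
    std::vector<std::size_t> return_stack; // holds return addresses only
    std::size_t ip = 0;
    for (;;) {
        const RInstr in = code[ip++];
        switch (in.op) {
        case ROp::LoadConst: regs[in.reg] = in.value; break;
        case ROp::AddSelf:   regs[in.reg] += regs[in.reg]; break;
        case ROp::Call:
            return_stack.push_back(ip); // ip already points past the call
            ip = static_cast<std::size_t>(in.value);
            break;
        case ROp::Ret:
            ip = return_stack.back();
            return_stack.pop_back();
            break;
        case ROp::Halt:
            assert(return_stack.empty());
            return regs[0];
        }
    }
}
```

With this convention the second call to DUP needs no pushes or pops at all, since the first result is already in R0. The trade-off is that a function which still needs R0 after making a call has to save it somewhere first.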

In general, to make modules communicate with each other, or to translate (or compile) other languages into your assembly, you should specify an Application Binary Interface (ABI).

You should compare different ABIs for different hardware (or virtual machines) to find the techniques suitable for your machine. Once you define your ABI, programs should comply with it for binary compatibility.
