From Compiler to Assembler

Original post: 2016-01-27 08:34:45 | Tags: c, assembly, compiler-construction

I have a question about the assembler. I was thinking about how a C function that takes multiple parameters is translated into assembly. So my question is: is there such a thing as a subroutine in assembly that takes arguments to operate on? The code might look something like this:

Call label1, R16

where R16 is the subroutine's input parameter.

If that's not the case, then it means that EACH time the C function is called, it gets assembled into a subroutine with the parameters of that specific call substituted into it automatically. That basically means that whenever a C function is called, the compiler turns it into an inline function, which I'm sure is not the case either :D

So which is right? Thanks a lot! :)

2 Answers

The compiler uses a "calling convention", which can be specific to one compiler for one target architecture (x86, ARM, MIPS, PDP-11, etc.). For architectures with "plenty" of general-purpose registers, the calling convention usually starts by passing parameters in registers and then falls back to the stack. For architectures without many registers, the stack is used primarily, if not exclusively, for parameter passing and the return.

The calling convention is a set of rules: if everyone follows them, you can compile functions into objects and link them with other objects, and they will be able to call each other's functions or call themselves.

So it is a bit of a hybrid of what you were assuming. The code built for a function is in some respects custom to that function, since the number and types of its parameters dictate which registers are used, how much stack is consumed, and how. At the same time, all functions follow the same formula, so they look more alike than different.

On ARM, for example, you might have three integers being passed to a function. In all the ARM calling conventions I have seen, the first parameter of the C function comes in r0, the second in r1, and the third in r2. (In general, even though the convention could vary across compilers, it often doesn't; in the case of ARM, MIPS, and some others, they try to dictate the convention for everyone rather than leaving it to the compiler folks.) If the first parameter were a 64-bit integer, though, then r0 and r1 are used for that first parameter, r2 gets the second, and r3 the third. After r3 you use the stack, and the ordering of parameters on the stack is also dictated by the convention. So when the caller's and the callee's code are compiled against the same C prototype, both sides know exactly where to find the parameters and can construct the assembly language to do that.
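As a rough illustration of the ARM example above, here is a minimal C sketch. The function names (sum3, sum3_wide, sum5) are made up for this example, and the register assignments in the comments assume a 32-bit ARM target following the convention described in this answer; other targets and compilers may place the arguments differently.

```c
#include <stdint.h>

/* Three 32-bit integers. Assuming the 32-bit ARM convention described
   above: a arrives in r0, b in r1, c in r2. */
int32_t sum3(int32_t a, int32_t b, int32_t c)
{
    return a + b + c;
}

/* 64-bit first parameter. Under the same assumption: a occupies the
   register pair r0 and r1, b arrives in r2, c in r3. */
int64_t sum3_wide(int64_t a, int32_t b, int32_t c)
{
    return a + b + c;
}

/* Five 32-bit integers. Under the same assumption: a..d fill r0..r3,
   and e no longer fits in registers, so it is passed on the stack. */
int32_t sum5(int32_t a, int32_t b, int32_t c, int32_t d, int32_t e)
{
    return a + b + c + d + e;
}
```

The C source is identical on every target; only the generated assembly changes. If you have an ARM cross compiler available (arm-none-eabi-gcc, for instance), compiling with -S and reading the output is a good way to check where the parameters really end up on your toolchain.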

Some instruction sets might offer minimal options for a call instruction that takes parameters, but in general that is not the case.

Some assemblers do have macros that mimic procedure calls, though (usually with only a few base types that fit in a register).

And no: only in the case of inline functions is a new copy of the function generated with the parameters substituted in.

A compiler doesn't generate code for a procedure by textual substitution of parameters, but by putting all relevant parameters in registers or on the stack according to a fixed scheme called the "calling convention".
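To make the difference between textual substitution and real parameter passing visible from C itself, here is a small sketch. All names in it (SQUARE_MACRO, square, next_value, and the two evaluations_via_* helpers) are invented for illustration. A preprocessor macro really is textual substitution, so its argument text is pasted in twice; a function receives a value that the caller computed once.

```c
/* Textual substitution: the preprocessor pastes the argument text
   into the expression, here twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* A real function: the caller evaluates the argument once and passes
   the resulting value according to the calling convention. */
int square(int x)
{
    return x * x;
}

static int counter = 0;

/* Hypothetical helper with a visible side effect: it counts how many
   times it has been evaluated. */
int next_value(void)
{
    counter++;
    return 3;
}

/* How many times does next_value() run when passed to a function? */
int evaluations_via_function(void)
{
    counter = 0;
    (void)square(next_value());
    return counter;   /* the argument is computed once, then passed */
}

/* How many times does next_value() run when passed to the macro? */
int evaluations_via_macro(void)
{
    counter = 0;
    (void)SQUARE_MACRO(next_value());
    return counter;   /* the argument text appears twice, so it runs twice */
}
```

Calling evaluations_via_function() gives 1 and evaluations_via_macro() gives 2, which is the observable footprint of the two mechanisms.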

The code that calculates and loads the parameters (into registers or onto the stack) is generated for each invocation; the procedure/function itself remains unmodified and loads the parameters from where it knows it can find them.
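A minimal sketch of that split, with made-up names (scale_and_add, call_site_one, call_site_two): there is one function body, and each call site gets its own argument-loading code in front of the call. The comments describe what a non-inlining compiler would typically do; an optimizing compiler is free to inline small functions like this one, in which case the picture changes.

```c
/* One function body. It always reads value, factor and offset from the
   places the calling convention assigns to them. */
int scale_and_add(int value, int factor, int offset)
{
    return value * factor + offset;
}

/* Call site 1: the compiler emits code here that places the constants
   2, 10 and 1 in the agreed locations, then calls scale_and_add. */
int call_site_one(void)
{
    return scale_and_add(2, 10, 1);
}

/* Call site 2: different code is emitted here, because the arguments
   must first be computed from x, but the callee is the same unmodified
   function. */
int call_site_two(int x)
{
    return scale_and_add(x + 1, x, 7);
}
```

Both callers share the single copy of scale_and_add; only the few instructions that prepare the arguments differ between them.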