
Where are static values stored in assembly

Here is a simple C program:

#include <stdio.h>

int a = 5;

static int b = 20;

int main(){

 int c = 30;

 return 0;
}

Compiled to assembly with no optimization:

    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 13
    .globl  _main                   ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
    .cfi_startproc
## %bb.0:
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    xorl    %eax, %eax
    movl    $0, -4(%rbp)
    movl    $30, -8(%rbp)
    popq    %rbp
    retq
    .cfi_endproc
                                        ## -- End function
    .section    __DATA,__data
    .globl  _a                      ## @a
    .p2align    2
_a:
    .long   5                       ## 0x5



My question is: where is static int b = 20; in the above assembly? I know it is supposed to be in the global section of memory, but I cannot find it in the compiled output.

Your code doesn't use b, and it has file scope with internal linkage, so nothing in other files can use it. GCC (like clang, whose output this appears to be) doesn't bother to emit a definition for it.

To answer the title question:
A non-const variable with static storage duration (static or global) and a non-zero initializer will go in .section .data (__DATA,__data in the Mach-O output above), as opposed to .bss (zero-initialized, mutable), or .rdata (Windows) / .rodata (Linux) for read-only data.
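As a rough illustration of that mapping, here is a minimal sketch. All the names are made up for this example, and the section in each comment is only where mainstream toolchains typically put such an object; the exact section names depend on the target and object format.

```c
/* Hypothetical names; section comments describe typical behavior only. */
int initialized_global = 5;            /* non-zero initializer, mutable: .data        */
static int initialized_static = 20;    /* same, internal linkage: .data if it is kept */
int zeroed_global;                     /* zero-initialized, mutable: .bss             */
static const int table[] = {1, 2, 3};  /* read-only: .rodata (Linux) / .rdata (Windows) */

/* Referencing the file-scope statics gives the compiler a reason to emit them. */
int sum_all(int index)
{
    initialized_static += 1;           /* a write, so the value can't be folded away */
    return initialized_global + initialized_static + zeroed_global + table[index];
}
```

Compiling this to assembly and searching for the section directives should show each object under the section named in its comment, or the platform's equivalent.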


GCC doesn't have a fully braindead mode that transliterates to asm naively. See "Disable all optimization options in GCC": GCC always has to transform through its internal representations.

GCC always does a pass that leaves out unused stuff, even at -O0. There might be a way to disable that, unlike some of the other transformations GCC does even at -O0.
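Short of a global switch, there is at least a per-variable way to keep such an object. The sketch below assumes GCC or clang, since the used attribute is a compiler extension rather than standard C; read_a is a hypothetical helper added only so the file contains a function.

```c
/* GCC/clang extension: emit this object even though nothing references it. */
__attribute__((used)) static int b = 20;

int a = 5;

/* Hypothetical helper, not part of the question's code. */
int read_a(void)
{
    return a;
}
```

With the attribute in place, b should show up in the data section next to a even at -O0.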

gcc and clang -O0 compile each statement to a separate block of asm that stores/reloads everything (for consistent debugging), but within that block GCC still applies its standard transformations, like (x+y) < x becoming y<0 for signed x and y with GCC 8 and newer, or x / 10 into a multiply + shift of the high half. (See "Why does GCC use multiplication by a strange number in implementing integer division?")
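Two small functions are enough to see both transformations in the compiler's output. This is only a sketch: the function names are invented here, and what the asm looks like depends on the compiler and version.

```c
/* Division by a constant: look for a multiply by a "magic" number plus
   shifts in the asm instead of a divide instruction. */
int div10(int x)
{
    return x / 10;
}

/* Signed overflow is undefined behavior, so the compiler may assume x + y
   does not wrap and reduce the comparison to a test on y alone. */
int sum_is_smaller(int x, int y)
{
    return (x + y) < x;
}
```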

And code inside if(false) is removed by GCC even at -O0, so you can't jump to it in GDB.

Some people care about runtime performance of debug builds, especially developers of real-time software like games or operating systems that isn't properly testable if it runs too slowly. (Human interaction in games, or maybe device drivers in OSes.)


Some other compilers are more braindead at -O0, so you do often see asm that looks even more like the source expressions. I think I've seen MSVC without optimization emit instructions that did a mov-immediate into a register, then cmp reg,imm, i.e. a branch at runtime that depends only on immediates, and thus could trivially have been resolved at compile time within that expression.

And of course there are truly non-optimizing compilers whose entire goal is just to transliterate with fixed patterns. For example, the Tiny C Compiler is, I think, pretty much one-pass, and emits asm (or machine code) as it goes along. The question "Tiny C Compiler's generated code emits extra (unnecessary?) NOPs and JMPs" shows just how simplistic it is: it always emits a sub esp, imm32 in function prologues, and only comes back to fill in the immediate at the end of the function, once it knows how much stack the function needs. Even if the answer is zero, it can't remove the instruction and tighten up the code.


It's usually more interesting to look at optimized asm anyway. Write functions that take args and return a value, so you can see the interesting part of the asm without a lot of boilerplate and store/reload noise. See "How to remove "noise" from GCC/clang assembly output?"
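Applied to the variable from the question, that advice could look like the sketch below. The function add_to_b is hypothetical; the point is that b is both read and written through a function argument, so the compiler has to keep it as a real object in memory.

```c
static int b = 20;

/* Takes an argument and returns a value, so the result can't be computed
   at compile time and the static can't be dropped. */
int add_to_b(int delta)
{
    b += delta;
    return b;
}
```

Compiled with something like gcc -O2 -S, the output should contain a short function body plus a definition of b with the value 20 in a data section.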

If a static variable hasn't been optimized out by the compiler, it will go in the process's default data section.

In assembly, the programmer can normally control that in the part of the source file that describes the data section.

The C Standard says in § 6.2.4 paragraph 3:

An object whose identifier is declared ... with the storage-class specifier static, has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup.
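The "initialized only once" part is easiest to see with a static declared inside a function. This is a minimal sketch with a made-up function name:

```c
/* id has static storage duration: it is initialized to 100 once, before
   program startup, and keeps its value between calls. */
int next_id(void)
{
    static int id = 100;
    return id++;
}
```

Each call returns the previous value plus one, which only works because id lives in static storage rather than on the stack.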

With the following code:

static int a = 100;

int foo()
{
    return (a / 2);
}

Look at how the symbol _a appears in the _DATA segment for MSVC, at lines 27-30 for GCC, and at lines 28-30 for Clang.

The whole question is a bit inaccurate... (Rereading it, you are actually very specific about "in the assembly above". Oh well, then the answer is "nowhere", and the rest of my answer is for a question that was not posted, but it hopefully explains why "nowhere" is the answer to yours.)

You have C source, then you show some assembly as compiler output (without specifying the compiler), and then you ask about assembly...

C is defined in terms of the "C abstract machine", while you are looking at a particular x86-64 implementation of that abstract machine.

While that implementation does have some rules about where static variables usually end up, it depends completely on the compiler and how it wants to implement them.

In pure assembly (hand-written, or from the CPU's point of view) there's no such thing as a "static value". You have only registers, memory and peripherals.

So in assembly (machine code) you can use a certain register or a certain part of memory as a static variable, whichever suits your needs better. There is no hard rule that forces you to do it in any particular way, except that you must express your idea in valid machine code for the target CPU. That usually leaves billions of possibilities, and even when you constrain yourself to the "reasonable" ones, there are still tens of possible ways rather than a single one.

You can (on x86-64) even create a somewhat convoluted scheme that keeps the value as code state (the "part of memory" is then the memory occupied by the machine code): it would not be written in memory directly as a value, but the code would follow certain code paths (out of many possible) to obtain the correct final result, i.e. the value is encoded in the code itself. There is, for example, a Turing-complete way to compile C source into x86-64 machine code using only the mov instruction, which maybe doesn't use memory for static variables (I'm not sure whether it adds a .data section or avoids it by compiling that into mov code too, but from its sheer existence it should be quite obvious how .data can theoretically be avoided).

So you are either asking how a particular C compiler with particular compile-time options implements static values (and that may have some variants depending on the source and options used)...

... or, if you are really asking "where are static values stored in assembly", then the answer is "anywhere you wish, as long as your machine code is valid and correct". The whole "static value" concept is at a higher level than the one the CPU operates at, so "that's the static value" is an interpretation of what a particular piece of machine code is for; there's no specific instruction or support in the CPU to handle it.

Static variables are not necessarily stored in memory. If the compiler can prove that the value never changes, the value may appear only where it is used. For example, with optimization enabled,

static int b = 20; c = c + b;

may compile to something like

add c, 20
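A compilable version of that fragment might look like the sketch below. The wrapper function add_b is an assumption added to make it a complete translation unit. Because b is never written and its address is never taken, an optimizing compiler is free to fold it into an immediate and emit no storage for it; without optimization it will usually be loaded from memory instead.

```c
static int b = 20;

/* b is only ever read here, so an optimizing compiler can treat it as the
   constant 20. */
int add_b(int c)
{
    c = c + b;
    return c;
}
```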
