
Where are static variables stored in C and C++?

In what segment (.BSS, .DATA, other) of an executable file are static variables stored so that they don't have name collisions? For example:


foo.c:                         bar.c:
static int foo = 1;            static int foo = 10;
void fooTest() {               void barTest() {
  static int bar = 2;            static int bar = 20;
  foo++;                         foo++;
  bar++;                         bar++;
  printf("%d,%d", foo, bar);     printf("%d, %d", foo, bar);
}                              }

If I compile both files and link them to a main that calls fooTest() and barTest() repeatedly, the printed values increment independently. That makes sense, since the foo and bar variables are local to the translation unit.

But where is the storage allocated?

To be clear, the assumption is that you have a toolchain that outputs a file in ELF format. Thus, I believe that there has to be some space reserved in the executable file for those static variables.
For discussion purposes, let's assume we use the GCC toolchain.

Where your statics go depends on whether they are zero-initialized. Zero-initialized static data goes in .BSS (Block Started by Symbol); non-zero-initialized data goes in .DATA.
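As a rough illustration of that rule, here is a minimal sketch, assuming a recent GCC targeting ELF. The variable and function names are made up for the example. If the file is compiled with `gcc -c`, running `nm` on the object file will typically list the first two variables with a `b` (local .bss symbol) and the third with a `d` (local .data symbol), though the exact placement is up to the toolchain.

```c
/* Zero-initialized, implicitly or explicitly: typically placed in .bss. */
static int zero_implicit;
static int zero_explicit = 0;

/* Non-zero initializer: typically placed in .data. */
static int nonzero = 42;

/* Reads all three so the compiler keeps them around. */
int sum_statics(void) {
    return zero_implicit + zero_explicit + nonzero;
}

/* Writes all three so they cannot be folded into constants. */
void bump_statics(void) {
    zero_implicit++;
    zero_explicit++;
    nonzero++;
}
```

Calling sum_statics() before any bump returns 42, since the two zero-initialized variables contribute nothing.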

When a program is loaded into memory, it's organized into different segments. One of the segments is the DATA segment. The data segment is further subdivided into two parts:

Initialized data segment: All explicitly initialized global, static and constant data are stored here.
Uninitialized data segment (BSS): All uninitialized global and static data are stored in this segment.

Here is a diagram to explain this concept:

[Image not available]


Here is a very good link explaining these concepts:

http://www.inf.udec.cl/~leo/teoX.pdf

In fact, a variable is a tuple (storage, scope, type, address, value):

storage     :   where is it stored, for example data, stack, heap...
scope       :   who can see us, for example global, local...
type        :   what is our type, for example int, int*...
address     :   where are we located
value       :   what is our value
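To make the tuple concrete, here is a small sketch (the names are hypothetical) that annotates two static variables with those five properties. It relies only on the language rule that an object with static storage duration keeps a single address for the whole program run, so a pointer to it stays valid after the function returns.

```c
/* storage: static; scope: this file; type: int; value: 7 */
static int file_counter = 7;

/* storage: static; scope: this function only; type: int; value: call count */
int *static_local_address(void) {
    static int calls;   /* zero-initialized before the program starts */
    calls++;
    return &calls;      /* valid: the object outlives the call */
}

/* address: fixed for the lifetime of the program */
int *file_static_address(void) {
    return &file_counter;
}
```

Two calls to static_local_address() return the same pointer, and the value behind it reflects both calls.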

Local scope could mean local to either the translation unit (source file), the function or the block, depending on where it is defined. To make a variable visible to more than one function, it definitely has to be in either the DATA or the BSS area (depending on whether it is initialized explicitly or not, respectively). It is then scoped accordingly to either all functions or the functions within the source file.

The storage location of the data will be implementation dependent.

However, the meaning of static is "internal linkage". Thus, the symbol is internal to the compilation unit (foo.c, bar.c) and cannot be referenced outside that compilation unit. So, there can be no name collisions.

I don't believe there will be a collision. Using static at the file level (outside functions) marks the variable as local to the current compilation unit (file). It's never visible outside the current file, so it never has to have a name that can be used externally.

Using static inside a function is different - the variable is only visible to the function (whether static or not); it's just that its value is preserved across calls to that function.

In effect, static does two different things depending on where it is. In both cases, however, the variable's visibility is limited in such a way that you can easily prevent namespace clashes when linking.
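The two uses of static described above can be sketched side by side. This is only an illustration with made-up names: `hits` has file scope and internal linkage, while `id` is visible only inside next_id() but keeps its value between calls.

```c
/* File scope: internal linkage, so other translation units cannot see it. */
static int hits = 0;

/* Block scope: only next_id() can name `id`, but it persists across calls. */
int next_id(void) {
    static int id = 0;
    hits++;
    return ++id;
}

/* Exposes the file-level counter through a function instead of a symbol. */
int total_hits(void) {
    return hits;
}
```

Another source file could define its own `static int hits` without any clash at link time, because neither name is exported.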

Having said that, I believe it would be stored in the DATA section, which tends to hold variables that are initialized to values other than zero. This is, of course, an implementation detail, not something mandated by the standard - it only cares about behaviour, not how things are done under the covers.

in the "global and static" area :)

There are several memory areas in C++:

  • heap
  • free store
  • stack
  • global & static
  • const

See here for a detailed answer to your question:

The following summarizes a C++ program's major distinct memory areas. Note that some of the names (e.g., "heap") do not appear as such in the draft [standard].

     Memory Area     Characteristics and Object Lifetimes
     --------------  ------------------------------------------------

     Const Data      The const data area stores string literals and
                     other data whose values are known at compile
                     time.  No objects of class type can exist in
                     this area.  All data in this area is available
                     during the entire lifetime of the program.

                     Further, all of this data is read-only, and the
                     results of trying to modify it are undefined.
                     This is in part because even the underlying
                     storage format is subject to arbitrary
                     optimization by the implementation.  For
                     example, a particular compiler may store string
                     literals in overlapping objects if it wants to.


     Stack           The stack stores automatic variables. Typically
                     allocation is much faster than for dynamic
                     storage (heap or free store) because a memory
                     allocation involves only pointer increment
                     rather than more complex management.  Objects
                     are constructed immediately after memory is
                     allocated and destroyed immediately before
                     memory is deallocated, so there is no
                     opportunity for programmers to directly
                     manipulate allocated but uninitialized stack
                     space (barring willful tampering using explicit
                     dtors and placement new).


     Free Store      The free store is one of the two dynamic memory
                     areas, allocated/freed by new/delete.  Object
                     lifetime can be less than the time the storage
                     is allocated; that is, free store objects can
                     have memory allocated without being immediately
                     initialized, and can be destroyed without the
                     memory being immediately deallocated.  During
                     the period when the storage is allocated but
                     outside the object's lifetime, the storage may
                     be accessed and manipulated through a void* but
                     none of the proto-object's nonstatic members or
                     member functions may be accessed, have their
                     addresses taken, or be otherwise manipulated.


     Heap            The heap is the other dynamic memory area,
                     allocated/freed by malloc/free and their
                     variants.  Note that while the default global
                     new and delete might be implemented in terms of
                     malloc and free by a particular compiler, the
                     heap is not the same as free store and memory
                     allocated in one area cannot be safely
                     deallocated in the other. Memory allocated from
                     the heap can be used for objects of class type
                     by placement-new construction and explicit
                     destruction.  If so used, the notes about free
                     store object lifetime apply similarly here.


     Global/Static   Global or static variables and objects have
                     their storage allocated at program startup, but
                     may not be initialized until after the program
                     has begun executing.  For instance, a static
                     variable in a function is initialized only the
                     first time program execution passes through its
                     definition.  The order of initialization of
                     global variables across translation units is not
                     defined, and special care is needed to manage
                     dependencies between global objects (including
                     class statics).  As always, uninitialized proto-
                     objects' storage may be accessed and manipulated
                     through a void* but no nonstatic members or
                     member functions may be used or referenced
                     outside the object's actual lifetime.

How to find it yourself with objdump -Sr

To actually understand what is going on, you must understand linker relocation. If you've never touched that, consider reading this post first.

Let's analyze a Linux x86-64 ELF example to see it ourselves:

#include <stdio.h>

int f() {
    static int i = 1;
    i++;
    return i;
}

int main() {
    printf("%d\n", f());
    printf("%d\n", f());
    return 0;
}

Compile with:

gcc -ggdb -c main.c

Disassemble the code with:

objdump -Sr main.o
  • -S disassembles the code with the original source intermingled
  • -r shows relocation information

Inside the disassembly of f we see:

 static int i = 1;
 i++;
4:  8b 05 00 00 00 00       mov    0x0(%rip),%eax        # a <f+0xa>
        6: R_X86_64_PC32    .data-0x4

and the .data-0x4 says that it will go to the first byte of the .data section.

The -0x4 is there because we are using RIP-relative addressing, hence the %rip in the instruction and R_X86_64_PC32.

It is required because RIP points to the following instruction, which starts 4 bytes after the start of the 00 00 00 00 field, and that field is what gets relocated. I have explained this in more detail at: https://stackoverflow.com/a/30515926/895245
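The arithmetic behind the -0x4 can be checked with a short sketch. It assumes the usual x86-64 ELF definition of R_X86_64_PC32 as S + A - P; the function names are made up for the example, and any addresses passed in would be hypothetical post-link values, not ones taken from the dump above.

```c
#include <stdint.h>

/* R_X86_64_PC32 stores S + A - P in a 32-bit field:
 *   s = address of the symbol (here, the start of .data)
 *   a = addend from the relocation entry (here, -4)
 *   p = address of the 4-byte field being patched
 * Assumes the result fits in 32 bits. */
int32_t pc32_displacement(uint64_t s, int64_t a, uint64_t p) {
    return (int32_t)(s + (uint64_t)a - p);
}

/* At run time RIP holds the address of the next instruction. For this mov,
 * the patched field is the last 4 bytes of the instruction, so RIP = p + 4. */
uint64_t effective_address(uint64_t p, int32_t disp) {
    uint64_t rip = p + 4;
    return rip + (uint64_t)(int64_t)disp;
}
```

With an addend of -4, effective_address(p, pc32_displacement(s, -4, p)) comes back to exactly s: the +4 from RIP and the -4 from the addend cancel.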

Then, if we modify the source to i = 0 and do the same analysis, we conclude that:

  • static int i = 0 goes on .bss
  • static int i = 1 goes on .data
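One visible consequence of that split, sketched here under the assumption of a GCC/ELF toolchain, is file size: .bss takes no space in the object file, while .data stores every initial byte. The array names and the size N are arbitrary choices for the example. After `gcc -c`, the `size` tool should report roughly 1 MiB under both data and bss, yet only the .data array makes the object file itself larger.

```c
#define N (1 << 20)   /* 1 MiB per array, chosen arbitrarily */

/* All zeros: typically counted under bss and not stored in the file. */
static char zeros[N];

/* One non-zero element: the whole array typically lands in .data. */
static char ones[N] = {1};

char first_zero(void) { return zeros[0]; }
char first_one(void)  { return ones[0]; }

/* Writes to both arrays so neither can be treated as a constant. */
void touch_arrays(void) {
    zeros[N - 1] = 1;
    ones[N - 1] = 1;
}
```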

This is how (easy to understand):

[Image: stack, heap and static data]

It depends on the platform and compiler that you're using. Some compilers store directly in the code segment. Static variables are only ever accessible to the current translation unit and their names are not exported, which is why name collisions never occur.

Data declared in a compilation unit will go into the .BSS or the .DATA of that file's output: initialised data in DATA, uninitialised data in BSS.

The difference between static and global data comes in the inclusion of symbol information in the file. Compilers tend to include the symbol information but only mark the global information as such.

The linker respects this information. The symbol information for the static variables is either discarded or mangled so that static variables can still be referenced in some way (with debug or symbol options). In either case the compilation units are not affected, as the linker resolves local references first.

I tried it with objdump and gdb; here is the result I got:

(gdb) disas fooTest
Dump of assembler code for function fooTest:
   0x000000000040052d <+0>: push   %rbp
   0x000000000040052e <+1>: mov    %rsp,%rbp
   0x0000000000400531 <+4>: mov    0x200b09(%rip),%eax        # 0x601040 <foo>
   0x0000000000400537 <+10>:    add    $0x1,%eax
   0x000000000040053a <+13>:    mov    %eax,0x200b00(%rip)        # 0x601040 <foo>
   0x0000000000400540 <+19>:    mov    0x200afe(%rip),%eax        # 0x601044 <bar.2180>
   0x0000000000400546 <+25>:    add    $0x1,%eax
   0x0000000000400549 <+28>:    mov    %eax,0x200af5(%rip)        # 0x601044 <bar.2180>
   0x000000000040054f <+34>:    mov    0x200aef(%rip),%edx        # 0x601044 <bar.2180>
   0x0000000000400555 <+40>:    mov    0x200ae5(%rip),%eax        # 0x601040 <foo>
   0x000000000040055b <+46>:    mov    %eax,%esi
   0x000000000040055d <+48>:    mov    $0x400654,%edi
   0x0000000000400562 <+53>:    mov    $0x0,%eax
   0x0000000000400567 <+58>:    callq  0x400410 <printf@plt>
   0x000000000040056c <+63>:    pop    %rbp
   0x000000000040056d <+64>:    retq   
End of assembler dump.

(gdb) disas barTest
Dump of assembler code for function barTest:
   0x000000000040056e <+0>: push   %rbp
   0x000000000040056f <+1>: mov    %rsp,%rbp
   0x0000000000400572 <+4>: mov    0x200ad0(%rip),%eax        # 0x601048 <foo>
   0x0000000000400578 <+10>:    add    $0x1,%eax
   0x000000000040057b <+13>:    mov    %eax,0x200ac7(%rip)        # 0x601048 <foo>
   0x0000000000400581 <+19>:    mov    0x200ac5(%rip),%eax        # 0x60104c <bar.2180>
   0x0000000000400587 <+25>:    add    $0x1,%eax
   0x000000000040058a <+28>:    mov    %eax,0x200abc(%rip)        # 0x60104c <bar.2180>
   0x0000000000400590 <+34>:    mov    0x200ab6(%rip),%edx        # 0x60104c <bar.2180>
   0x0000000000400596 <+40>:    mov    0x200aac(%rip),%eax        # 0x601048 <foo>
   0x000000000040059c <+46>:    mov    %eax,%esi
   0x000000000040059e <+48>:    mov    $0x40065c,%edi
   0x00000000004005a3 <+53>:    mov    $0x0,%eax
   0x00000000004005a8 <+58>:    callq  0x400410 <printf@plt>
   0x00000000004005ad <+63>:    pop    %rbp
   0x00000000004005ae <+64>:    retq   
End of assembler dump.

Here is the objdump result:

Disassembly of section .data:

0000000000601030 <__data_start>:
    ...

0000000000601038 <__dso_handle>:
    ...

0000000000601040 <foo>:
  601040:   01 00                   add    %eax,(%rax)
    ...

0000000000601044 <bar.2180>:
  601044:   02 00                   add    (%rax),%al
    ...

0000000000601048 <foo>:
  601048:   0a 00                   or     (%rax),%al
    ...

000000000060104c <bar.2180>:
  60104c:   14 00                   adc    $0x0,%al

So, that is to say, your four variables are all located in the data section, even though they share the same names, but at different offsets.
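The same effect can be reproduced inside a single file. This is a sketch with hypothetical function names: two function-local statics share the name `bar` but are separate objects at separate addresses. GCC typically tells them apart in the symbol table by appending a suffix, as with the bar.2180 symbols in the dump above, though the exact suffix scheme varies between compiler versions.

```c
/* Each function owns a distinct object that happens to be called `bar`. */
int first_counter(void) {
    static int bar = 2;
    return ++bar;
}

int second_counter(void) {
    static int bar = 20;
    return ++bar;
}
```

Interleaving calls to the two functions shows that incrementing one `bar` never affects the other.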

Static variables are stored in the data segment or the code segment, as mentioned before.
You can be sure that they will not be allocated on the stack or the heap.
There is no risk of collision, since the static keyword limits the scope of the variable to a file or function; in case of a collision, the compiler or linker will warn you about it.
A nice example一个很好的例子

Well, this question is a bit too old, but since nobody has pointed out any useful information: check the post by "mohit12379" explaining how static variables with the same name are stored in the symbol table: http://www.geekinterview.com/question_details/24745

The answer might very well depend on the compiler, so you probably want to edit your question (I mean, even the notion of segments is not mandated by ISO C or ISO C++). For instance, on Windows an executable doesn't carry symbol names. One 'foo' would be at offset 0x100, the other perhaps at 0x2B0, and code from both translation units is compiled knowing the offsets for "their" foo.

They're both going to be stored independently, but if you want to make that clear to other developers, you might want to wrap them in namespaces.

You already know that it is stored either in bss (block started by symbol), also referred to as the uninitialized data segment, or in the initialized data segment.

Let's take a simple example:

int main(void)
{
    static int i;   /* no initializer: zero-initialized */
    return 0;
}

The static variable above is not initialized, so it goes to the uninitialized data segment (bss).

int main(void)
{
    static int i = 10;   /* non-zero initializer */
    return 0;
}

And of course this one is initialized to 10, so it goes to the initialized data segment.


 