C++ garbage collection

There are a number of garbage collection libraries for C++.

I am kind of confused how the pointer tracking works.

In particular, suppose we have a base pointer P and a list of other pointers that are computed as offsets from P using an array.

For example:

P2 = P+offset[0]

How does the garbage collector know P2 is still in scope? Nothing refers to it directly, but it's still accessible.

Probably the most popular C++ GC is the Boehm garbage collector:

https://en.m.wikipedia.org/wiki/Boehm_garbage_collector

But following their example syntax, it seems very easy to break, so I must be misunderstanding something.

This question cannot be answered in general, but it can be answered for how the Boehm garbage collector recognizes pointers during its "mark" phase. The collector knows which areas of memory hold user data, and it knows the address and size of every block it has allocated. It looks for chains of pointers starting from the "root segments" listed below, where "look" means explicitly scanning memory for 64-bit values (on a 64-bit platform) that match the address of one of its allocations. In the question's example, the block P points to is kept for as long as P itself is still stored somewhere the collector scans, and P2 stays valid along with it.
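To make the scanning idea concrete, here is a minimal toy sketch of a conservative mark pass. It is not Boehm's code: `Allocation`, `find_allocation`, `mark_region`, and the `allow_interior` flag are hypothetical names made up for illustration. The sketch assumes the collector keeps a simple list of (base, size) records and that pointers are stored at word-aligned offsets. The `allow_interior` flag shows the difference between matching only the first address of a block and also matching an address inside it, such as P + offset[0]. Whether a real collector accepts such interior pointers depends on how it is configured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record of one block handed out by a toy collector.
struct Allocation {
    std::uintptr_t base;  // address of the first byte of the block
    std::size_t size;     // size of the block in bytes
    bool marked;          // set once the block has been found reachable
};

// Return the allocation that `value` refers to, or nullptr if it refers to none.
// With allow_interior, an address anywhere inside a block also counts.
Allocation* find_allocation(std::vector<Allocation>& heap, std::uintptr_t value,
                            bool allow_interior) {
    for (Allocation& a : heap) {
        if (value == a.base) return &a;
        if (allow_interior && value > a.base && value - a.base < a.size) return &a;
    }
    return nullptr;
}

// Conservatively scan `bytes` bytes starting at `begin`, one word at a time.
// Any word that looks like a pointer to a known block marks that block, and
// the block's own contents are then scanned in the same way.
void mark_region(std::vector<Allocation>& heap, const void* begin,
                 std::size_t bytes, bool allow_interior) {
    assert(begin != nullptr || bytes == 0);
    const unsigned char* p = static_cast<const unsigned char*>(begin);
    for (std::size_t off = 0; off + sizeof(std::uintptr_t) <= bytes;
         off += sizeof(std::uintptr_t)) {
        std::uintptr_t word;
        std::memcpy(&word, p + off, sizeof word);  // read without alignment assumptions
        Allocation* a = find_allocation(heap, word, allow_interior);
        if (a != nullptr && !a->marked) {
            a->marked = true;  // mark first so that cycles terminate
            mark_region(heap, reinterpret_cast<const void*>(a->base), a->size,
                        allow_interior);
        }
    }
}
```

In this sketch the scan cannot tell a pointer from an integer that happens to hold the same bits, which is why such a collector is called conservative. Any block still unmarked after all roots have been scanned would be the one a sweep phase could reclaim.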

From the collector's documentation:

Since it cannot generally tell where pointer variables are located, it scans the following root segments for pointers:

  • The registers. Depending on the architecture, this may be done using assembly code, or by calling a setjmp-like function which saves register contents on the stack.
  • The stack(s). In the case of a single-threaded application, on most platforms this is done by scanning the memory between (an approximation of) the current stack pointer and GC_stackbottom. (For Itanium, the register stack scanned separately.) The GC_stackbottom variable is set in a highly platform-specific way depending on the appropriate configuration information in gcconfig.h. Note that the currently active stack needs to be scanned carefully, since callee-save registers of client code may appear inside collector stack frames, which may change during the mark process. This is addressed by scanning some sections of the stack "eagerly", effectively capturing a snapshot at one point in time.
  • Static data region(s). In the simplest case, this is the region between DATASTART and DATAEND, as defined in gcconfig.h. However, in most cases, this will also involve static data regions associated with dynamic libraries. These are identified by the mostly platform-specific code in dyn_load.c.
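The first two root segments in the list above (registers and the stack) can be illustrated with a rough sketch. It is not taken from the collector's source: `with_stack_roots` and its callback are hypothetical names, and the caller is assumed to supply the stack bottom itself, whereas the real collector determines it in a platform-specific way. The sketch only computes the address range that a scan would cover and does not read the stack.

```cpp
#include <algorithm>
#include <csetjmp>
#include <cstdint>
#include <functional>

// Hypothetical helper, not part of any real collector's API.
// Calls scan(lo, hi) with the address range between an approximation of the
// current stack pointer and `stack_bottom`, while this frame is still alive.
void with_stack_roots(const void* stack_bottom,
                      const std::function<void(std::uintptr_t, std::uintptr_t)>& scan) {
    std::jmp_buf regs;
    // setjmp saves register state into `regs`. Because `regs` is a local, that
    // state now sits on the stack where a stack scan can see it.
    (void)setjmp(regs);

    // The address of a local in the current frame approximates the stack pointer.
    auto here = reinterpret_cast<std::uintptr_t>(&regs);
    auto bottom = reinterpret_cast<std::uintptr_t>(stack_bottom);

    // The stack may grow toward lower or higher addresses, so order the ends.
    scan(std::min(here, bottom), std::max(here, bottom));
}
```

A caller would pass the address of a local variable near the start of `main` as `stack_bottom`, and a callback that scans the words in `[lo, hi)` for values matching known allocations. Some C libraries obfuscate part of what `setjmp` saves, which is one reason a real collector relies on platform-specific code here rather than on plain `setjmp` alone.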

The address space for 64-bit pointers is huge, so false positives will be rare. Even when one occurs, it is just a leak: the block stays allocated for as long as some other value in the memory scanned by the mark phase happens to be exactly equal to a 64-bit pointer that the garbage collector allocated.
