
what happens when typeid(obj) is compiled - C++

I have a sample class in my program like below

template<class T>
class MyTemplate1
{
public:
    T a;

    MyTemplate1(T other){
        a = other;
    }
};

In my main program, if I just create an object of type MyTemplate1<int>, the readelf output does not show any typeinfo objects. But if I add some code like the following:

MyTemplate1<int> obj = 12;
if(typeid(obj) == typeid(MyTemplate1<float>))
   //some code

the readelf output shows typeinfo for MyTemplate1<int> and typeinfo for MyTemplate1<float>.

$ readelf -s -W <objfile> | findstr -I "MyTemplate"
9023: 00000000     8 OBJECT  WEAK   DEFAULT 2899 _ZTI11MyTemplate1IfE
9024: 00000000     8 OBJECT  WEAK   DEFAULT 2894 _ZTI11MyTemplate1IiE

Could somebody please explain what these OBJECTs correspond to? Are these global instances of std::type_info for the class MyTemplate1? What exactly is happening under the hood?

You do not need to construct any objects of type MyTemplate1<T> in a compilation unit to see typeinfo objects for instantiations of that template in the global symbol table of the object file. You need only refer to the typeid of such a class:

$ cat main.cpp
#include <typeinfo>

template<class T>
class MyTemplate1
{
public:
    T a;

    MyTemplate1(T other){
        a = other;
    }
};

int main(void)
{
    return (typeid(MyTemplate1<int>) == typeid(MyTemplate1<float>));
}

$ clang++ -Wall -c main.cpp
$ readelf -s -W main.o | grep MyTemplate1
     5: 0000000000000000    16 OBJECT  WEAK   DEFAULT   15 _ZTI11MyTemplate1IfE
     6: 0000000000000000    16 OBJECT  WEAK   DEFAULT   10 _ZTI11MyTemplate1IiE
     7: 0000000000000000    17 OBJECT  WEAK   DEFAULT   13 _ZTS11MyTemplate1IfE
     8: 0000000000000000    17 OBJECT  WEAK   DEFAULT    8 _ZTS11MyTemplate1IiE

$ c++filt _ZTI11MyTemplate1IfE
typeinfo for MyTemplate1<float>
$ c++filt _ZTI11MyTemplate1IiE
typeinfo for MyTemplate1<int>
$ c++filt _ZTS11MyTemplate1IfE
typeinfo name for MyTemplate1<float>
$ c++filt _ZTS11MyTemplate1IiE
typeinfo name for MyTemplate1<int>

These typeinfo objects are there because, as @Peter commented, the C++ standard requires that typeid refers to an object of static storage duration.
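A minimal sketch of what that requirement means in practice, assuming the same MyTemplate1 as above. The helper names int_instance_info and check_typeinfo_lifetime are hypothetical and exist only for this illustration. The reference returned by the helper stays usable after the function returns, which would not be safe if typeid produced a temporary:

```cpp
#include <cassert>
#include <typeinfo>

template<class T>
class MyTemplate1
{
public:
    T a;

    MyTemplate1(T other) : a(other) {}
};

// Hypothetical helper: returns a reference to the type_info object that
// typeid designates. The object is not local to this function, so the
// reference remains valid after the function returns.
const std::type_info& int_instance_info()
{
    return typeid(MyTemplate1<int>);
}

// Hypothetical helper: compares type_info objects obtained at different
// places in the program.
void check_typeinfo_lifetime()
{
    const std::type_info& first = int_instance_info();
    const std::type_info& second = typeid(MyTemplate1<int>);

    assert(first == second);                      // same type, equal type_info
    assert(first != typeid(MyTemplate1<float>));  // different instantiation
}
```

Note that the sketch compares the type_info objects with operator== rather than comparing their addresses; whether two typeid expressions for the same type yield the same address is not something this example relies on.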

What exactly is happening under the hood?

You may wonder: Why does the compiler make these typeinfo object symbols weak rather than simply global? Why does it define them in different sections of the object file? (sections 10 and 15 of my object file, sections 2894 and 2899 of yours).

And if we check what else is in these sections:

$ readelf -s main.o | egrep '(10 |15 )'
     5: 0000000000000000    16 OBJECT  WEAK   DEFAULT   15 _ZTI11MyTemplate1IfE
     6: 0000000000000000    16 OBJECT  WEAK   DEFAULT   10 _ZTI11MyTemplate1IiE

we see that each object is the only thing in its section. Why so?

In my main.o, those sections 10 and 15 are:

$ readelf -t main.o | egrep '(\[10\]|\[15\])'
  [10] .rodata._ZTI11MyTemplate1IiE
  [15] .rodata._ZTI11MyTemplate1IfE

Each of those is a read-only data-section in the sense of:

__attribute__((section(".rodata._ZTI11MyTemplate1IiE")))
__attribute__((section(".rodata._ZTI11MyTemplate1IfE")))

that contains nothing but the definition of the object after which it is named.
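As a rough hand-written analogue, the sketch below uses the GCC/Clang section attribute to place one read-only object in a section named after it. The object name hand_placed_value, the section name and the accessor function are hypothetical, chosen only for illustration. The attribute is a compiler extension, and this form of section name assumes an ELF target such as Linux:

```cpp
// Declared extern so that the const object has external linkage and is
// actually emitted into the object file.
extern const int hand_placed_value;

// GCC/Clang extension: put the definition in its own read-only section,
// named after the object, instead of the shared .rodata section.
__attribute__((section(".rodata.hand_placed_value")))
const int hand_placed_value = 42;

// Hypothetical accessor so that other code can read the object.
int read_hand_placed_value()
{
    return hand_placed_value;
}
```

Compiling this with -c and running readelf -t on the result should list a section called .rodata.hand_placed_value, in the same way that the listing above shows one section per typeinfo object.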

The compiler gives each of the objects a data-section all to itself for the same reason that it makes the symbols WEAK. References to typeid(MyTemplate1<X>), for an arbitrary type X, might be made in multiple translation units within the same linkage that #include the definition of MyTemplate1. To head off a multiple-definition error at link time in such cases, the compiler makes the symbols weak. The linker will tolerate multiple definitions of a weak symbol, resolving all references simply to the first definition that presents itself and ignoring the rest. By dedicating a unique data-section (or function-section, as appropriate) to the definition of each weak template-instantiating symbol, the compiler gives the linker freedom to discard any surplus data or function sections that define the same weak symbol, without risk of collateral damage to the program. See:

$ cat MyTemplate1.hpp
#pragma once

template<class T>
class MyTemplate1
{
public:
    T a;

    MyTemplate1(T other){
        a = other;
    }
};

$ cat foo.cpp
#include "MyTemplate1.hpp"
#include <typeinfo>

int foo()
{
    return typeid(MyTemplate1<int>) == typeid(MyTemplate1<float>);
}

$ cat bar.cpp
#include "MyTemplate1.hpp"
#include <typeinfo>

int bar()
{
    return typeid(MyTemplate1<int>) != typeid(MyTemplate1<float>);
}

$ cat prog.cpp
extern int foo();
extern int bar();

int main()
{
    return foo() && bar();
}

If we compile:

$ clang++ -Wall -c prog.cpp foo.cpp bar.cpp

and link (with some diagnostics) like this:

$ clang++ -o prog prog.o bar.o foo.o \
         -Wl,-trace-symbol=_ZTI11MyTemplate1IfE \
         -Wl,-trace-symbol=_ZTI11MyTemplate1IiE \
         -Wl,-Map=mapfile
/usr/bin/ld: bar.o: definition of _ZTI11MyTemplate1IfE
/usr/bin/ld: bar.o: definition of _ZTI11MyTemplate1IiE
/usr/bin/ld: foo.o: reference to _ZTI11MyTemplate1IfE
/usr/bin/ld: foo.o: reference to _ZTI11MyTemplate1IiE

inputting bar.o before foo.o, then the linker chooses the definitions of _ZTI11MyTemplate1I(f|i)E from bar.o and disregards the definitions in foo.o, resolving the references in foo.o to the definitions in bar.o. And the mapfile shows:

mapfile (1)

...
Discarded input sections
...
 .rodata._ZTI11MyTemplate1IiE
                0x0000000000000000       0x10 foo.o
...
 .rodata._ZTI11MyTemplate1IfE
                0x0000000000000000       0x10 foo.o
...

that the definitions in foo.o were thrown away. If we relink with the order of bar.o and foo.o reversed:

$ clang++ -o prog prog.o foo.o bar.o \
             -Wl,-trace-symbol=_ZTI11MyTemplate1IfE \
             -Wl,-trace-symbol=_ZTI11MyTemplate1IiE \
             -Wl,-Map=mapfile
/usr/bin/ld: foo.o: definition of _ZTI11MyTemplate1IfE
/usr/bin/ld: foo.o: definition of _ZTI11MyTemplate1IiE
/usr/bin/ld: bar.o: reference to _ZTI11MyTemplate1IfE
/usr/bin/ld: bar.o: reference to _ZTI11MyTemplate1IiE

then we get the opposite results. The definitions from foo.o are linked and:

mapfile (2)

...
Discarded input sections
...
 .rodata._ZTI11MyTemplate1IiE
                0x0000000000000000       0x10 bar.o
...
 .rodata._ZTI11MyTemplate1IfE
                0x0000000000000000       0x10 bar.o
...

the ones in bar.o are thrown away. This first-come-first-served principle of the linker is fine because, and only because, the definition of template<class T> MyTemplate1 that the compiler found in the translation unit foo.cpp was identical to the one it found in bar.cpp. That is a condition that the C++ Standard requires, in the One Definition Rule, but which the compiler, seeing only one translation unit at a time, can do nothing to enforce.

You can make essentially the same observations about template-instantiating symbols in general, and what you see with clang++ is essentially the same as what you'll see with g++.
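For instance, a function template behaves the same way. The sketch below is an assumption-laden illustration, not part of the original question: the names twice and use_twice are hypothetical. Compiled without optimization using g++ or clang++ on an ELF target, each instantiation used here would be expected to appear in the object file as a weak function symbol in a text section of its own, which can be checked with readelf -s -W as above. With optimization enabled the calls may be inlined and the symbols may not be emitted at all.

```cpp
// Hypothetical function template, used only to force two instantiations.
template<class T>
T twice(T value)
{
    return value + value;
}

// Calling twice with an int and with a double makes the compiler
// instantiate twice<int> and twice<double> in this translation unit.
int use_twice()
{
    int from_int = twice(20);                          // twice<int>
    int from_double = static_cast<int>(twice(1.0));    // twice<double>
    return from_int + from_double;
}
```

If another translation unit also called twice(20), both object files would carry their own weak definition of twice<int>, and the linker would keep one and discard the other exactly as it did for the typeinfo objects.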
