
Weird behavior of reinterpret_cast

I just found a bug in my code caused by reinterpret_cast; after I changed it to dynamic_cast, the problem went away. I then tried to reproduce the behavior with the following code:

  1 #include <iostream>
  2
  3 using namespace std;
  4
  5 typedef enum {
  6     ONE, TWO,
  7 } Num;
  8
  9 class B {
 10     public:
 11         virtual Num f() const = 0;
 12 };
 13
 14 class D: public B {
 15     public:
 16         virtual Num f() const { return TWO; }
 17 };
 18
 19 int main()
 20 {
 21     B *b = new D();
 22     cout << "f()=" << reinterpret_cast<D*>(b)->f() << endl << endl;
 23
 24     return 0;
 25 }

This code is a simplified version of the bug I just fixed. I am trying to reproduce it so that, with reinterpret_cast, a wild enum value is returned on line 22, and after changing it to dynamic_cast, the right enum value is returned.

BUT the above code actually runs fine: it prints "1", which is enum TWO.

Maybe my simplification is flawed, but do you see any way the above code could have a problem with reinterpret_cast?

Yes, I know using reinterpret_cast here does not make sense; I would just like to know what is going on.

You're seeing this because the B that is contained in a D does not exist at the same address as the D itself. This probably has to do with the implementation of virtual method dispatch tables. The language makes no such guarantee since neither B nor D is a POD type.

reinterpret_cast is basically you telling the compiler "take the bit pattern of this value and treat it as some other type, without changing it."
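A minimal sketch of that claim, using hypothetical `Base` and `Derived` types that are not from the question. The function only compares addresses and never dereferences the cast result, so it is safe to call even with an object that is not really a `Derived`.

```cpp
// Hypothetical types for illustration only.
struct Base {
    virtual ~Base() = default;
    int x = 0;
};

struct Derived : Base {
    int y = 0;
};

// Returns true if reinterpret_cast produced the very same address it was given.
// The result is only compared, never dereferenced.
bool address_unchanged(Base* b) {
    Derived* d = reinterpret_cast<Derived*>(b);  // no adjustment, no type check
    return static_cast<void*>(d) == static_cast<void*>(b);
}
```

Calling `address_unchanged` with the address of a `Derived` object or of a plain `Base` object should return true in both cases: the cast does not look at what the pointer actually points to.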

You touched on something I am looking for, the dispatch table, but it is not so clear. Can you elaborate?

C++ does not dictate the particulars of a compiler's and/or ABI's implementation. Thus it's not guaranteed that reinterpret_cast and dynamic_cast will do the same thing. You've stumbled onto a case (single-inheritance hierarchy, as AndreyT pointed out) where it does on your compiler.

When you declare a class as having virtual members, it's no longer a POD type where the stored object's contents are exactly as written in the class declaration, because the compiler adds a hidden "virtual table pointer", typically at the start of the object's storage. This pointer points to a per-class table containing the particulars of that class's virtual members, such as function pointers to the virtual methods for that class. For example, you wrote:

class B {
    public:
        virtual Num f() const = 0;
};

but what is stored for a B is probably something like:

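struct B;  // forward declaration, needed by the function-pointer type

struct __VirtualsForB {
    // `this` is a keyword, so the implicit object argument is spelled `self` here.
    Num (*f)(const B* self);
};

struct B {
    const __VirtualsForB* const __vtbl;
};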
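The hidden pointer cannot be named in real code, but its effect can be observed. Here is a small sketch with hypothetical `Plain` and `WithVirtual` types (not from the question); the exact size difference is implementation-specific, so the code only reports it.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical types for illustration only.
struct Plain {
    int value;
};

struct WithVirtual {
    int value;
    virtual int get() const { return value; }
    virtual ~WithVirtual() = default;
};

// A class with virtual functions is polymorphic and no longer standard-layout.
static_assert(std::is_standard_layout<Plain>::value, "Plain is standard-layout");
static_assert(std::is_polymorphic<WithVirtual>::value, "WithVirtual is polymorphic");
static_assert(!std::is_standard_layout<WithVirtual>::value,
              "virtual functions rule out standard layout");

// Extra bytes spent on the hidden pointer plus any padding (implementation-specific).
std::size_t hidden_overhead() {
    return sizeof(WithVirtual) - sizeof(Plain);
}
```

On common implementations `hidden_overhead()` is at least the size of one pointer, which matches the `__vtbl` member in the sketch above.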

Then you write:

class D: public B {
    public:
        // BTW, don't need to say `virtual` here. Virtual-ness is inherited.
        virtual Num f() const { return TWO; }
};

and storage for that looks like:

struct __VirtualsForD {
    __VirtualsForB __super;
};

// This exists since D is not abstract.
extern const __VirtualsForD __virtuals_for_D;

struct D {
    const __VirtualsForD* const __vtbl;
};

Also, the compiler auto-generates some code and your virtual method (which probably can't be inlined even though you wrote it that way, since there needs to be a pointer to it for the virtual table to work):

Num __D__f(const B* __in_this)
{
    // `this` is a keyword, so the recovered object pointer is called `self` here.
    const D* self = static_cast<const D*>(__in_this);
    return TWO;
}

const __VirtualsForD __virtuals_for_D = { __D__f } ;

Then when you write:

B *b = new D();

that turns into something like:

// Allocate a D.
D* _new_D = (D*)operator new(sizeof(D));
// Construct the D.
_new_D->__vtbl = &__virtuals_for_D;
B *b = static_cast<B*>(_new_D);

And due to some coincidental facts of this implementation:

  1. The first thing in the virtual table for D is the virtual table for B.
  2. The first thing in a B or D is a pointer to the object's class's virtual table.

it just so happens that reinterpret_cast and dynamic_cast do the same thing (namely, nothing) and your cout << reinterpret_cast<D*>(b)->f() succeeds. Which, by the way, turns into something like:

B* __temp = static_cast<B*>(reinterpret_cast<D*>(b));
Num __temp2 = __temp->__vtbl->f(__temp);
cout.operator<<(__temp2);
// ...

If either of these conditions were not true, as is often the case in multiple inheritance or inheritance with virtual base classes, then reinterpret_cast would fail like you expected it to.
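To make the multiple-inheritance case concrete, here is a sketch with hypothetical `Left`, `Right` and `Both` classes (none of them are from the question). It only compares addresses and never dereferences the reinterpret_cast result. That the second base sits at a nonzero offset is an assumption about the implementation, though it holds on the common ABIs.

```cpp
// Hypothetical types for illustration only.
struct Left {
    virtual ~Left() = default;
    int l = 1;
};

struct Right {
    virtual ~Right() = default;
    int r = 2;
};

struct Both : Left, Right {
    int b = 3;
};

// static_cast knows the layout and always recovers the address of the full object.
bool static_cast_recovers_object(Both& obj) {
    Right* r = &obj;  // implicit upcast, may adjust the address
    return static_cast<Both*>(r) == &obj;
}

// True when the Right subobject does not live at the start of the Both object.
bool right_base_is_offset(Both& obj) {
    Right* r = &obj;
    return static_cast<void*>(r) != static_cast<void*>(&obj);
}

// True when reinterpret_cast ends up at a different address than the real object.
// The wrong pointer is only compared, never dereferenced.
bool reinterpret_cast_misses_object(Both& obj) {
    Right* r = &obj;
    Both* raw = reinterpret_cast<Both*>(r);  // address copied as-is
    return static_cast<void*>(raw) != static_cast<void*>(&obj);
}
```

With a layout like this, calling a virtual function through `raw` would read the wrong part of the object, which is one plausible way to get a "wild" enum value.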

Whether this works is up to the implementation; the standard does not guarantee it.

reinterpret_cast does not adjust the pointer or check the type, so it will not handle the type dispatch correctly. If you already know the type you want to use, then use static_cast. If you don't know the type you want to use, then use dynamic_cast and make sure the pointer returned by dynamic_cast is valid (not null).
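A short sketch of that advice, reusing `B` and `D` from the question. The virtual destructor, the member `only_in_D`, and the sibling class `Other` are assumptions added for illustration.

```cpp
enum Num { ONE, TWO };

class B {
public:
    virtual ~B() = default;  // added here; not in the question's code
    virtual Num f() const = 0;
};

class D : public B {
public:
    Num f() const override { return TWO; }
    int only_in_D() const { return 42; }  // hypothetical member, not in the question
};

// Hypothetical sibling class, used to show a failed downcast.
class Other : public B {
public:
    Num f() const override { return ONE; }
};

// Type unknown: dynamic_cast checks at run time and yields nullptr on mismatch.
int checked_call(const B* b) {
    if (const D* d = dynamic_cast<const D*>(b)) {
        return d->only_in_D();
    }
    return -1;  // not a D
}

// Type known to be D: static_cast is enough.
// Passing a pointer to anything else is undefined behavior.
int unchecked_call(const B* b) {
    return static_cast<const D*>(b)->only_in_D();
}
```

Given a `D d;` and an `Other o;`, `checked_call(&d)` returns 42 and `checked_call(&o)` returns -1, while `unchecked_call` must only ever be given a pointer that really refers to a `D`.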

In a single-inheritance hierarchy, when the very top class of the hierarchy is already polymorphic, all casts along the hierarchy do the same thing: nothing. They just "conceptually" reinterpret the pointer value as a value of different type. Only dynamic_cast will perform some additional checks when used for downcasts.

For this reason reinterpret_cast happens to "work" for this purpose. And there's no way to force it not to work, given your class definitions.
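This can be checked without calling anything through the reinterpret_cast result. The sketch below reuses `B` and `D` from the question (the virtual destructor is an addition) and only compares the pointers. That all three agree is an observation about common implementations, not something the standard promises.

```cpp
enum Num { ONE, TWO };

class B {
public:
    virtual ~B() = default;  // added here; not in the question's code
    virtual Num f() const = 0;
};

class D : public B {
public:
    Num f() const override { return TWO; }
};

// Precondition: b really points to a D.
// Returns true when all three casts yield the same pointer.
bool all_casts_agree(B* b) {
    D* s = static_cast<D*>(b);       // compile-time adjustment (none needed here)
    D* d = dynamic_cast<D*>(b);      // run-time checked
    D* r = reinterpret_cast<D*>(b);  // address copied as-is
    return s == d && s == r;
}
```

For a `D obj;`, `all_casts_agree(&obj)` is expected to return true on the usual compilers, which is why the simplified program prints the right value.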


 