
What is the meaning of `*dynamic_cast<T*>(…)`?

Recently I was looking at the code of an open source project, and I saw a bunch of statements of the form T & object = *dynamic_cast<T*>(ptr);

(Actually, this was occurring in a macro used to declare many functions following a similar pattern.)

To me this looked like a code smell. My reasoning was: if you know the cast will succeed, then why not use a static_cast? If you aren't sure, then shouldn't you use an assert to test? After all, the compiler can assume that any pointer you dereference with * is not null.

I asked one of the devs on IRC about it, and he said that he considers a static_cast downcast to be unsafe. They could add an assert, but even if they don't, he says you will still get a null pointer dereference and crash when object is actually used. (Because, on failure, the dynamic_cast will convert the pointer to null, then when you access any member, you will be reading from some address of value very close to zero, which the OS won't allow.) If you use a static_cast, and it goes bad, you might just get some memory corruption. So by using the *dynamic_cast option, you are trading off speed for slightly better debuggability. You aren't paying for the assert; instead you are basically relying on the OS to catch the nullptr dereference. At least, that's what I understood.

I accepted that explanation at the time, but it bothered me and I thought about it some more.

Here's my reasoning.

If I understand the standard right, a static_cast pointer cast basically means to do some fixed pointer arithmetic. That is, if I have A * a, and I static_cast it to a related type B *, what the compiler is actually going to do is add some offset to the pointer, the offset depending only on the layout of the types A and B (and potentially on the C++ implementation). This theory can be tested by casting the pointers to void * and outputting them, before and after the static_cast. I expect that if you look at the generated assembly, the static_cast will turn into "add some fixed constant to the register corresponding to the pointer."
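The experiment described above could look roughly like the following sketch. The types Left, Right and Both and the function downcast_offset are made up for illustration. Multiple inheritance is used only because it makes a non-zero offset likely on common ABIs; the standard does not promise any particular value.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration; they are not from the project discussed above.
struct Left  { virtual ~Left() = default;  int l = 1; };
struct Right { virtual ~Right() = default; int r = 2; };
struct Both : Left, Right { int b = 3; };

// Distance in bytes between a Right* and the Both* that static_cast derives from it.
// Precondition: base really points at the Right subobject of a Both; otherwise the
// static_cast is undefined behavior.
std::ptrdiff_t downcast_offset(Right* base) {
    assert(base != nullptr);
    Both* derived = static_cast<Both*>(base);
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(derived);
}
```

Calling downcast_offset on the Right subobject of several different Both objects should report the same number every time, which is consistent with the "fixed offset" reading of static_cast.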

A dynamic_cast pointer cast means: first check the RTTI, and only do the static cast if it is valid based on the dynamic type. If it is not, then return nullptr. So, I'd expect that the compiler will at some point expand an expression dynamic_cast<B*>(ptr), where ptr is of type A*, into an expression like

(__validate_dynamic_cast_A_to_B(ptr) ? static_cast<B*>(ptr) : nullptr)

However, if we then * the result of the dynamic_cast, * of nullptr is UB, so we are implicitly promising that the nullptr branch never happens. And conforming compilers are permitted to "reason backwards" from that and eliminate null checks, a point driven home in Chris Lattner's famous blog post.

If the test function __validate_dynamic_cast_A_to_B(ptr) is opaque to the optimizer, i.e. it might have side effects, then the optimizer can't get rid of it, even if it "knows" the nullptr branch doesn't happen. However, this function is probably not opaque to the optimizer; the optimizer probably has a very good understanding of its possible side effects.

So, my expectation is that the optimizer will essentially convert *dynamic_cast<T*>(ptr) into *static_cast<T*>(ptr), and that interchanging these should give the same generated assembly.

If true, that would justify my original argument that *dynamic_cast<T*> is a code smell, even if you don't really care about UB in your code and only care about what "actually" happens. Because, if a conforming compiler would be permitted to change it to a static_cast silently, then you aren't getting the safety that you think you are, so you should either explicitly static_cast or explicitly assert. At least, that would be my vote in a code review. I'm trying to figure out if that argument is actually right.
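To make the "explicitly assert" alternative concrete, here is a minimal sketch of one way to write it. The helper checked_downcast and the types Shape and Circle are hypothetical names, not something taken from the project in question. The idea is that the dynamic_cast check only exists in debug builds, and the release build pays for a plain static_cast.

```cpp
#include <cassert>

// Hypothetical helper: verifies the downcast with dynamic_cast when asserts are
// enabled, and reduces to a static_cast when NDEBUG is defined.
// Base must be polymorphic and must not be a virtual base of Derived.
template <typename Derived, typename Base>
Derived& checked_downcast(Base* ptr) {
    assert(dynamic_cast<Derived*>(ptr) != nullptr && "bad downcast");
    return *static_cast<Derived*>(ptr);
}

// Hypothetical example hierarchy.
struct Shape { virtual ~Shape() = default; };
struct Circle final : Shape { int radius = 3; };

int radius_of(Shape* s) {
    return checked_downcast<Circle>(s).radius;
}
```

With this shape, a failed cast is reported by the assert at the cast site rather than by a crash at some later member access.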


Here is what the standard says about dynamic_cast:

[5.2.7] Dynamic Cast [expr.dynamic.cast]
1. The result of the expression dynamic_cast<T>(v) is the result of converting the expression v to type T. T shall be a pointer or reference to a complete class type, or "pointer to cv void." The dynamic_cast operator shall not cast away constness.
...
8. If C is the class type to which T points or refers, the run-time check logically executes as follows:
(8.1) - If, in the most derived object pointed (referred) to by v, v points (refers) to a public base class subobject of a C object, and if only one object of type C is derived from the subobject pointed (referred) to by v the result points (refers) to that C object.
(8.2) - Otherwise, if v points (refers) to a public base class subobject of the most derived object, and the type of the most derived object has a base class, of type C, that is unambiguous and public, the result points (refers) to the C subobject of the most derived object.
(8.3) - Otherwise, the run-time check fails.

Assuming that the hierarchy of classes is known at compile-time, the relative offsets of each of these classes within each other's layouts are also known. If v is a pointer to type A, and we want to cast it to a pointer of type B, and the cast is unambiguous, then the shift that v must take is a compile-time constant. Even if v actually points to an object of a more derived type C, that fact doesn't change where the A subobject lies relative to the B subobject, right? So no matter what the type C is, even if it is some unknown type from another compilation unit, to my knowledge the result of a dynamic_cast<T*>(ptr) has only two possible values, nullptr or "fixed-offset from ptr".


However, the plot thickens somewhat upon actually looking at some code gen.

Here's a simple program that I made to investigate this:


int output = 0;

struct A {
  explicit A(int n) : num_(n) {}
  int num_;

  virtual void foo() {
    output += num_;
  }
};

struct B final : public A {
  explicit B(int n) : A(n), num2_(2 * n) {}

  int num2_;

  virtual void foo() override {
    output -= num2_;
  }
};

void visit(A * ptr) {
  B & b = *dynamic_cast<B*>(ptr);
  b.foo();
  b.foo();
}

int main() {
  A * ptr = new B(5); 

  visit(ptr);

  ptr = new A(10);
  visit(ptr);

  return output;
}

According to the godbolt compiler explorer, the gcc 5.3 x86 assembly for this, with options -O3 -std=c++11, looks like this:


A::foo():
        movl    8(%rdi), %eax
        addl    %eax, output(%rip)
        ret
B::foo():
        movl    12(%rdi), %eax
        subl    %eax, output(%rip)
        ret
visit(A*):
        testq   %rdi, %rdi
        je      .L4
        subq    $8, %rsp
        xorl    %ecx, %ecx
        movl    typeinfo for B, %edx
        movl    typeinfo for A, %esi
        call    __dynamic_cast
        movl    12(%rax), %eax
        addl    %eax, %eax
        subl    %eax, output(%rip)
        addq    $8, %rsp
        ret
.L4:
        movl    12, %eax
        ud2
main:
        subq    $8, %rsp
        movl    $16, %edi
        call    operator new(unsigned long)
        movq    %rax, %rdi
        movl    $5, 8(%rax)
        movq    vtable for B+16, (%rax)
        movl    $10, 12(%rax)
        call    visit(A*)
        movl    $16, %edi
        call    operator new(unsigned long)
        movq    vtable for A+16, (%rax)
        movl    $10, 8(%rax)
        movq    %rax, %rdi
        call    visit(A*)
        movl    output(%rip), %eax
        addq    $8, %rsp
        ret
typeinfo name for A:
typeinfo for A:
typeinfo name for B:
typeinfo for B:
vtable for A:
vtable for B:
output:
        .zero   4

When I change the dynamic_cast to a static_cast, I get the following instead:


A::foo():
        movl    8(%rdi), %eax
        addl    %eax, output(%rip)
        ret
B::foo():
        movl    12(%rdi), %eax
        subl    %eax, output(%rip)
        ret
visit(A*):
        movl    12(%rdi), %eax
        addl    %eax, %eax
        subl    %eax, output(%rip)
        ret
main:
        subq    $8, %rsp
        movl    $16, %edi
        call    operator new(unsigned long)
        movl    $16, %edi
        subl    $20, output(%rip)
        call    operator new(unsigned long)
        movl    12(%rax), %edx
        movl    output(%rip), %eax
        subl    %edx, %eax
        subl    %edx, %eax
        movl    %eax, output(%rip)
        addq    $8, %rsp
        ret
output:
        .zero   4

Here's the same with clang 3.8 and the same options.

dynamic_cast:


visit(A*):                            # @visit(A*)
        xorl    %eax, %eax
        testq   %rdi, %rdi
        je      .LBB0_2
        pushq   %rax
        movl    typeinfo for A, %esi
        movl    typeinfo for B, %edx
        xorl    %ecx, %ecx
        callq   __dynamic_cast
        addq    $8, %rsp
.LBB0_2:
        movl    output(%rip), %ecx
        subl    12(%rax), %ecx
        movl    %ecx, output(%rip)
        subl    12(%rax), %ecx
        movl    %ecx, output(%rip)
        retq

B::foo():                            # @B::foo()
        movl    12(%rdi), %eax
        subl    %eax, output(%rip)
        retq

main:                                   # @main
        pushq   %rbx
        movl    $16, %edi
        callq   operator new(unsigned long)
        movl    $5, 8(%rax)
        movq    vtable for B+16, (%rax)
        movl    $10, 12(%rax)
        movl    typeinfo for A, %esi
        movl    typeinfo for B, %edx
        xorl    %ecx, %ecx
        movq    %rax, %rdi
        callq   __dynamic_cast
        movl    output(%rip), %ebx
        subl    12(%rax), %ebx
        movl    %ebx, output(%rip)
        subl    12(%rax), %ebx
        movl    %ebx, output(%rip)
        movl    $16, %edi
        callq   operator new(unsigned long)
        movq    vtable for A+16, (%rax)
        movl    $10, 8(%rax)
        movl    typeinfo for A, %esi
        movl    typeinfo for B, %edx
        xorl    %ecx, %ecx
        movq    %rax, %rdi
        callq   __dynamic_cast
        subl    12(%rax), %ebx
        movl    %ebx, output(%rip)
        subl    12(%rax), %ebx
        movl    %ebx, output(%rip)
        movl    %ebx, %eax
        popq    %rbx
        retq

A::foo():                            # @A::foo()
        movl    8(%rdi), %eax
        addl    %eax, output(%rip)
        retq

output:
        .long   0                       # 0x0

typeinfo name for A:

typeinfo for A:

typeinfo name for B:

typeinfo for B:

vtable for B:

vtable for A:

static_cast:


visit(A*):                            # @visit(A*)
        movl    output(%rip), %eax
        subl    12(%rdi), %eax
        movl    %eax, output(%rip)
        subl    12(%rdi), %eax
        movl    %eax, output(%rip)
        retq

main:                                   # @main
        retq

output:
        .long   0                       # 0x0

So, in both cases, it seems that dynamic_cast cannot be eliminated by the optimizer:

It seems to generate calls to a mysterious __dynamic_cast function, using the typeinfo of both classes, no matter what, even if all optimizations are on and B is marked final.

  • Does this low-level call have side effects that I didn't consider? My understanding was that the vtables are essentially fixed and that the vptr in an object doesn't change... am I right? I have only basic familiarity with how vtables are actually implemented, and to be honest I usually avoid virtual functions in my code, so I haven't really thought deeply about it or accumulated experience.

  • Am I right that a conforming compiler could replace *dynamic_cast<T*>(ptr) with *static_cast<T*>(ptr) as a valid optimization?

  • Is it true that "usually" (meaning, on x86 machines, let's say, and casting between classes in a hierarchy of "usual" complexity) a dynamic_cast cannot be optimized away, and will in fact produce a nullptr even if you * it right after, leading to nullptr dereference and crash upon accessing the object?

  • Is "always replace *dynamic_cast<T*>(ptr) with either dynamic_cast + test or assertion of some kind, or with *static_cast<T*>(ptr)" sound advice?

T& object = *dynamic_cast<T*>(ptr); is broken because it invokes UB on failure, period. I see no need to belabor the point. Even if it seems to work on current compilers, it may not work on later versions with more aggressive optimizers.

If you want checks and don't want to be bothered writing an assertion, use the reference form that throws bad_cast on failure:

T& object = dynamic_cast<T&>(*ptr);
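As a small illustration of the reference form, the sketch below shows the failure surfacing as std::bad_cast instead of a null pointer. The types Animal, Dog and Cat and the function dog_legs_or_minus_one are invented for this example. Note that the pointer itself must still be non-null, since *ptr is evaluated before the cast.

```cpp
#include <cassert>
#include <typeinfo>  // std::bad_cast

// Hypothetical hierarchy for illustration.
struct Animal { virtual ~Animal() = default; };
struct Dog : Animal { int legs = 4; };
struct Cat : Animal {};

// Returns the leg count if ptr refers to a Dog, or -1 if the cast fails.
int dog_legs_or_minus_one(Animal* ptr) {
    assert(ptr != nullptr);  // dereferencing a null pointer would still be UB
    try {
        Dog& dog = dynamic_cast<Dog&>(*ptr);  // throws std::bad_cast on mismatch
        return dog.legs;
    } catch (const std::bad_cast&) {
        return -1;
    }
}
```

In real code you would usually let the exception propagate rather than catch it on the spot; the catch here only makes the behavior observable.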

dynamic_cast isn't just a run-time check. It can do things static_cast can't. For example, it can cast sideways.

A   A (*)
|   |
B   C
\   /
 \ /
  D

If the actual most derived object is a D, and you have a pointer to the A base marked with *, you can actually dynamic_cast it to get a pointer to the B subobject:

#include <cassert>

struct A { virtual ~A() = default; };
struct B : A {};
struct C : A {};
struct D : B, C {};
void f() {
    D d;
    C& c = d;
    A& a = c;
    assert(dynamic_cast<B*>(&a) != nullptr);
}

Note that a static_cast here would be completely wrong.

(Another prominent example where dynamic_cast can do something static_cast can't is when you are casting from a virtual base to a derived class.)
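A minimal sketch of the virtual base case, using made-up types VBase and Mid and a made-up function to_mid: the static_cast version of this downcast is rejected by the compiler, because the position of a virtual base inside the complete object is not a compile-time constant, while dynamic_cast can locate the derived object at run time.

```cpp
#include <cassert>

// Hypothetical types for illustration.
struct VBase { virtual ~VBase() = default; };
struct Mid : virtual VBase { int value = 7; };

// static_cast<Mid*>(vb) is ill-formed here because VBase is a virtual base of Mid.
// dynamic_cast is allowed and returns nullptr if vb does not belong to a Mid.
Mid* to_mid(VBase* vb) {
    assert(vb != nullptr);
    return dynamic_cast<Mid*>(vb);
}
```

Passing the VBase subobject of a Mid gives back the address of that Mid, and passing a standalone VBase gives nullptr.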

In a world without final or whole-program knowledge, you have to do the check at run time (because C and D may not be visible to you). With final on B, you should be able to get away with not doing it, but I'm not surprised if compilers haven't gotten around to optimizing that case yet.
