
Undefined behaviour with const_cast

I was hoping that someone could clarify exactly what is meant by undefined behaviour in C++. Given the following class definition:

class Foo
{
public:
    explicit Foo(int Value): m_Int(Value) { }
    void SetValue(int Value) { m_Int = Value; }

private:
    Foo(const Foo& rhs);
    const Foo& operator=(const Foo& rhs);

private:
    int m_Int;
};

If I've understood correctly, the two const_casts to both a reference and a pointer in the following code will remove the const-ness of the original object of type Foo, but any attempts made to modify this object through either the pointer or the reference will result in undefined behaviour.

int main()
{
    const Foo MyConstFoo(0);
    Foo& rFoo = const_cast<Foo&>(MyConstFoo);
    Foo* pFoo = const_cast<Foo*>(&MyConstFoo);

    //MyConstFoo.SetValue(1);   //Error as MyConstFoo is const
    rFoo.SetValue(2);           //Undefined behaviour
    pFoo->SetValue(3);          //Undefined behaviour

    return 0;
}

What is puzzling me is why this appears to work and will modify the original const object but doesn't even prompt me with a warning to notify me that this behaviour is undefined. I know that const_casts are, broadly speaking, frowned upon, but I can imagine a case where lack of awareness that C-style cast can result in a const_cast being made could occur without being noticed, for example:

Foo& rAnotherFoo = (Foo&)MyConstFoo;
Foo* pAnotherFoo = (Foo*)&MyConstFoo;

rAnotherFoo.SetValue(4);
pAnotherFoo->SetValue(5);

In what circumstances might this behaviour cause a fatal runtime error? Is there some compiler setting that I can set to warn me of this (potentially) dangerous behaviour?

NB: I use MSVC2008.

I was hoping that someone could clarify exactly what is meant by undefined behaviour in C++.

Technically, "Undefined Behaviour" means that the language defines no semantics for doing such a thing.

In practice, this usually means "don't do it; it can break when your compiler performs optimisations, or for other reasons".

What is puzzling me is why this appears to work and will modify the original const object but doesn't even prompt me with a warning to notify me that this behaviour is undefined.

In this specific example, attempting to modify any non-mutable object may "appear to work", or it may overwrite memory that doesn't belong to the program or that belongs to [part of] some other object, because the non-mutable object might have been optimised away at compile-time, or it may exist in some read-only data segment in memory.

The factors that may lead to these things happening are simply too complex to list. Consider the case of dereferencing an uninitialised pointer (also UB): the "object" you're then working with will have some arbitrary memory address that depends on whatever value happened to be in memory at the pointer's location; that "value" is potentially dependent on previous program invocations, previous work in the same program, storage of user-provided input etc. It's simply not feasible to try to rationalise the possible outcomes of invoking Undefined Behaviour so, again, we usually don't bother and instead just say "don't do it".

What is puzzling me is why this appears to work and will modify the original const object but doesn't even prompt me with a warning to notify me that this behaviour is undefined.

As a further complication, compilers are not required to diagnose (emit warnings/errors) for Undefined Behaviour, because code that invokes Undefined Behaviour is not the same as code that is ill-formed (i.e. explicitly illegal). In many cases, it's not tractable for the compiler to even detect UB, so this is an area where it is the programmer's responsibility to write the code properly.

The type system, including the existence and semantics of the const keyword, presents basic protection against writing code that will break; a C++ programmer should always remain aware that subverting this system (e.g. by hacking away constness) is done at your own risk, and is generally A Bad Idea.™

I can imagine a case where lack of awareness that C-style cast can result in a const_cast being made could occur without being noticed.

Absolutely. With warning levels set high enough, a sane compiler may choose to warn you about this, but it doesn't have to and it may not. In general, this is a good reason why C-style casts are frowned upon, but they are still supported for backwards compatibility with C. It's just one of those unfortunate things.
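As one concrete case of such a warning, GCC and Clang have a -Wcast-qual option that reports casts which remove a qualifier from a pointer's target type. The sketch below assumes one of those compilers; whether MSVC2008 has an equivalent setting is not something this thread establishes. CountChars is a hypothetical helper, and it only reads through the cast pointer, so its behaviour is defined.

```cpp
#include <cassert>

// Hypothetical helper: counts characters without modifying them.
int CountChars(const char* text)
{
    assert(text != 0);

    // The C-style cast silently acts as a const_cast here.
    // Building with -Wcast-qual (GCC/Clang) reports this line.
    char* writable = (char*)text;

    int count = 0;
    while (writable[count] != '\0')   // read-only access: still well-defined
        ++count;
    return count;
}
```

Calling CountChars("abc") returns 3. The cast is harmless only because nothing is written through writable; the point is that the line compiles without complaint unless the warning is switched on.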

Undefined behaviour depends on the way the object was born. You can see Stephan explaining it at around 00:10:00 but, essentially, follow the code below:

void f(int const &arg)
{
    int &danger(const_cast<int&>(arg));
    danger = 23; // When is this UB?
}

Now there are two cases for calling f:

int K(1);
f(K); // OK
const int AK(1); 
f(AK); // triggers undefined behaviour

To sum up, K was born non-const, so the cast is OK when calling f, whereas AK was born const, so UB it is.
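A compilable version of the same idea might look like the sketch below. CallOnNonConst is a hypothetical wrapper added for illustration; it exercises only the well-defined call, and the undefined one is left commented out on purpose.

```cpp
#include <cassert>

// Same shape as f above: writes through a const reference after casting.
void f(int const& arg)
{
    int& danger = const_cast<int&>(arg);
    danger = 23;   // defined only if the referenced object is not const
}

// Hypothetical wrapper: runs only the well-defined case.
int CallOnNonConst()
{
    int K = 1;     // born non-const
    f(K);          // OK: the cast merely restores K's real type
    assert(K == 23);

    // const int AK = 1;
    // f(AK);      // undefined behaviour: AK was born const

    return K;
}
```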

Undefined behaviour literally means just that: behaviour which is not defined by the language standard. It typically occurs in situations where the code is doing something wrong, but the error can't be detected by the compiler. The only way to catch the error would be to introduce a run-time test - which would hurt performance. So instead, the language specification tells you that you mustn't do certain things and, if you do, then anything could happen.

In the case of writing to a constant object, using const_cast to subvert the compile-time checks, there are three likely scenarios:

  • it is treated just like a non-constant object, and writing to it modifies it;
  • it is placed in write-protected memory, and writing to it causes a protection fault;
  • it is replaced (during optimisation) by constant values embedded in the compiled code, so after writing to it, it will still have its initial value.

In your test, you ended up in the first scenario - the object was (almost certainly) created on the stack, which is not write protected. You may find that you get the second scenario if the object is static, and the third if you enable more optimisation.

In general, the compiler can't diagnose this error - there is no way to tell (except in very simple examples like yours) whether the target of a reference or pointer is constant or not. It's up to you to make sure that you only use const_cast when you can guarantee that it's safe - either when the object isn't constant, or when you're not actually going to modify it anyway.
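One common situation where the "object isn't constant" guarantee holds is a non-const member function that reuses the logic of its const overload. The sketch below uses a hypothetical Buffer class (not from the question): inside the non-const At, the object is already known to be modifiable, so removing the const that was added a moment earlier is safe.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class used only for illustration.
class Buffer
{
public:
    Buffer() { for (std::size_t i = 0; i < kSize; ++i) m_Data[i] = 0; }

    // The const overload holds the real logic.
    const int& At(std::size_t i) const
    {
        assert(i < kSize);
        return m_Data[i];
    }

    // The non-const overload adds const to *this, calls the const
    // overload, then strips the const from the result. This is safe
    // because a non-const member function is only called on an object
    // that is accessible as non-const.
    int& At(std::size_t i)
    {
        return const_cast<int&>(static_cast<const Buffer&>(*this).At(i));
    }

private:
    static const std::size_t kSize = 4;
    int m_Data[kSize];
};

// Hypothetical usage: write through the non-const overload,
// read back through the const one.
int WriteThenRead()
{
    Buffer b;
    b.At(2) = 7;
    const Buffer& view = b;
    return view.At(2);
}
```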

What is puzzling me is why this appears to work

That is what undefined behavior means.
It can do anything including appear to work.
If you increase your optimization level to its top value it will probably stop working.

but doesn't even prompt me with a warning to notify me that this behaviour is undefined.

At the point where it does the modification, the object is not const. In the general case the compiler cannot tell that the object was originally const, therefore it is not possible to warn you. Even if it could, each statement is evaluated on its own, without reference to the others (when looking at that kind of warning generation).

Secondly, by using a cast you are telling the compiler "I know what I am doing; override all your safety features and just do it".

For example, the following works just fine (or will seem to, in the nasal demon kind of way):

float aFloat;

int& anIntRef = (int&)aFloat;  // I know what I am doing; ignore whether this is sensible
int* anIntPtr = (int*)&aFloat;

anIntRef  = 12;
*anIntPtr = 13;

I know that const_casts are, broadly speaking, frowned upon

That is the wrong way to look at them. They are a way of documenting in the code that you are doing something strange that needs to be validated by smart people (as the compiler will obey the cast without question). The reason you need a smart person to validate it is that it can lead to undefined behavior, but the good thing is that you have now explicitly documented this in your code (and people will definitely look closely at what you have done).

but I can imagine a case where lack of awareness that C-style cast can result in a const_cast being made could occur without being noticed, for example:

In C++ there is no need to use a C-style cast.
In the worst case the C-style cast can be replaced by reinterpret_cast<> (combined with const_cast<> if it also removes const), but when porting code you want to see if you could have used static_cast<>. The point of the C++ casts is to make them stand out, so you can see them and at a glance spot the difference between the dangerous casts and the benign casts.
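A small sketch of what that replacement looks like in practice follows. NamedCasts is a hypothetical function, and the object being modified is deliberately non-const, so everything that executes is well-defined.

```cpp
#include <cassert>

// Hypothetical function for illustration.
int NamedCasts()
{
    double ratio = 3.9;
    int truncated = static_cast<int>(ratio);   // benign conversion, was (int)ratio

    int counter = 10;                          // a non-const object
    const int* readOnly = &counter;

    // int* p = static_cast<int*>(readOnly);   // does not compile: static_cast cannot drop const
    int* p = const_cast<int*>(readOnly);       // explicit, and easy to search for in review
    *p += 1;                                   // defined: counter itself is not const
    assert(counter == 11);

    return truncated + counter;
}
```

Here the function returns 14 (3 from the truncation plus 11). With the C-style spelling (int*)readOnly, the removal of const would look no different from the harmless (int)ratio conversion.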

A typical example is trying to modify a const string literal, which may live in a protected data segment.

For optimisation reasons, the compiler may place const data in a read-only part of memory, and attempting to modify that data results in UB.

Static and const data are often stored in a different part of your program than local variables. For const variables, these areas are often read-only, to enforce the constness of the variables. Attempting to write to read-only memory results in "undefined behavior" because the reaction depends on your operating system. "Undefined behavior" means that the language doesn't specify how this case is to be handled.

If you want a more detailed explanation about memory, I suggest you read this. It's an explanation based on UNIX, but similar mechanisms are used on all operating systems.

