
How can object slicing result in memory corruption?

The C++ expert & D language creator Walter Bright says that:

The slicing problem is serious because it can result in memory corruption, and it is very difficult to guarantee a program does not suffer from it. To design it out of the language, classes that support inheritance should be accessible by reference only (not by value). The D programming language has this property.

It would be helpful if someone could explain this with a C++ example in which the object slicing problem causes memory corruption. And how does the D language solve this problem?

Consider

#include <iostream>

class Account
{
    char *name = new char[16];

    public: virtual ~Account() { delete[] name; }
    public: virtual void sayHello() { std::cout << "Hello Base\n"; }

};

class BankAccount : public Account
{
    private: char *bankName = new char[16];
    public: virtual ~BankAccount() override { delete[] bankName; }
    public: virtual void sayHello() override { std::cout << "Hello Derived\n"; }

};

int main()
{
    BankAccount d;

    Account a1 = d; // slicing
    Account& a2 = d; // no slicing

    a1.sayHello(); // Hello Base
    a2.sayHello(); // Hello Derived

}

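Here a1 is copy-constructed from only the Account part of d. It has no bankName at all, and when it is destroyed only Account::~Account runs, not BankAccount::~BankAccount, because a1 is a plain Account and has no way to invoke polymorphic behavior. The memory corruption comes from the implicitly generated copy constructor, which copies the name pointer rather than the buffer it points to: a1 and d both end up owning the same allocation, and each destructor calls delete[] on it. That is a double delete, which is undefined behavior. Why slicing behaves this way has been explained in detail here.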

The following small C++ program, together with its output, shows the slicing problem and why it can lead to memory corruption.

In languages like D, Java and C#, class objects are accessed through a reference handle, so all of the information about the object, including its real type, travels with the handle. In C++, a variable declared by value has exactly its declared type, and that type is fixed by the compiler at compile time. Turning on C++ Run-Time Type Information (RTTI) provides a mechanism to look at an object's type at run time; however, it doesn't really help with the slicing problem, because a sliced copy really is a base-class object.

Basically, C++ removes a safety net in order to squeeze out a bit more speed.

The C++ compiler has a set of rules so that if specific methods are not provided in a class, for instance a copy constructor or an assignment operator, it will do its best to create its own default version. The compiler also has rules so that if an exact match for a source statement is not available, it will look for an alternative that expresses the statement's meaning, for instance by implicitly converting a derived object to its base class.

Sometimes the compiler is too helpful and the result becomes dangerous.

In this example there are two classes: levelOne is the base class and levelTwo is the derived class. It uses virtual destructors so that deleting a derived object through a pointer to the base class cleans up the derived part of the object as well.

In the output we see that the assignment of the derived class to the base class results in slicing and when the destructor is called, only the destructor for the base class is called and not the destructor of the derived class.

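Because the destructor of the derived class is not called for the sliced copy, nothing that depends on derived-class state is cleaned up for it. Worse, if the base class owns resources through raw pointers, the compiler-generated copy leaves the copy and the original sharing them, so both destructors release the same resource, and that is where memory corruption comes from.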

Here is the simple program.

#include <iostream>

class levelOne
{
public:
    levelOne(int i = 1) : iLevel(i)
    {
        iMyId = iId++;   // give every levelOne object its own id
        std::cout << "  levelOne construct  " << iMyId << std::endl;
    }
    virtual ~levelOne()
    {
        std::cout << "  levelOne destruct  " << iMyId << "  iLevel = " << iLevel << std::endl;
    }

    int  iLevel;
    int  iMyId;

    static int iId;      // next id to hand out
};

int levelOne::iId = 1;

class levelTwo : public levelOne
{
public:
    levelTwo(int i = 2) : levelOne(i)
    {
        jLevel = 2;
        iMyTwoId = iTwoId++;   // separate id sequence for the derived part
        std::cout << "     levelTwo construct  " << iMyId << ", " << iMyTwoId << std::endl;
    }
    virtual ~levelTwo()
    {
        std::cout << "     levelTwo destruct  " << iMyId << ", " << iMyTwoId
                  << "  iLevel = " << iLevel << "  jLevel = " << jLevel << std::endl;
    }

    int  jLevel;
    int  iMyTwoId;

    static int iTwoId;   // next derived id to hand out
};

int levelTwo::iTwoId = 101;


int main()
{
    levelOne one;
    levelTwo two;

    std::cout << "Create LevelOne and assign to it a LevelTwo" << std::endl;
    levelOne aa;     // create a levelOne object
    aa = two;        // assign to the levelOne object a levelTwo object (slicing)

    std::cout << "Create LevelTwo and assign to it a LevelOne pointer then delete it" << std::endl;
    levelOne *pOne = new levelTwo;
    delete pOne;     // virtual destructor: both levelTwo and levelOne parts are destroyed

    std::cout << "Exit program." << std::endl;
    return 0;
}

The output shows that the object created with pOne = new levelTwo;, whose id is 4, hits both the levelTwo and the levelOne destructors, so that object is destroyed properly.

However, the assignment of the levelTwo object two to the levelOne object aa results in slicing. The default assignment operator of levelOne is used, which copies only the levelOne members (iLevel and iMyId), so aa remains a levelOne. When the destructor of aa is invoked, only the destructor of levelOne runs, meaning that no derived-class cleanup is performed for that copy.

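Then the other two objects are destroyed properly as they go out of scope when the program ends. When reading this log, remember that destructors are called in the reverse order of construction, so the first line after "Exit program." belongs to aa, which now reports the id 2 it copied from two.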

  levelOne construct  1
  levelOne construct  2
     levelTwo construct  2, 101
Create LevelOne and assign to it a LevelTwo
  levelOne construct  3
Create LevelTwo and assign to it a LevelOne pointer then delete it
  levelOne construct  4
     levelTwo construct  4, 102
     levelTwo destruct  4, 102  iLevel = 2  jLevel = 2
  levelOne destruct  4  iLevel = 2
Exit program.
  levelOne destruct  2  iLevel = 2
     levelTwo destruct  2, 101  iLevel = 2  jLevel = 2
  levelOne destruct  2  iLevel = 2
  levelOne destruct  1  iLevel = 1

One aspect of inheritance that is difficult to model well is that there are cases where it is useful to say one of the following (with T derived from U):

  1. A T should be assignable to a variable of type U.
  2. A T* should be assignable to a U*.
  3. A const T* should be assignable to a const U*.

but C++ makes no distinction among them. Java and C# avoid the issue by offering only the second semantics: it is not possible to have variables which hold class-object instances, and while those languages don't use pointer notation, all class-type variables are implicitly references to objects stored elsewhere. In C++, however, there is no simple form of declaration which allows the second or third form without the first, nor is there any way to distinguish between "pointer to something which can be stored in a variable of type U" and "pointer to something which contains all the virtual and non-virtual members of a U". It would be possible for a language's type system to make a distinction between "strict" and "non-strict" pointer types, and allow a virtual method of class U to specify that:

  1. It must be overridden by any type which cannot be stored in a variable of type U, and...

  2. Within the method, this should be of type U strict *, and dereferencing a variable of type U strict * should yield an rvalue of type U strict, which should be assignable to one of type U even though an rvalue of type U would not be.

C++ offers no such distinction, however, which means that there is no way to distinguish between methods that require a pointer to something that can be stored in a variable of type U, and those which require something that has the same members.
