
How can I debug a C++ delete call on the Android NDK?

I have an Android C++ application that is managed by the Java layer. In this code I use an old physics library (Tokamak) and do almost nothing with it: I create and delete the simulator like this:

static neSimulator *gSim;
neV3 gravity; gravity.Set(0.0f, -10.f, 0.0f);
neSimulatorSizeInfo sizeInfo;
sizeInfo.rigidBodiesCount = 1;
sizeInfo.animatedBodiesCount = 1;
sizeInfo.geometriesCount = 2;
sizeInfo.overlappedPairsCount = 2;
gSim = neSimulator::CreateSimulator(sizeInfo, NULL, &gravity);

And then destroy it:

neSimulator::DestroySimulator(gSim);

This works. The problem appears when I start adding geometry:

neV3 ballPos;
rgdBall = gSim->CreateRigidBody();
neGeometry *geoBall = rgdBall->AddGeometry();
geoBall->SetSphereDiameter(1.5f);
rgdBall->UpdateBoundingInfo();
rgdBall->SetMass(2.0f);
rgdBall->SetInertiaTensor(neSphereInertiaTensor(1.5f, 2.0f));
ballPos.Set(0.0f, 5.0f, 0.0f);
rgdBall->SetPos(ballPos);

In this case, when I call the destroy (and I only call it once), I get a SIGSEGV (null pointer) with deadbaad in the crash report.

I've added debugging log statements to the destroy method and to the destructor, and the code inside the destructor runs to the end. This is the destroy method:

void neSimulator::DestroySimulator(neSimulator * sim)
{
    __android_log_print(ANDROID_LOG_INFO, "TOKAMAK", "Before cast");
    neFixedTimeStepSimulator * s = reinterpret_cast<neFixedTimeStepSimulator *>(sim);
    __android_log_print(ANDROID_LOG_INFO, "TOKAMAK", "After cast");
    __android_log_print(ANDROID_LOG_INFO, "TOKAMAK", "Before delete");
    delete s;
    __android_log_print(ANDROID_LOG_INFO, "TOKAMAK", "After delete");
}

And I log the destructor:

neFixedTimeStepSimulator::~neFixedTimeStepSimulator()
{
    FreeAllBodies();

    if (perf)
        delete perf;

    __android_log_print(ANDROID_LOG_INFO, "TOKAMAK", "dtor complete");
}

What confuses me is that I see the "dtor complete" message in the log but not the "After delete" message, and then a SIGSEGV error.

How can I investigate this further?

[More info after further investigation]

I used the addr2line tool to investigate the stack trace and traced the error to the default memory allocator. I then added logging to all alloc and free calls:

03-23 13:31:14.617: INFO/neAllocatorDefault(326): malloc 0x1b3fd8 size 2292
03-23 13:31:14.617: INFO/neAllocatorDefault(326): malloc 0x1b48d0 size 488
03-23 13:31:14.627: INFO/neAllocatorDefault(326): malloc 0x44ae3008 size 114404
03-23 13:31:14.627: INFO/neAllocatorDefault(326): malloc 0x1a58b8 size 8
03-23 13:31:14.627: INFO/neAllocatorDefault(326): malloc 0x1b4ac0 size 800
03-23 13:31:14.627: INFO/neAllocatorDefault(326): malloc 0x1b4de8 size 416
03-23 13:31:14.627: INFO/neAllocatorDefault(326): malloc 0x1b4f90 size 836
03-23 13:31:14.627: INFO/neAllocatorDefault(326): malloc 0x1aca10 size 44
03-23 13:31:14.627: INFO/neAllocatorDefault(326): malloc 0x1b52d8 size 2500
03-23 13:31:14.627: INFO/neAllocatorDefault(326): malloc 0x1b5ca0 size 2500
03-23 13:31:14.627: INFO/neAllocatorDefault(326): malloc 0x1b6668 size 2500
03-23 13:31:14.637: INFO/neAllocatorDefault(326): malloc 0x1b7030 size 400
03-23 13:31:14.637: INFO/neAllocatorDefault(326): malloc 0x1b71c8 size 800
03-23 13:31:14.637: INFO/neAllocatorDefault(326): malloc 0x424ed008 size 72404
03-23 13:31:14.637: INFO/neAllocatorDefault(326): malloc 0x1b74f0 size 4004
03-23 13:31:14.637: INFO/neAllocatorDefault(326): malloc 0x1b8498 size 2044
03-23 13:31:14.637: INFO/neAllocatorDefault(326): malloc 0x1b8c98 size 6044
03-23 13:31:14.637: INFO/neAllocatorDefault(326): malloc 0x1ba438 size 5004
03-23 13:31:14.637: INFO/neAllocatorDefault(326): malloc 0x1bb7c8 size 11204
03-23 13:31:14.637: INFO/neAllocatorDefault(326): malloc 0x1be390 size 340
03-23 13:31:14.637: INFO/neAllocatorDefault(326): malloc 0x1be4e8 size 4000
03-23 13:31:14.647: INFO/neAllocatorDefault(326): malloc 0x1bf490 size 4000
03-23 13:31:14.647: INFO/neAllocatorDefault(326): malloc 0x1c0438 size 38800
03-23 13:31:14.647: INFO/neAllocatorDefault(326): malloc 0x1c9bd0 size 38800

And

03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x1b71c8
03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x1b7030
03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x1b6668
03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x1b5ca0
03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x1b52d8
03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x1aca10
03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x1b4f90
03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x1b4de8
03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x1b4ac0
03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x1a58b8
03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x44ae3008
03-23 13:31:19.508: INFO/neAllocatorDefault(326): free 0x1b48d0

So the SIGSEGV happens when trying to free 0x1b48d0. The odd thing is that an earlier malloc returned that pointer and there was no earlier free of it. I am even more puzzled now...
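A crash while freeing a pointer that malloc really did return often points at something having written past the end of a block, which address logging alone cannot show. One way to check for that is to wrap the allocator with guard bytes. The following is only a sketch: GuardedAlloc, GuardedFree and the constants are hypothetical names, not part of Tokamak or the NDK, and it assumes the allocator hooks can be redirected to them.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical debugging helpers, not part of Tokamak or the NDK.
static const unsigned char kGuardByte = 0xAB;
static const std::size_t kGuardSize = 16;   // guard bytes placed after each block
static const std::size_t kHeaderSize = 16;  // holds the requested size, keeps alignment

// Allocates `size` usable bytes followed by a guard region filled with kGuardByte.
void* GuardedAlloc(std::size_t size) {
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(kHeaderSize + size + kGuardSize));
    if (!raw) return NULL;
    std::memcpy(raw, &size, sizeof(size));                 // remember the requested size
    std::memset(raw + kHeaderSize + size, kGuardByte, kGuardSize);
    return raw + kHeaderSize;
}

// Frees a block from GuardedAlloc. Returns false if the guard region was
// overwritten, which means something wrote past the end of the block.
bool GuardedFree(void* p) {
    if (!p) return true;
    unsigned char* user = static_cast<unsigned char*>(p);
    unsigned char* raw = user - kHeaderSize;
    std::size_t size = 0;
    std::memcpy(&size, raw, sizeof(size));
    assert(size + kGuardSize >= size);  // sanity check on the stored size
    bool intact = true;
    for (std::size_t i = 0; i < kGuardSize; ++i) {
        if (user[size + i] != kGuardByte) { intact = false; break; }
    }
    if (!intact) {
        std::fprintf(stderr, "block %p (size %lu) was overrun\n",
                     p, static_cast<unsigned long>(size));
    }
    std::free(raw);
    return intact;
}
```

Because the overrun lands in the guard region rather than in the heap's own bookkeeping, the report names the block that was written past instead of crashing inside free.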

I'd suspect the reinterpret_cast in DestroySimulator to be wrong. The object probably gets deleted using a different type than it actually has, which invokes the wrong destructor, which in turn damages the allocator metadata and causes free to crash.

There are two important things to observe when deleting casted objects:

  1. The destructor to be called is selected based on the static type of the expression in delete. So the type has to be the same as the type passed to new or, only if the object has a virtual destructor, a base type of the type passed to new.
  2. A pointer to a derived class and a pointer to its base class may not be numerically equal. static_cast knows how to adjust the pointer, but by using reinterpret_cast you explicitly tell the compiler not to adjust it, so the pointer may be wrong (this should only happen if multiple inheritance is involved, but that may be somewhere deep in the library).

Using reinterpret_cast in C++ is almost always wrong.
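To illustrate the second point, here is a small sketch with made-up types (Left, Right and Both are hypothetical and have nothing to do with Tokamak's classes). With common compilers the Right subobject does not sit at the start of a Both object, so static_cast has to change the pointer value, which is exactly the adjustment reinterpret_cast would skip.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration only.
struct Left  { int a; virtual ~Left() {} };
struct Right { int b; virtual ~Right() {} };
struct Both : Left, Right { int c; };

// True when static_cast had to move the pointer to reach the Right subobject.
bool RightSubobjectIsOffset(Both* obj) {
    assert(obj != NULL);
    Right* adjusted = static_cast<Right*>(obj);
    return static_cast<void*>(adjusted) != static_cast<void*>(obj);
}

// True when casting back with static_cast recovers the original pointer.
bool RoundTripRestoresPointer(Both* obj) {
    assert(obj != NULL);
    Right* asRight = static_cast<Right*>(obj);
    return static_cast<Both*>(asRight) == obj;
}
```

Since both bases have virtual destructors, deleting a Both through a Right* obtained with static_cast is fine. A Right* produced by reinterpret_cast would point at the Left subobject instead, and deleting through it would be undefined behaviour.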

Found the problem. Tokamak uses a special allocation method to avoid exceptions (I think). For each new object instantiation it does something like:

// assume you have a class MyObject
// and the default allocator is the C malloc/free functions:
MyObject *obj = new (malloc(sizeof(MyObject))) MyObject;
// ... do something with obj ...
free(obj);

That works just fine for single objects; the problem is with arrays. With arrays, the Tokamak code was doing:

MyObject *obj = new (malloc(sizeof(MyObject) * elements + 4)) MyObject[elements];
// ... do something with obj ...
free(obj); // Kaboom!!!

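The cause seems to be that the NDK compiler uses more than 4 bytes (a 32-bit int) for the array bookkeeping it stores in front of the elements, so the extra 4 bytes are not enough and the array runs past the end of the block. If I change the code to use +8, everything just works.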
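The size of that bookkeeping area (the "array cookie") is decided by the platform ABI, not by the C++ standard. As far as I know the ARM C++ ABI uses 8 bytes for it when the element type has a non-trivial destructor, which would match the +8 observation, but treat that as an assumption. A way to avoid depending on the cookie size at all is to construct the elements one by one with single-object placement new. This is only a sketch of that idea, not what Tokamak does; CreateArray, DestroyArray, Counted and DemoRoundTrip are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Allocates raw storage for `count` objects and constructs each one in place.
// Single-object placement new adds no hidden bookkeeping, so the block needs
// exactly sizeof(T) * count bytes.
template <typename T>
T* CreateArray(std::size_t count) {
    T* items = static_cast<T*>(std::malloc(sizeof(T) * count));
    if (!items) return NULL;
    for (std::size_t i = 0; i < count; ++i) {
        new (items + i) T();
    }
    return items;
}

// Destroys the objects in reverse order, then releases the storage.
template <typename T>
void DestroyArray(T* items, std::size_t count) {
    if (!items) return;
    for (std::size_t i = count; i > 0; --i) {
        items[i - 1].~T();
    }
    std::free(items);
}

// Hypothetical element type that counts live instances.
static int g_liveCount = 0;
struct Counted {
    Counted()  { ++g_liveCount; }
    ~Counted() { --g_liveCount; }
};

// Usage sketch: returns how many objects are still alive afterwards.
int DemoRoundTrip() {
    Counted* items = CreateArray<Counted>(3);
    assert(items != NULL && g_liveCount == 3);
    DestroyArray(items, 3);
    return g_liveCount;
}
```

The caller has to remember the element count, which is the price for not letting the compiler store it in front of the array.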
