
segfault, but not in valgrind or gdb

My project includes a library that loads FBX files using Autodesk's FBX SDK 2017.1.

Loading the FBX crashes in both debug and release builds. The crash takes one of two forms, seemingly at random:

  • most of the time, the crash is simply "Segmentation fault";
  • every once in a while, the crash is a dump of all the linked libraries, together with a message alluding to a problem with a realloc() call. From the context of the message, I haven't been able to make out which realloc() that may be.

The code does contain realloc() calls, specifically in the allocation of buffers used in a custom implementation of an FbxStream.

Most of the code path is identical to the Windows build; only a number of platform-specific sections have been re-implemented. On Windows, it runs as expected.

What strikes me is that if I run the program in either gdb or valgrind, the crash disappears! So I set out to find uninitialized members/values, but so far I could not find anything suspicious. I used CppDepend/CppCheck and VS2012 code analysis, but both came up empty on uninitialized variables/members.

To give some background on FBX loading: the FBX SDK has a number of ways to deal with different types of resources (obj, 3ds, fbx, ...). They can be loaded from a file or from a stream. To support large files, the stream option is the more relevant one. The code below is far from perfect, but what interests me most at present is why the program does not crash under valgrind/gdb. I've left the SDK documentation on top of ReadString, since it's the most complex function.

class MyFbxStream : public FbxStream{
    uint32 m_FormatID;
    uint32 m_Error;
    EState m_State;
    size_t m_Pos;
    size_t m_Size;
    const Engine::Buffer* const m_Buffer;
    MyFbxStream& operator = (const MyFbxStream& other) const;
public:
    MyFbxStream(const Engine::Buffer* const buffer) 
    : m_FormatID(0)
    , m_Error(0)
    , m_State(eClosed)
    , m_Pos(0)
    , m_Size(0)
    , m_Buffer(buffer) {};
    virtual ~MyFbxStream() {};
    virtual bool Open(void* pStreamData) {
        m_FormatID = *(uint32*)pStreamData;
        m_Pos = 0;
        m_State = eOpen;
        m_Size = m_Buffer->GetSize();
        return true;
    }
    virtual bool Close() {
        m_Pos = m_Size = 0;
        m_State = eClosed;
        return true;
    }
    virtual int Read(void* pData, int pSize) const  {
        const unsigned char* data = (m_Buffer->GetBase(m_Pos));
        const size_t bytesRead = m_Pos + pSize > m_Buffer->GetSize() ? (m_Buffer->GetSize() - m_Pos) : pSize;
        const_cast<MyFbxStream*>(this)->m_Pos += bytesRead;
        memcpy(pData, data, bytesRead);
        return (int)bytesRead;
    }
    /** Read a string from the stream.
    * The default implementation is written in terms of Read() but does not cope with DOS line endings.
    * Subclasses may need to override this if DOS line endings are to be supported.
    * \param pBuffer Pointer to the memory block where the read bytes are stored.
    * \param pMaxSize Maximum number of bytes to be read from the stream.
    * \param pStopAtFirstWhiteSpace Stop reading when any whitespace is encountered. Otherwise read to end of line (like fgets()).
    * \return pBuffer, if successful, else NULL.
    * \remark The default implementation terminates the \e pBuffer with a null character and assumes there is enough room for it.
    * For example, a call with \e pMaxSize = 1 will fill \e pBuffer with the null character only. */
    virtual char* ReadString(char* pBuffer, int pMaxSize, bool pStopAtFirstWhiteSpace = false) {
        assert(!pStopAtFirstWhiteSpace); // "Not supported"
        const size_t pSize = pMaxSize - 1;
        if (pSize) {
            const char* const base = (const char* const)m_Buffer->GetBase();
            char* cBuffer = pBuffer;
            const size_t totalSize = std::min(m_Buffer->GetSize(), (m_Pos + pSize));
            const char* const maxSize = base + totalSize;
            const char* sum = base + m_Pos;
            bool done = false;
            // first align the copy on alignment boundary (4byte)
            while ((((size_t)sum & 0x3) != 0) && (sum < maxSize)) {
                const unsigned char c = *sum++;
                *cBuffer++ = c;
                if ((c == '\n') || (c == '\r')) {
                    done = true;
                    break;
            }   }
            // copy from alignment boundary to boundary (4byte)
            if (!done) {
                int64 newBytesRead = 0;
                uint32* dBuffer = (uint32*)cBuffer;
                const uint32* dBase = (uint32*)sum;
                const uint32* const dmaxSize = ((uint32*)maxSize) - 1;
                while (dBase < dmaxSize) {
                    const uint32 data = *(const uint32*const)dBase++;
                    *dBuffer++ = data;
                    if (((data & 0xff) == 0x0a) || ((data & 0xff) == 0x0d)) { // lowest byte (first in memory, assuming little-endian): 1 byte consumed
                        newBytesRead -= 3;
                        done = true;
                        break;
                    } else {
                        const uint32 shiftedData8 = data & 0xff00;
                        if ((shiftedData8 == 0x0a00) || (shiftedData8 == 0x0d00)) { // second byte in memory: 2 bytes consumed
                            newBytesRead -= 2;
                            done = true;
                            break;
                        } else {
                            const uint32 shiftedData16 = data & 0xff0000;
                            if ((shiftedData16 == 0x0a0000) || (shiftedData16 == 0x0d0000)) { // third byte in memory: 3 bytes consumed
                                newBytesRead -= 1;
                                done = true;
                                break;
                            } else {
                                const uint32 shiftedData24 = data & 0xff000000;
                                if ((shiftedData24 == 0x0a000000) || (shiftedData24 == 0x0d000000)) { // fourth byte in memory: all 4 bytes consumed
                                    done = true;
                                    break;
                }   }   }   }   }
                newBytesRead += (int64)dBuffer - (int64)cBuffer;
                if (newBytesRead) {
                    sum += newBytesRead;
                    cBuffer += newBytesRead;
            }   }
            // copy anything beyond the last alignment boundary (4byte)
            if (!done) {
                while (sum < maxSize) {                 
                    const unsigned char c = *sum++;
                    *cBuffer++ = c;
                    if ((c == '\n') || (c == '\r')) {
                        done = true;
                        break;
            }   }   }
            const size_t bytesRead = cBuffer - pBuffer;
            if (bytesRead) {
                const_cast<MyFbxStream*>(this)->m_Pos += bytesRead;
                pBuffer[bytesRead] = 0;
                return pBuffer;
        }   }       
        pBuffer = NULL;
        return NULL;
    }
    virtual void Seek(const FbxInt64& pOffset, const FbxFile::ESeekPos& pSeekPos) {
        switch (pSeekPos) {
            case FbxFile::ESeekPos::eBegin:     m_Pos = pOffset; break;
            case FbxFile::ESeekPos::eCurrent:   m_Pos += pOffset; break;
            case FbxFile::ESeekPos::eEnd:       m_Pos = m_Size - pOffset; break;
        }
    }
    virtual long GetPosition() const        {   return (long)m_Pos; }
    virtual void SetPosition(long position) {   m_Pos = position;   }
    virtual void ClearError()               {   m_Error = 0;    }
    virtual int GetError() const            {   return m_Error; }
    virtual EState GetState()               {   return m_State; }
    virtual int GetReaderID() const         {   return m_FormatID;  }
    virtual int GetWriterID() const         {   return -1;  }                       // readonly stream
    virtual bool Flush()                    {   return true;    }                   // readonly stream
    virtual int Write(const void* /*d*/, int /*s*/) {   assert(false);  return 0; } // readonly stream
};

I assume that there may be undefined behavior related to malloc/free/realloc operations that somehow does not manifest under gdb. But if this is the case, I would also expect the Windows binaries to have problems.
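One place where undefined behavior could hide is the word-wise copy in ReadString: the loop aligns the source pointer to 4 bytes, but the destination pointer is then cast to uint32* without any guarantee that it is aligned too, and Read() computes GetSize() - m_Pos without checking that m_Pos is still inside the buffer. Whether either of these is the cause of the crash is not established here. For comparison, the following is a minimal byte-wise sketch of the same line-reading logic with explicit bounds checks. LineCursor and read_line are hypothetical names that stand in for the stream state; they are not part of the FBX SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the stream state: a read-only byte range plus a cursor.
struct LineCursor {
    const char* base;   // start of the source bytes
    std::size_t size;   // number of valid bytes at base
    std::size_t pos;    // current read position
};

// Copies bytes up to and including the first '\n' or '\r', but never more than
// maxSize - 1 bytes, then null-terminates the output.
// Returns out on success, or nullptr if nothing could be read
// (mirroring what the ReadString above does in that case).
char* read_line(LineCursor& cur, char* out, int maxSize) {
    if (out == nullptr || maxSize <= 1 || cur.pos >= cur.size) {
        return nullptr;
    }
    const std::size_t room = static_cast<std::size_t>(maxSize) - 1; // keep one byte for '\0'
    const std::size_t avail = cur.size - cur.pos;                   // cannot underflow: pos < size
    const std::size_t limit = room < avail ? room : avail;

    std::size_t n = 0;
    while (n < limit) {
        const char c = cur.base[cur.pos + n];
        out[n++] = c;
        if (c == '\n' || c == '\r') {
            break;  // the line terminator is copied, as in the original
        }
    }
    out[n] = '\0';
    cur.pos += n;
    return out;
}
```

A byte-wise loop like this is slower than a word-wise copy, but it performs no misaligned or type-punned accesses, which makes it a useful baseline when hunting memory errors.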

Also, I don't know if this is relevant, but when I trace into the Open() function and print the value of the "m_Buffer" pointer (or of "this"), I get a pointer value starting with 0xfffffff.., which to a Windows programmer looks like a problem. However, can I draw the same conclusion on Linux? I also saw such values in static function calls, etc.

"if I run the program in either gdb or valgrind, the crash disappears!"

There are two possible explanations:

  1. There are multiple threads, the code exhibits a data race, and both GDB and Valgrind significantly affect execution timing.
  2. GDB disables address randomization, Valgrind significantly affects program layout, and the crash is sensitive to the exact layout.
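The second explanation can be checked empirically. The following is a minimal sketch, assuming a Linux system; sample_addresses and print_addresses are hypothetical helper names introduced only for this illustration. Calling them from main() and running the program several times, once directly and once under GDB, shows whether stack and heap addresses change between runs. With GDB's default settings, the values typically repeat from run to run.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Numeric addresses of one stack object and one heap block.
struct AddressSample {
    std::uintptr_t stack;
    std::uintptr_t heap;
};

// Records where a local variable and a small malloc() block happen to live.
// The addresses are only stored as integers; they are never dereferenced later.
AddressSample sample_addresses() {
    int local = 0;
    void* block = std::malloc(64);
    assert(block != nullptr);

    AddressSample s;
    s.stack = reinterpret_cast<std::uintptr_t>(&local);
    s.heap = reinterpret_cast<std::uintptr_t>(block);

    std::free(block);
    return s;
}

// Prints both addresses in hexadecimal so that runs can be compared by eye.
void print_addresses(const AddressSample& s) {
    std::printf("stack: %#llx\nheap:  %#llx\n",
                static_cast<unsigned long long>(s.stack),
                static_cast<unsigned long long>(s.heap));
}
```

If the addresses differ between direct runs but stay fixed under GDB, a bug that depends on the exact memory layout can plausibly appear in one setting and not in the other.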

The steps I would take:

  1. Set ulimit -c unlimited, run the program and get it to dump core, then use post-mortem analysis in GDB.
  2. Run the program under GDB, use set disable-randomization off and see if you can get it to crash that way.
  3. Run the program with Helgrind or DRD, Valgrind's thread error detectors.
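If changing the shell's limit for step 1 is inconvenient, for example because the program is launched by another process, roughly the same effect can be achieved from inside the program with the POSIX getrlimit()/setrlimit() calls. The sketch below is an assumption about how one might do that; enable_core_dumps is a hypothetical helper name and would need to be called early, before the crash can occur.

```cpp
#include <cassert>
#include <sys/resource.h>

// Raises the soft core-file size limit to the hard limit for this process,
// which is approximately what "ulimit -c unlimited" does when the hard limit
// is itself unlimited. Returns true if both system calls succeeded.
bool enable_core_dumps() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;
    }
    lim.rlim_cur = lim.rlim_max;  // the soft limit may always be raised up to the hard limit
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Note that success here does not guarantee a core file: if the hard limit is 0, none will be written, and on Linux the location and name of the file are governed by /proc/sys/kernel/core_pattern.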
