
Under what conditions does MSVC C++ Compiler sometimes write the array size directly before the pointer returned from function operator new[]?

I'm currently working on a memory tracker for work, and we are overloading operator new[] in its many variations. While writing some unit tests, I found that MSVC C++ 2019 (with the ISO C++17 Standard, /std:c++17, compiler setting) writes the size of the allocated array of objects directly before the pointer returned to the caller, but only sometimes. I have been unable to find any documented conditions under which this occurs. Can anyone please explain what those conditions are, how I can detect them at runtime, and/or point me to any documentation?

To even determine this was happening, I had to disassemble the code. Here is the C++:

const size_t k_NumFoos = 6;
Foo* pFoo = new Foo[k_NumFoos];

And here is the disassembly:

00007FF747BB3683  call        operator new[] (07FF747A00946h)  
00007FF747BB3688  mov         qword ptr [rbp+19E8h],rax  
00007FF747BB368F  cmp         qword ptr [rbp+19E8h],0  
00007FF747BB3697  je          ____C_A_T_C_H____T_E_S_T____0+0FF7h (07FF747BB36F7h)  
00007FF747BB3699  mov         rax,qword ptr [rbp+19E8h]  
00007FF747BB36A0  mov         qword ptr [rax],6  
00007FF747BB36A7  mov         rax,qword ptr [rbp+19E8h]  
00007FF747BB36AE  add         rax,8  
00007FF747BB36B2  mov         qword ptr [rbp+1B58h],rax  

The cmp and je lines appear to be a null check on the pointer returned by operator new[]; the je target is simply a label inside the Catch2 test function (we use the Catch2 library for our unit tests). The two movs following the je are where it writes the array size. The next three lines (mov, add, mov) are where it moves the pointer past the array size it has just written. This is all well and good, mostly.
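Read back as C++, the instructions after the call seem to do roughly the following. This is only a sketch of the pattern visible in the disassembly, not MSVC's actual implementation; WriteCookie and ReadCookie are hypothetical names, and the count is assumed to be a std::size_t (8 bytes in an x64 build).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper mirroring the disassembly: store the element count at
// the start of the raw block and hand out the address just past it.
void* WriteCookie(void* raw, std::size_t count) {
    if (raw == nullptr) {
        return nullptr;                               // cmp / je: skip on null
    }
    std::memcpy(raw, &count, sizeof count);           // mov qword ptr [rax],6
    return static_cast<char*>(raw) + sizeof count;    // add rax,8
}

// Hypothetical helper: what a delete[] would have to do to recover the count,
// assuming the count sits directly before the array pointer.
std::size_t ReadCookie(const void* arrayPtr) {
    assert(arrayPtr != nullptr);
    std::size_t count = 0;
    std::memcpy(&count,
                static_cast<const char*>(arrayPtr) - sizeof count,
                sizeof count);
    return count;
}
```

The pointer the caller sees is therefore the allocator's pointer plus sizeof(std::size_t), which is why operator new[] itself never sees the address that ends up in pFoo.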

We are also using Microsoft's VirtualAlloc as the allocator inside the overloaded operator new[]. The address returned from VirtualAlloc must be aligned for the operator new[] overloads that take std::align_val_t, and when the alignment is greater than the default maximum alignment, the pointer adjustment in those last three lines of disassembly breaks the alignment of the address being returned. Initially, I thought all allocations made with operator new[] would have this behavior. So I tested some other uses of operator new[] and found it to be true in all the cases I tested. I wrote code to adjust for this behavior, and then ran into a case where the array size is not written before the returned allocation.
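To make the alignment problem concrete, here is a small arithmetic sketch. It assumes the size prefix is 8 bytes and is not padded up to the requested alignment, which matches what I observed but is not something I have seen documented; IsAligned and SkipCookie are hypothetical helpers, not part of our tracker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if p is a multiple of alignment (a power of two).
bool IsAligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: the address the caller would receive if the compiler
// placed cookieBytes of bookkeeping in front of the array.
const void* SkipCookie(const void* base, std::size_t cookieBytes) {
    return static_cast<const unsigned char*>(base) + cookieBytes;
}
```

With a 64-byte-aligned base, SkipCookie(base, 8) is no longer 64-byte aligned, while SkipCookie(base, 64) still is. So the allocator would have to know the prefix size to compensate, which is exactly the information I am missing.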

Here is the C++ for the case where it does not write the array size before the returned allocation:

char **utf8Argv = new char *[ argc ];

argc is equal to 1. The line comes from the Session::applyCommandLine method in the Catch2 library. The disassembly looks like so:

00007FF73E189C6A  call        operator new[] (07FF73E07D6D8h)  
00007FF73E189C6F  mov         qword ptr [rbp+168h],rax  
00007FF73E189C76  mov         rax,qword ptr [rbp+168h]  
00007FF73E189C7D  mov         qword ptr [utf8Argv],rax  

Notice that after the call to operator new[] (07FF73E07D6D8h) there is no write of the array size. Comparing the two, I can see that one writes to a pointer, while the other writes to a pointer to a pointer. However, none of that information is available internally, at runtime, to operator new[], as far as I know.

The code here comes from a Debug | x64 build. Any ideas on how to determine when this behavior will occur?

Update (for convo below): Class Foo:

#include <cstddef>  // size_t
#include <cstring>  // memset, strncpy_s (MSVC)

template<size_t ArrLen>
class TFoo
{
public:
    TFoo()
    {
        memset(m_bar, 0, ArrLen);
    }
    TFoo(const TFoo<ArrLen>& other)
    {
        strncpy_s(m_bar, other.m_bar, ArrLen);
    }
    TFoo(TFoo<ArrLen>&& victim)
    {
        strncpy_s(m_bar, victim.m_bar, ArrLen);
    }
    ~TFoo()
    {
    }
    TFoo<ArrLen>& operator= (const TFoo<ArrLen>& other)
    {
        strncpy_s(m_bar, other.m_bar, ArrLen);
        return *this;  // a non-void function must return a value
    }
    TFoo<ArrLen>& operator= (TFoo<ArrLen>&& victim)
    {
        strncpy_s(m_bar, victim.m_bar, ArrLen);
        return *this;
    }

    const char* GetBar()
    {
        return m_bar;
    }
    void SetBar(const char bar[ArrLen])
    {
        strncpy_s(m_bar, bar, ArrLen);
    }

protected:
    char m_bar[ArrLen];
};
using Foo = TFoo<8>;

At a guess, the compiler writes the number of objects allocated before the pointer returned to you when it is allocating objects that have a destructor which needs to be called when you call delete[]. Under those circumstances, the compiler has to emit code to destroy each of the objects in the array when you call delete[], and to do that, it needs to know how many objects the array holds. Your Foo has a user-provided destructor (even though it is empty), so it falls into this category.

OTOH, for something like char*, no count is needed, and so, as a minor optimisation, none is emitted, or so it would seem.

I don't suppose you'll find this documented anywhere and the behaviour might change in future versions of the compiler. It doesn't seem to be part of the standard.
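One way to test the guess is to compare the byte count the compiler requests with count * sizeof(T). The sketch below is an illustration, not the asker's allocator: Probed, Trivial, WithDtor and ArrayOverhead are hypothetical names, and it uses class-scope operator new[]/delete[] backed by malloc so that the global allocator is left alone. Whatever difference it reports is a property of the compiler in use, not something the language guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical base class: records how many bytes each new[] asks for.
struct Probed {
    static std::size_t lastRequest;

    static void* operator new[](std::size_t bytes) {
        lastRequest = bytes;
        void* p = std::malloc(bytes);
        if (p == nullptr) {
            throw std::bad_alloc();
        }
        return p;
    }
    static void operator delete[](void* p) noexcept { std::free(p); }
};
std::size_t Probed::lastRequest = 0;

// Trivially destructible, like an array of char*.
struct Trivial : Probed {
    char* ptr;
};

// User-provided (empty) destructor, like Foo in the question.
struct WithDtor : Probed {
    char bar[8];
    ~WithDtor() {}
};

// Extra bytes requested beyond the elements themselves.
template <typename T>
std::size_t ArrayOverhead(std::size_t count) {
    T* p = new T[count];
    assert(Probed::lastRequest >= count * sizeof(T));
    const std::size_t overhead = Probed::lastRequest - count * sizeof(T);
    delete[] p;
    return overhead;
}
```

On the compilers I would expect most people to try, ArrayOverhead<Trivial>(1) comes out as zero and ArrayOverhead<WithDtor>(6) as non-zero, which fits the two disassembly listings in the question. It also hints at a limitation: operator new[] only receives the total byte count, so it cannot tell from its arguments alone whether a count prefix is included.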
