
How does C++ compiler optimize stack allocation?

I found this post and wrote some tests like this:

I expected the compiler to perform tail call optimization (TCO) on foo3 : destroy sp first, then invoke func with a plain jump that does not create a new stack frame. That is not what happens. The generated assembly enters func with a call (line 47 of the assembly) and cleans up the sp object afterwards. The optimization does not happen even if I empty the body of ~Simple() .

So, how can I trigger TCO in this case?

First, note that the example has a double-free bug. If the move constructor is called, the source object's buffer pointer is not reset to nullptr as it must be, so two objects now point to the same buffer and both will later delete it. A simpler version which manages the pointer correctly is:

struct Simple {
  std::unique_ptr<int[]> buffer {new int[1000]};
};
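The question's original code is not reproduced here, so the following is only a sketch of what a hand-written fix might look like, assuming the original type owned a raw int* and declared its own move constructor and destructor. The name SimpleRaw and its exact members are assumptions made for illustration.

```cpp
#include <utility>

// Hypothetical raw-pointer variant of Simple (name and layout are assumptions).
struct SimpleRaw {
  int* buffer = new int[1000];

  SimpleRaw() = default;

  // Take over the buffer and null out the source, so only one object deletes it.
  SimpleRaw(SimpleRaw&& other) noexcept : buffer(other.buffer) {
    other.buffer = nullptr;
  }

  // Copying and assignment are disabled to keep the sketch short.
  SimpleRaw(const SimpleRaw&) = delete;
  SimpleRaw& operator=(const SimpleRaw&) = delete;
  SimpleRaw& operator=(SimpleRaw&&) = delete;

  // delete[] on a null pointer is a no-op, so a moved-from object is safe to destroy.
  ~SimpleRaw() { delete[] buffer; }
};
```

The std::unique_ptr version above gets the same behavior without any of this boilerplate, which is why it is the simpler choice.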

With that fix, let's inline almost everything and see what foo3 really does in all its glory:

// Pseudocode: Simple's members and std::function's checks are written out by hand.
using func_t = std::function<int(Simple&&)>&&;
int foo3(func_t func) {
  int* buffer1 = new int[1000]; // the unused local
  int* buffer2 = new int[1000]; // the call argument
  if (!func) {
    delete[] buffer2;
    delete[] buffer1;
    throw std::bad_function_call();
  }
  int retval;                   // declared here so it is visible at the return
  try {
    retval = func(buffer2);     // <-- the call (on the temporary that owns buffer2)
  } catch (...) {
    delete[] buffer2;
    delete[] buffer1;
    throw;
  }
  delete[] buffer2;
  delete[] buffer1;
  return retval;                // <-- the return
}

The case of buffer1 is straightforward. It is an unused local, and the only side effects are allocation and deallocation, which compilers are allowed to skip. An intelligent enough compiler could completely remove the unused local. clang++ 5.0 seems to accomplish this, but g++ 7.2 does not.

More interesting is buffer2 . func takes a non-const rvalue reference, so it is allowed to modify the argument, for example by moving from it. But it is not required to. After the call, the temporary may still own a buffer, and foo3 is the one that must delete it. Because work remains to be done after func returns, the call is not a tail call.
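To make that concrete, here is a minimal sketch using the std::unique_ptr version of Simple. The two callees and the helper still_owns_after are hypothetical names introduced only for this illustration. One callee moves from its argument and the other leaves it alone, and the caller cannot tell which one it is calling.

```cpp
#include <memory>
#include <utility>

struct Simple {
  std::unique_ptr<int[]> buffer{new int[1000]};
};

// Hypothetical callee that takes ownership by moving from its argument.
int takes_buffer(Simple&& s) {
  Simple local(std::move(s));  // s.buffer is null from here on
  return local.buffer ? 1 : 0;
}

// Hypothetical callee that only inspects its argument.
int ignores_buffer(Simple&& s) {
  return s.buffer ? 1 : 0;
}

// Reports whether the caller's object still owns a buffer after the call.
bool still_owns_after(int (*callee)(Simple&&)) {
  Simple tmp;
  callee(std::move(tmp));      // std::move is only a cast; nothing is moved yet
  return tmp.buffer != nullptr;
}
```

still_owns_after(takes_buffer) yields false and still_owns_after(ignores_buffer) yields true, so the caller has to run the destructor after the call in either case.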

As observed, we get closer to a tail call by simply leaking the buffer:

struct Simple {
    int* buffer = new int[1000];
};

That's cheating a bit because a big part of the question is about tail call optimization in the face of nontrivial destructors. But let's entertain this. As observed, this alone does not result in a tail call.

To start with, note that passing by reference is a fancy form of passing by pointer. The object still must exist somewhere, and that's on the stack in the caller. Needing to keep the caller's stack alive and nonempty during the call will rule out tail call optimization.

To enable a tail call, we want to pass func 's argument in a register, so that it doesn't have to live in foo3 's stack frame. This suggests we should pass by value:

int foo2(Simple); // etc.

The SysV ABI dictates that to be passed in a register, it needs to be trivially copyable, movable, and destructible. Being a struct wrapping an int* , we have that covered. Fun fact: we can't use a std::unique_ptr here with a no-op deleter because that is not trivially destructible.
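These properties can be checked at compile time with the standard type traits. The sketch below is only a check of the triviality conditions, not of the calling convention itself. NoopDelete and NonOwningPtr are hypothetical names for the no-op deleter idea mentioned above.

```cpp
#include <memory>
#include <type_traits>

struct Simple {
  int* buffer = new int[1000];
};

// Hypothetical deleter that does nothing.
struct NoopDelete {
  void operator()(int*) const noexcept {}
};
using NonOwningPtr = std::unique_ptr<int[], NoopDelete>;

// The raw-pointer wrapper has implicit, trivial copy/move constructors and destructor.
static_assert(std::is_trivially_copy_constructible<Simple>::value, "");
static_assert(std::is_trivially_move_constructible<Simple>::value, "");
static_assert(std::is_trivially_destructible<Simple>::value, "");

// unique_ptr has a user-provided destructor, even when the deleter is a no-op.
static_assert(!std::is_trivially_destructible<NonOwningPtr>::value, "");
```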

Even so, we still don't see a tail call. I do not see a reason preventing it, but I'm not an expert. Replacing the std::function with a function pointer does result in a tail call. The std::function has one extra argument in the call and has a conditional throw. Is it possible those make it difficult enough to optimize?

Anyway, with a function pointer, g++ 7.2 and clang++ 5.0 do the tail call:

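#include <cstdlib>

struct Simple {
  int* buffer = new int[1000];  // deliberately leaked; elements are uninitialized
};

int foo2(Simple sp) {
  // Keep the index inside the 1000-element buffer. The value read is
  // indeterminate; only the generated code matters here.
  return sp.buffer[std::rand() % 1000];
}

using func_t = int (*)(Simple);
int foo3(func_t func) {
  return func(Simple());        // the tail call
}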

But this is leaky. Can we do better? There is ownership underlying this type, and we want to pass it from foo3 to func . But types with nontrivial destructors cannot be passed in registers. This means an RAII type like std::unique_ptr will not get us there. Using a concept from the GSL, we can at least express the ownership:

template<class T> using owner = T;
struct Simple {
  owner<int*> buffer = new int[1000];
};

Then we can hope that static analysis tools now or in the future can detect that foo2 is accepting ownership but never deleting buffer .
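For completeness, here is a sketch of what it could look like if the callee did honor that ownership. The function foo2_owning is a hypothetical variant, not something from the question, and the buffer is value-initialized here (unlike above) so that the read is well defined. foo3 itself is unchanged, but the generated code for this variant has not been checked.

```cpp
#include <cstdlib>

template <class T> using owner = T;

struct Simple {
  owner<int*> buffer = new int[1000]();  // value-initialized: all zeros
};

// Hypothetical variant of foo2 that releases the buffer it was handed.
int foo2_owning(Simple sp) {
  int value = sp.buffer[std::rand() % 1000];
  delete[] sp.buffer;                    // the callee is the last owner
  return value;
}

using func_t = int (*)(Simple);
int foo3(func_t func) {
  return func(Simple());
}
```

Nothing in the type system enforces the delete[] in the callee. The owner alias only documents the intent, which is why this still relies on tooling or discipline.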

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address.
