Does shared pointer break tail call optimization?
I'm practicing C++ and trying to implement an immutable list. In one of my tests I'm trying to create a list with a lot of values (1 million nodes) recursively. All values are `const`, so I cannot use a regular loop; besides, a loop isn't functional enough, you know. The test fails with `Segmentation fault`.
My system is 64-bit Xubuntu 16.04 LTS with Linux 4.4. I compile my code with g++ 5.4 and clang++ 3.8 using the `--std=c++14 -O3` flags.
I've written a simple example of a situation where the tail call should be easy to optimize, yet something goes wrong and a `Segmentation fault` appears. The function `f` just waits for `amount` iterations, then creates a pointer to a single `int` and returns it:
#include <memory>
using std::shared_ptr;

shared_ptr<int> f(unsigned amount) {
    return amount ? f(amount - 1) : shared_ptr<int>{new int};
}

int main() {
    return f(1E6) != nullptr;
}
Note that this example fails only with `g++`, while `clang++` handles it fine. However, on a more complicated example `clang++` doesn't optimize the call either.
Here is an example of a simple list with recursive insertion of elements. I've also added a `destroy` function, which helps to avoid stack overflow during destruction. Here I get a `Segmentation fault` with both compilers:
#include <memory>
using std::shared_ptr;

struct L {
    shared_ptr<L> tail;
    L(const L&) = delete;
    L() = delete;
};

shared_ptr<L> insertBulk(unsigned amount, const shared_ptr<L>& tail) {
    return amount ? insertBulk(amount - 1, shared_ptr<L>{new L{tail}})
                  : tail;
}

void destroy(shared_ptr<L> list) {
    if (!list) return;
    shared_ptr<L> tail = list->tail;
    list.reset();
    for (; tail; tail = tail->tail);
}

int main() {
    shared_ptr<L> list = shared_ptr<L>{new L{nullptr}};
    destroy(insertBulk(1E6, list));
    return 0;
}
NOTE: An implementation with regular pointers is optimized well by both compilers.
Does `shared_ptr` really break tail call optimization in my case? Is it a compiler issue or a problem with the `shared_ptr` implementation?
The short answer is: yes and no.
`shared_ptr` in C++ doesn't break tail call optimization by itself, but it makes it harder to write a recursive function that the compiler can convert into a loop.
I recalled that `shared_ptr` has a destructor and that C++ has RAII. In `insertBulk`, the temporary `shared_ptr` passed to the recursive call must be destroyed after that call returns, so the call is no longer the last action of the function. This makes it harder to construct an optimizable tail call, as discussed in the question "Can Tail Call Optimization and RAII Co-Exist?".
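To make the ordering concrete, here is a minimal sketch (not taken from the original code) that logs when a temporary argument is destroyed relative to the recursive call. `Noisy`, `descend` and `trace` are hypothetical names introduced only for this illustration; `descend` mirrors the shape of `insertBulk`, with a temporary bound to a `const` reference parameter.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: appends a line to the log when it is destroyed.
struct Noisy {
    std::vector<std::string>& log;
    unsigned id;
    ~Noisy() { log.push_back("destroy " + std::to_string(id)); }
};

// Same shape as insertBulk: the recursive call receives a temporary
// bound to a const reference parameter.
unsigned descend(unsigned amount, const Noisy& arg) {
    arg.log.push_back("enter " + std::to_string(arg.id));
    // The temporary lives until the end of the full expression,
    // so its destructor runs after the inner call has returned.
    return amount ? descend(amount - 1, Noisy{arg.log, amount - 1})
                  : arg.id;
}

// Runs the recursion and returns the recorded order of events.
std::vector<std::string> trace(unsigned amount) {
    std::vector<std::string> log;
    descend(amount, Noisy{log, amount});
    return log;
}
```

For `trace(2)` every "enter" line is logged before any "destroy" line, which means each frame still has work to do after its recursive call. A compiler may still manage to turn such code into a loop in simple cases, but it is not obliged to.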
@KennyOstrom has proposed using an ordinary pointer to solve this problem:
static const List* insertBulk_(unsigned amount, const List* tail=nullptr) {
    return amount ? insertBulk_(amount - 1, new List{tail})
                  : tail;
}
The following constructor is used:
List(const List* tail): tail{tail} {}
Even when the `tail` member of `List` is a `shared_ptr`, the tail call is successfully optimized.
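The transformation the compiler is expected to perform on the raw-pointer version can be pictured with a small sketch. `Cell`, `build_recursive`, `build_loop`, `length` and `release` are hypothetical stand-ins, not part of the original code, and `Cell` uses a plain pointer member to keep the example short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for List with a raw-pointer tail.
struct Cell {
    const Cell* tail;
};

// Tail-recursive form: nothing remains to be done after the inner call.
const Cell* build_recursive(unsigned amount, const Cell* tail = nullptr) {
    return amount ? build_recursive(amount - 1, new Cell{tail}) : tail;
}

// The loop an optimizer can derive from the function above.
const Cell* build_loop(unsigned amount, const Cell* tail = nullptr) {
    for (; amount; --amount) tail = new Cell{tail};
    return tail;
}

// Counts the nodes iteratively.
std::size_t length(const Cell* cell) {
    std::size_t n = 0;
    for (; cell; cell = cell->tail) ++n;
    return n;
}

// Frees the nodes iteratively.
void release(const Cell* cell) {
    while (cell) {
        const Cell* next = cell->tail;
        delete cell;
        cell = next;
    }
}
```

Both builders produce the same structure. The recursive one depends on the optimizer for very long lists, while the loop never grows the stack.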
A custom destruction strategy is needed. Fortunately, `shared_ptr` allows us to set one, so I've hidden the destructor of `List` by making it `private`, and I use this function for list destruction:
static void destroy(const List* list) {
    if (!list) return;
    shared_ptr<const List> tail = list->tail;
    delete list;
    for (; tail && tail.use_count() == 1; tail = tail->tail);
}
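The effect of the `use_count() == 1` check can be seen on a small sketch with a shared tail. `Node`, `cons` and the `alive` counter are hypothetical names used only here, and the destructor is left public to keep the sketch short; the `destroy` function follows the one above.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical simplified node; `alive` counts live instances.
struct Node {
    std::shared_ptr<const Node> tail;
    static int alive;
    explicit Node(std::shared_ptr<const Node> t) : tail{std::move(t)} { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Same strategy as List::destroy: release nodes one by one and stop
// at the first node that is still referenced from somewhere else.
void destroy(const Node* node) {
    if (!node) return;
    std::shared_ptr<const Node> tail = node->tail;
    delete node;
    for (; tail && tail.use_count() == 1; tail = tail->tail);
}

// Prepends one node and attaches the custom deleter.
std::shared_ptr<const Node> cons(std::shared_ptr<const Node> tail) {
    return std::shared_ptr<const Node>{new Node{std::move(tail)}, destroy};
}
```

If two lists are built on top of the same three-node tail, destroying them removes only their own nodes, and the shared tail stays alive until its last owner is released.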
The constructor should pass this destruction function to `tail` in its initialization list:
List(const List* tail): tail{tail, List::destroy} {}
If an exception is thrown, there is no proper cleanup, so the problem isn't solved yet. I want to use `shared_ptr` because it's safe, but now I don't use it for the current list head until the end of construction.
The "naked" pointer needs to be watched until it's wrapped into a shared pointer, and freed in case of an emergency. Let's pass a reference to the tail pointer, instead of the pointer itself, to `insertBulk_`. This makes the last good pointer visible outside the function:
static const List* insertBulk_(unsigned amount, const List*& tail) {
    if (!amount) {
        const List* result = tail;
        tail = nullptr;
        return result;
    }
    return insertBulk_(amount - 1, tail = new List{tail});
}
Then an analogue of `Finally` is needed to automate destruction of the pointer, which would otherwise leak if an exception is thrown:
static const shared_ptr<const List> insertBulk(unsigned amount) {
    struct TailGuard {
        const List* ptr;
        ~TailGuard() {
            List::destroy(this->ptr);
        }
    } guard{};
    const List* result = insertBulk_(amount, guard.ptr);
    return amount ? shared_ptr<const List>{result, List::destroy}
                  : nullptr;
}
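The guard idea can be checked in isolation with a small sketch. `Tracked`, `RawGuard` and `make` are hypothetical names, and the failure is simulated with a flag; the guard is disarmed before the pointer is handed to `shared_ptr`, in the same way `insertBulk_` sets `tail` to `nullptr` before returning.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical payload; `alive` counts live instances.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Frees the watched raw pointer unless it was reset to nullptr.
struct RawGuard {
    Tracked* ptr;
    ~RawGuard() { delete ptr; }
};

std::shared_ptr<Tracked> make(bool fail) {
    RawGuard guard{};            // ptr starts as nullptr
    guard.ptr = new Tracked;     // the "naked" pointer is now watched
    if (fail) throw std::runtime_error{"simulated failure"};
    // Disarm first: if the shared_ptr constructor itself throws,
    // it deletes the pointer on its own.
    Tracked* raw = guard.ptr;
    guard.ptr = nullptr;
    return std::shared_ptr<Tracked>{raw};
}
```

On the failing path the guard's destructor runs during stack unwinding and frees the object, and on the successful path ownership passes to the returned `shared_ptr`.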
Now, I guess, the problem is solved:
- `g++` and `clang++` successfully optimize recursive creation of long lists;
- the list still uses `shared_ptr`.
The final code is:
#include <memory>
#include <cassert>
using std::shared_ptr;

class List {
private:
    const shared_ptr<const List> tail;

    /**
     * I need `tail` to be an instance of `shared_ptr`.
     * A separate `List` constructor was created for this purpose.
     * It gets a regular pointer to `tail` and wraps it
     * into a shared pointer.
     *
     * The `tail` parameter is a reference to a pointer,
     * because `insertBulk`, which calls `insertBulk_`,
     * should be able to free memory
     * if `insertBulk_` fails,
     * to avoid a memory leak.
     */
    static const List* insertBulk_(unsigned amount, const List*& tail) {
        if (!amount) {
            const List* result = tail;
            tail = nullptr;
            return result;
        }
        return insertBulk_(amount - 1, tail = new List{tail});
    }

    unsigned size_(unsigned acc=1) const {
        return this->tail ? this->tail->size_(acc + 1) : acc;
    }

    /**
     * The destructor needs to be hidden,
     * because it causes stack overflow for long lists.
     * The custom destruction method `destroy` should be invoked first.
     */
    ~List() {}

public:
    /**
     * List needs a custom destruction strategy,
     * because the default destructor causes stack overflow
     * in the case of long lists:
     * it removes its items recursively.
     */
    List(const List* tail): tail{tail, List::destroy} {}
    List(const shared_ptr<const List>& tail): tail{tail} {}
    List(const List&) = delete;
    List() = delete;

    unsigned size() const {
        return this->size_();
    }

    /**
     * Public interface for the private `insertBulk_` method.
     * It wraps the `insertBulk_` result into a `shared_ptr`
     * with a custom destruction function.
     *
     * It also creates a guard for the tail,
     * which destroys it if something goes wrong.
     * `insertBulk_` should store `tail`,
     * which is not yet wrapped into a `shared_ptr`,
     * in the guard, and set it to `nullptr` at the end
     * in order to avoid destroying a successfully created list.
     */
    static const shared_ptr<const List> insertBulk(unsigned amount) {
        struct TailGuard {
            const List* ptr;
            ~TailGuard() {
                List::destroy(this->ptr);
            }
        } guard{};
        const List* result = insertBulk_(amount, guard.ptr);
        return amount ? shared_ptr<const List>{result, List::destroy}
                      : nullptr;
    }

    /**
     * Custom destruction strategy,
     * which should be called in order to delete a list.
     */
    static void destroy(const List* list) {
        if (!list) return;
        shared_ptr<const List> tail = list->tail;
        delete list;
        /**
         * Watching the reference count allows us to stop
         * when we reach a node
         * that is used by another list.
         *
         * It also prevents a long loop of construction and destruction,
         * because destruction calls this `destroy` function again,
         * and without the `tail.use_count() == 1` condition
         * it would create a lot of redundant entities.
         */
        for (; tail && tail.use_count() == 1; tail = tail->tail);
    }
};

int main() {
    /**
     * Check whether we can create multiple lists.
     */
    const shared_ptr<const List> list{List::insertBulk(1E6)};
    const shared_ptr<const List> longList{List::insertBulk(1E7)};
    /**
     * Check whether we can use a list as a tail for another list.
     */
    const shared_ptr<const List> composedList{new List{list}, List::destroy};
    /**
     * Check whether creation works well.
     */
    assert(list->size() == 1E6);
    assert(longList->size() == 1E7);
    assert(composedList->size() == 1E6 + 1);
    return 0;
}
The `List` class without comments and without the checks in the `main` function:
#include <memory>
using std::shared_ptr;

class List {
private:
    const shared_ptr<const List> tail;

    static const List* insertBulk_(unsigned amount, const List*& tail) {
        if (!amount) {
            const List* result = tail;
            tail = nullptr;
            return result;
        }
        return insertBulk_(amount - 1, tail = new List{tail});
    }

    ~List() {}

public:
    List(const List* tail): tail{tail, List::destroy} {}
    List(const shared_ptr<const List>& tail): tail{tail} {}
    List(const List&) = delete;
    List() = delete;

    static const shared_ptr<const List> insertBulk(unsigned amount) {
        struct TailGuard {
            const List* ptr;
            ~TailGuard() {
                List::destroy(this->ptr);
            }
        } guard{};
        const List* result = insertBulk_(amount, guard.ptr);
        return amount ? shared_ptr<const List>{result, List::destroy}
                      : nullptr;
    }

    static void destroy(const List* list) {
        if (!list) return;
        shared_ptr<const List> tail = list->tail;
        delete list;
        for (; tail && tail.use_count() == 1; tail = tail->tail);
    }
};
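Notice: The technical posts on this site are published under the CC BY-SA 4.0 license. If you repost them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.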