Does shared pointer break tail call optimization?

Preface

I'm practicing C++ and trying to implement an immutable list. In one of my tests I try to create a list with a lot of values (1 million nodes) recursively. All values are const, so I cannot use a regular loop; besides, a loop wouldn't be functional enough, you know. The test fails with a segmentation fault.

My system is 64-bit Xubuntu 16.04 LTS with Linux 4.4. I compile my code with g++ 5.4 and clang++ 3.8 using the --std=c++14 -O3 flags.

Source code

I've written a simple example that shows a situation where the tail call should be easy to optimize, but something goes wrong and a segmentation fault appears. The function f just recurses amount times, then creates a pointer to a single int and returns it:

#include <memory>

using std::shared_ptr;

shared_ptr<int> f(unsigned amount) {
    return amount? f(amount - 1) : shared_ptr<int>{new int};
}

int main() {
    return f(1E6) != nullptr;
}

Note that this example fails only with g++, while clang++ handles it fine. On a more complicated example, however, clang++ doesn't optimize the call either.

Here is an example of a simple list with recursive insertion of elements. I've also added a destroy function, which helps to avoid stack overflow during destruction. Here I get a segmentation fault with both compilers:

#include <memory>

using std::shared_ptr;

struct L {
    shared_ptr<L> tail;

    L(const L&) = delete;
    L() = delete;
};

shared_ptr<L> insertBulk(unsigned amount, const shared_ptr<L>& tail) {
    return amount? insertBulk(amount - 1, shared_ptr<L>{new L{tail}})
                 : tail;
}

void destroy(shared_ptr<L> list) {
    if (!list) return;

    shared_ptr<L> tail = list->tail;
    list.reset();

    for (; tail; tail = tail->tail);
}

int main() {
    shared_ptr<L> list = shared_ptr<L>{new L{nullptr}};
    destroy(insertBulk(1E6, list));
    return 0;
}

NOTE: An implementation with regular pointers is optimized well by both compilers.

Question

Does shared_ptr really break tail call optimization in my case? Is it a compiler issue or a problem with the shared_ptr implementation?

Answer

The short answer is: yes and no.

A shared pointer in C++ doesn't break tail call optimization, but it makes it harder to write a recursive function that the compiler can convert into a loop.

Details

Avoiding stack overflow during recursive construction of a long list

I recalled that shared_ptr has a destructor and that C++ has RAII. A destructor that must run after the recursive call returns means the call is no longer the last action of the function, which makes it harder to write an optimizable tail call, as discussed in the question "Can Tail Call Optimization and RAII Co-Exist?".

@KennyOstrom proposed using an ordinary pointer to solve this problem:

static const List* insertBulk_(unsigned amount, const List* tail=nullptr) {
    return amount? insertBulk_(amount - 1, new List{tail})
                 : tail;
}

The following constructor is used:

List(const List* tail): tail{tail} {}

With the tail member of List still being an instance of shared_ptr (initialized from the raw pointer), the tail call is successfully optimized.

Avoiding stack overflow during destruction

A custom destruction strategy is needed. Fortunately, shared_ptr allows us to set one, so I've hidden the destructor of List by making it private, and I use this function for list destruction:

static void destroy(const List* list) {
    if (!list) return;

    shared_ptr<const List> tail = list->tail;
    delete list;
    for (; tail && tail.use_count() == 1; tail = tail->tail);
}

The constructor should pass the destroy function as the deleter when initializing tail in its initializer list:

List(const List* tail): tail{tail, List::destroy} {}

Avoiding memory leak

If an exception is thrown I won't get proper cleanup, so the problem isn't solved yet. I want to use shared_ptr because it's safe, but now the current list head isn't held by one until the end of construction.

The "naked" pointer has to be watched until it's wrapped into a shared pointer, and freed in case of emergency. Let's pass insertBulk_ a reference to the tail pointer instead of the pointer itself. This makes the last good pointer visible outside the function:

static const List* insertBulk_(unsigned amount, const List*& tail) {
    if (!amount) {
        const List* result = tail;
        tail = nullptr;
        return result;
    }
    return insertBulk_(amount - 1, tail = new List{tail});
}

Then an analogue of finally is needed to automate destruction of the pointer, which would otherwise leak if an exception is thrown:

static const shared_ptr<const List> insertBulk(unsigned amount) {
    struct TailGuard {
        const List* ptr;
        ~TailGuard() {
            List::destroy(this->ptr);
        }
    } guard{};
    const List* result = insertBulk_(amount, guard.ptr);
    return amount? shared_ptr<const List>{result, List::destroy}
                 : nullptr;
}

Solution

Now, I guess, the problem is solved:

  • g++ and clang++ successfully optimize recursive creation of long lists;
  • lists still use shared_ptr;
  • raw pointers appear to be handled safely.

Source code

The final code is:

#include <memory>
#include <cassert>

using std::shared_ptr;

class List {
    private:
        const shared_ptr<const List> tail;

        /**
         * `tail` needs to be an instance of `shared_ptr`,
         * so a separate `List` constructor takes a regular pointer
         * to the tail and wraps it into a shared pointer.
         *
         * Here `tail` is a reference to a pointer,
         * because `insertBulk`, which calls `insertBulk_`,
         * must be able to free the memory
         * if `insertBulk_` fails,
         * to avoid a memory leak.
         */
        static const List* insertBulk_(unsigned amount, const List*& tail) {
            if (!amount) {
                const List* result = tail;
                tail = nullptr;
                return result;
            }
            return insertBulk_(amount - 1, tail = new List{tail});
        }
        unsigned size_(unsigned acc=1) const {
            return this->tail? this->tail->size_(acc + 1) : acc;
        }
        /**
         * Destructor needs to be hidden,
         * because it causes stack overflow for long lists.
         * Custom destruction method `destroy` should be invoked first.
         */
        ~List() {}
    public:
        /**
         * List needs custom destruction strategy,
         * because default destructor causes stack overflow
         * in the case of long lists:
         * it will recursively remove its items.
         */
        List(const List* tail): tail{tail, List::destroy} {}
        List(const shared_ptr<const List>& tail): tail{tail} {}
        List(const List&) = delete;
        List() = delete;

        unsigned size() const {
            return this->size_();
        }

        /**
         * Public interface for the private `insertBulk_` method.
         * It wraps the `insertBulk_` result into a `shared_ptr`
         * with the custom destruction function.
         *
         * It also creates a guard for the tail,
         * which destroys it if something goes wrong.
         * `insertBulk_` stores the `tail`
         * that is not yet wrapped into a `shared_ptr`
         * in the guard, and sets it to `nullptr` at the end
         * to avoid destroying a successfully created list.
         */
        static const shared_ptr<const List> insertBulk(unsigned amount) {
            struct TailGuard {
                const List* ptr;
                ~TailGuard() {
                    List::destroy(this->ptr);
                }
            } guard{};
            const List* result = insertBulk_(amount, guard.ptr);
            return amount? shared_ptr<const List>{result, List::destroy}
                         : nullptr;
        }
        /**
         * Custom destruction strategy,
         * which should be called in order to delete a list.
         */
        static void destroy(const List* list) {
            if (!list) return;

            shared_ptr<const List> tail = list->tail;
            delete list;

            /**
             * Watching the reference count lets us stop
             * when we reach a node
             * that is used by another list.
             *
             * It also prevents a long chain of construction and destruction:
             * destroying a node calls `destroy` again,
             * and without the `tail.use_count() == 1` condition
             * that would create a lot of redundant entities.
             */
            for (; tail && tail.use_count() == 1; tail = tail->tail);
        }
};

int main() {
    /**
     * Check whether we can create multiple lists.
     */
    const shared_ptr<const List> list{List::insertBulk(1E6)};
    const shared_ptr<const List> longList{List::insertBulk(1E7)};
    /**
     * Check whether we can use a list as a tail for another list.
     */
    const shared_ptr<const List> composedList{new List{list}, List::destroy};
    /**
     * Checking whether creation works well.
     */
    assert(list->size() == 1E6);
    assert(longList->size() == 1E7);
    assert(composedList->size() == 1E6 + 1);
    return 0;
}

Shorter version of the source code

The List class without comments and without the checks in the main function:

#include <memory>

using std::shared_ptr;

class List {
    private:
        const shared_ptr<const List> tail;

        static const List* insertBulk_(unsigned amount, const List*& tail) {
            if (!amount) {
                const List* result = tail;
                tail = nullptr;
                return result;
            }
            return insertBulk_(amount - 1, tail = new List{tail});
        }
        ~List() {}
    public:
        List(const List* tail): tail{tail, List::destroy} {}
        List(const shared_ptr<const List>& tail): tail{tail} {}
        List(const List&) = delete;
        List() = delete;

        static const shared_ptr<const List> insertBulk(unsigned amount) {
            struct TailGuard {
                const List* ptr;
                ~TailGuard() {
                    List::destroy(this->ptr);
                }
            } guard{};
            const List* result = insertBulk_(amount, guard.ptr);
            return amount? shared_ptr<const List>{result, List::destroy}
                         : nullptr;
        }
        static void destroy(const List* list) {
            if (!list) return;

            shared_ptr<const List> tail = list->tail;
            delete list;

            for (; tail && tail.use_count() == 1; tail = tail->tail);
        }
};
