Splittable data structure (in C++11)

Question

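I wonder if anyone can help me.

I am looking for a data structure (list, queue, stack, array, vector, binary tree, etc.) that supports these four operations:

isEmpty (true / false)
insert a single element
pop (i.e. get and remove) a single element
split into two structures, e.g. take roughly half (say +/- 20%) of the elements and move them to another structure

Note that I don't care about the order of the elements at all.

Insert/pop example: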

A.insert(1), A.insert(2), A.insert(3), A.insert(4), A.insert(5) // contains 1,2,3,4,5 in any order
A.pop() // 3
A.pop() // 2
A.pop() // 5
A.pop() // 1
A.pop() // 4

And a split example:

A.insert(1), A.insert(2), A.insert(3), A.insert(4), A.insert(5)
A.split(B)
// A = {1,4,3}, B={2,5} in any order

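I need the structure to be as fast as possible, preferably with all four operations in O(1). I doubt it is already implemented in std, so I will implement it myself (in C++11, so std::move can be used).

Note that insert, pop and isEmpty are called about ten times more often than split.

I tried some coding with list and vector, but without success: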

#include <vector>
#include <iostream>

// g++ -Wall -g -std=c++11
/*
output:
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
5 6 7 8 9
*/

int main ()
{
        std::vector<int> v1;

        for (int i = 0; i < 10; ++i) v1.push_back(i);

        for (auto i : v1) std::cout << i << " ";
        std::cout << std::endl;

        auto halfway = v1.begin() + v1.size() / 2;
        auto endItr  = v1.end();

        std::vector<int> v2;
        v2.insert(v2.end(),
                std::make_move_iterator(halfway),
                std::make_move_iterator(endItr));

        // sigsegv
        /*
        auto halfway2 = v1.begin() + v1.size() / 2;
        auto endItr2  = v1.end();
        v2.erase(halfway2, endItr2);
        */

        for (auto i : v1) std::cout << i << " ";
        std::cout << std::endl;

        for (auto i : v2) std::cout << i << " ";
        std::cout << std::endl;

        return 0;
}

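Any example code, ideas, links or anything else useful? Thanks

Related:

How to move the second half of a vector into another vector? (does not actually work, because of the erase problem)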
http://www.cplusplus.com/reference/iterator/move_iterator/

Answer 1

The erase problem is caused by a bug in your code.

// sigsegv
auto halfway2 = v1.begin() + v1.size() / 2;
auto endItr2  = v1.end();
v2.erase(halfway2, endItr2);

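You are trying to erase from v2 using iterators that point into v1. That cannot work; you probably meant to call erase on v1.

That fixes the erase problem when splitting the vector, and a vector seems to be the best container for what you want.

Note that everything except split can be done in (amortized) O(1) on a vector if you only insert at the end, and since order does not matter to you I don't see any problem with that. split will be O(n) in your implementation once you fix it, but it should still be very fast, because the elements sit right next to each other in the vector, which is very cache-friendly.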
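To make the fix concrete, here is a minimal sketch of the four operations on top of std::vector. The class name Bag and its member names are made up for illustration and are not from the question; the split is the question's own move_iterator approach, with erase called on the vector the iterators actually point into.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical wrapper (names are illustrative, not from the question).
template <class T>
class Bag {
  public:
    bool isEmpty() const { return items_.empty(); }

    std::size_t size() const { return items_.size(); }

    // Amortized O(1): append at the back, order does not matter.
    void insert(T value) { items_.push_back(std::move(value)); }

    // O(1): take the last element. The bag must not be empty.
    T pop() {
      assert(!items_.empty());
      T value = std::move(items_.back());
      items_.pop_back();
      return value;
    }

    // O(n): move the second half of this bag to the end of `other`.
    void split(Bag &other) {
      if (&other == this) return;  // splitting into itself makes no sense
      auto halfway = items_.begin() + items_.size() / 2;
      other.items_.insert(other.items_.end(),
                          std::make_move_iterator(halfway),
                          std::make_move_iterator(items_.end()));
      // Erase from the same vector the iterators belong to.
      items_.erase(halfway, items_.end());
    }

  private:
    std::vector<T> items_;
};
```

With five elements inserted, split leaves two elements in the source and moves three to the target, which is within the +/- 20% the question allows.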

Answer 2

I can't think of a solution with all operations in O(1).

With a list you can do push and pop in O(1) and split in O(n) (because you need to find the middle of the list).
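A rough sketch of the list variant, assuming std::list and a free function I am calling splitList (a made-up name). Walking to the middle is linear, and so is a range splice between two different lists, but no element is copied or moved; the nodes are only relinked.

```cpp
#include <iterator>
#include <list>

// Hypothetical helper: moves the second half of `from` to the end of `to`.
// Assumes `from` and `to` are different lists.
template <class T>
void splitList(std::list<T> &from, std::list<T> &to) {
  // Finding the middle is the O(n) part.
  auto halfway = std::next(from.begin(), from.size() / 2);
  // Relink the nodes [halfway, end) into `to`; elements stay where they are.
  to.splice(to.end(), from, halfway, from.end());
}
```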

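With a balanced binary tree (not a search tree) you can have all operations in O(log n).

Edit

There were some suggestions that keeping track of the middle of the list would give O(1). That is not the case: when you split, you have to compute the middle of the left list and the middle of the right list, which results in O(n).

Some other suggestions were that a vector is preferable simply because it is cache-friendly. I totally agree with that.

For fun I implemented a balanced binary tree container that performs all operations in O(log n). insert and pop are obviously O(log n). The actual split is O(1), but we are left with the root node, which we have to insert into one of the halves, so split ends up O(log n). No copying is involved, however.

Here is my attempt at such a container (I have not thoroughly tested it for correctness, and it can be optimized further, e.g. by turning the recursion into loops).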

#include <memory>
#include <iostream>
#include <utility>
#include <cstddef>    // std::size_t
#include <stdexcept>  // std::logic_error

template <class T>
class BalancedBinaryTree {
  private:
    class Node;

    std::unique_ptr<Node> root_;

  public:
    void insert(const T &data) {
      if (!root_) {
        root_ = std::unique_ptr<Node>(new Node(data));
        return;
      }
      root_->insert(data);
    }

    std::size_t getSize() const {
      if (!root_) {
        return 0;
      }
      return 1 + root_->getLeftCount() + root_->getRightCount();
    }

    // Tree must not be empty!!
    T pop() {
      if (root_->isLeaf()) {
        T temp = root_->getData();
        root_ = nullptr;
        return temp;
      }
      return root_->pop()->getData();
    }

    BalancedBinaryTree split() {
      if (!root_) {
        return BalancedBinaryTree();
      }

      BalancedBinaryTree left_half;
      T root_data = root_->getData();
      bool left_is_bigger = root_->getLeftCount() > root_->getRightCount();

      left_half.root_ = std::move(root_->getLeftChild());
      root_ = std::move(root_->getRightChild());

      if (left_is_bigger) {
        insert(root_data);
      } else {
        left_half.insert(root_data);
      }

      return std::move(left_half);
    }
};


template <class T>
class BalancedBinaryTree<T>::Node {
  private:
    T data_;
    std::unique_ptr<Node> left_child_, right_child_;
    std::size_t left_count_ = 0;
    std::size_t right_count_ = 0;

  public:
    Node() = default;
    Node(const T &data, std::unique_ptr<Node> left_child = nullptr,
         std::unique_ptr<Node> right_child = nullptr)
        : data_(data), left_child_(std::move(left_child)),
         right_child_(std::move(right_child)) {
    }

    bool isLeaf() const {
      return left_count_ + right_count_ == 0;
    }

    const T& getData() const {
      return data_;
    }
    T& getData() {
      return data_;
    }

    std::size_t getLeftCount() const {
      return left_count_;
    }

    std::size_t getRightCount() const {
      return right_count_;
    }

    std::unique_ptr<Node> &getLeftChild() {
      return left_child_;
    }
    const std::unique_ptr<Node> &getLeftChild() const {
      return left_child_;
    }
    std::unique_ptr<Node> &getRightChild() {
      return right_child_;
    }
    const std::unique_ptr<Node> &getRightChild() const {
      return right_child_;
    }

    void insert(const T &data) {
      if (left_count_ <= right_count_) {
        ++left_count_;
        if (left_child_) {
          left_child_->insert(data);
        } else {
          left_child_ = std::unique_ptr<Node>(new Node(data));
        }
      } else {
        ++right_count_;
        if (right_child_) {
          right_child_->insert(data);
        } else {
          right_child_ = std::unique_ptr<Node>(new Node(data));
        }
      }
    }

    std::unique_ptr<Node> pop() {
      if (isLeaf()) {
        throw std::logic_error("pop invalid path");
      }

      if (left_count_ > right_count_) {
        --left_count_;
        if (left_child_->isLeaf()) {
          return std::move(left_child_);
        }
        return left_child_->pop();
      }

      --right_count_;
      if (right_child_->isLeaf()) {
        return std::move(right_child_);
      }
      return right_child_->pop();
    }
};

Usage:

  BalancedBinaryTree<int> t;
  BalancedBinaryTree<int> t2;

  t.insert(3);
  t.insert(7);
  t.insert(17);
  t.insert(37);
  t.insert(1);

  t2 = t.split();

  while (t.getSize() != 0) {
    std::cout << t.pop() << " ";
  }
  std::cout << std::endl;

  while (t2.getSize() != 0) {
    std::cout << t2.pop() << " ";
  }
  std::cout << std::endl;

Output:

1 17
3 37 7

Answer 3

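If the number of elements/bytes stored in the container at any time is large, Youda008's solution (using a list and keeping track of the middle) may not be as efficient as you hope.

Alternatively, you could have a list<vector<T>> or even a list<array<T,Capacity>> and keep track of the middle of the list, i.e. split only between two sub-containers, but never split a sub-container. This should give you O(1) for all operations and reasonable cache efficiency. Use array<T,Capacity> if a single value of Capacity serves your needs at all times (for Capacity=1 this reverts to an ordinary list). Otherwise, use vector<T> and adapt the capacity of new vectors as needed.

bolov correctly pointed out that finding the middles of the lists that emerge from a split is not O(1). That means keeping track of the middle is not useful. However, using list<sub_container> is still faster than a plain list, because a split costs only O(n / Capacity) rather than O(n). The price you pay is that split has a granularity of Capacity rather than 1, so you have to trade off the accuracy of the split against its cost.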
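A minimal sketch of the list<vector<T>> idea, under my own assumptions: the class name ChunkedBag, the chunk_capacity parameter and its default of 64 are invented for illustration. The sketch does not track the middle; split walks to the middle chunk and relinks whole chunks, so it costs O(n / Capacity) and never splits a chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <utility>
#include <vector>

// Hypothetical sketch (names and default capacity are illustrative).
// Invariant: the list never contains an empty chunk.
template <class T>
class ChunkedBag {
  public:
    explicit ChunkedBag(std::size_t chunk_capacity = 64)
        : chunk_capacity_(chunk_capacity) {}

    bool isEmpty() const { return chunks_.empty(); }

    // Counts elements by visiting every chunk (only used for checking).
    std::size_t size() const {
      std::size_t n = 0;
      for (const auto &chunk : chunks_) n += chunk.size();
      return n;
    }

    // Append to the last chunk, starting a new chunk when it is full.
    void insert(T value) {
      if (chunks_.empty() || chunks_.back().size() >= chunk_capacity_) {
        chunks_.emplace_back();
        chunks_.back().reserve(chunk_capacity_);
      }
      chunks_.back().push_back(std::move(value));
    }

    // Take from the last chunk and drop that chunk once it is empty.
    T pop() {
      assert(!chunks_.empty());
      T value = std::move(chunks_.back().back());
      chunks_.back().pop_back();
      if (chunks_.back().empty()) chunks_.pop_back();
      return value;
    }

    // Relink the second half of the chunks into `other`.
    void split(ChunkedBag &other) {
      if (&other == this) return;
      // Keep the first (size + 1) / 2 chunks here, hand over the rest.
      auto halfway = std::next(chunks_.begin(), (chunks_.size() + 1) / 2);
      other.chunks_.splice(other.chunks_.end(), chunks_, halfway, chunks_.end());
    }

  private:
    std::size_t chunk_capacity_;
    std::list<std::vector<T>> chunks_;
};
```

The granularity trade-off shows up directly: a bag that holds a single chunk hands over nothing on split, however many elements that chunk contains.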

Answer 4

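Another option is to implement your own container using a linked list plus a pointer to the middle element, where you want to split it. This pointer would be updated on every modifying operation. That way you could achieve O(1) complexity on all operations.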

4 answers

Answer 1: 3, accepted, 2014-04-26 09:57:38
Answer 2: 2, 2014-04-26 10:04:27
Answer 3: 1, 2014-04-26 11:34:29
Answer 4: 0, 2014-04-26 11:21:15
