Splittable data structure (in C++11)

Question

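I wonder whether somebody could help me.

I am looking for a data structure (such as a list, queue, stack, array, vector, binary tree, etc.) that supports these four operations:

isEmpty (true / false)
insert a single element
pop (i.e. get and remove) a single element
split into two structures, e.g. take approximately half (say +/- 20%) of the elements and move them to another structure

Note that I do not care about the order of the elements at all.

Insert/pop example: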

A.insert(1), A.insert(2), A.insert(3), A.insert(4), A.insert(5) // contains 1,2,3,4,5 in any order
A.pop() // 3
A.pop() // 2
A.pop() // 5
A.pop() // 1
A.pop() // 4

And a split example:

A.insert(1), A.insert(2), A.insert(3), A.insert(4), A.insert(5)
A.split(B)
// A = {1,4,3}, B={2,5} in any order

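I need the structure to be as fast as possible, preferably with all four operations in O(1). I doubt it is already implemented in std, so I will implement it myself (in C++11, so std::move can be used).

Note that insert, pop and isEmpty are called roughly ten times more often than split.

I tried some coding with list and vector, but without success: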

#include <vector>
#include <iostream>

// g++ -Wall -g -std=c++11
/*
output:
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
5 6 7 8 9
*/

int main ()
{
        std::vector<int> v1;

        for (int i = 0; i < 10; ++i) v1.push_back(i);

        for (auto i : v1) std::cout << i << " ";
        std::cout << std::endl;

        auto halfway = v1.begin() + v1.size() / 2;
        auto endItr  = v1.end();

        std::vector<int> v2;
        v2.insert(v2.end(),
                std::make_move_iterator(halfway),
                std::make_move_iterator(endItr));

        // sigsegv
        /*
        auto halfway2 = v1.begin() + v1.size() / 2;
        auto endItr2  = v1.end();
        v2.erase(halfway2, endItr2);
        */

        for (auto i : v1) std::cout << i << " ";
        std::cout << std::endl;

        for (auto i : v2) std::cout << i << " ";
        std::cout << std::endl;

        return 0;
}

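Any sample code, ideas, links or anything else useful? Thanks.

Related literature:

How to move the latter half of a vector into another vector? (does not actually work, because of the erase problem)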
http://www.cplusplus.com/reference/iterator/move_iterator/

Answer 1

The erase problem is caused by a bug in your code.

// sigsegv
auto halfway2 = v1.begin() + v1.size() / 2;
auto endItr2  = v1.end();
v2.erase(halfway2, endItr2);

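You are trying to erase from v2 using iterators that point into v1. That cannot work; you probably meant to call erase on v1.

This fixes the erase problem when splitting the vector, and a vector looks like the best container for what you want.

Note that everything except split can be done in (amortized) O(1) on a vector if you only insert at the end, and since order does not matter to you I see no problem with that. split will be O(n) in your implementation once you fix it, but it should still be very fast, because the elements sit right next to each other in the vector, which is very cache friendly.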
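To make the fix concrete, here is a minimal sketch of the four operations on top of std::vector. The class name SplittableBag and its member names are made up for illustration and do not come from the question; the sketch also assumes that pop is never called on an empty container and that split is never called with the container itself as the target.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical wrapper (name chosen for this sketch only).
template <class T>
class SplittableBag {
    std::vector<T> items_;

public:
    bool isEmpty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

    // Amortized O(1): always append at the back, since order is irrelevant.
    void insert(T value) { items_.push_back(std::move(value)); }

    // O(1): remove and return the last element.
    // Assumption: the caller has checked !isEmpty().
    T pop() {
        T value = std::move(items_.back());
        items_.pop_back();
        return value;
    }

    // O(n): move the back half into `other`, then erase it from this vector.
    // Assumption: `other` is a different object than *this.
    void split(SplittableBag &other) {
        auto halfway = items_.begin() + items_.size() / 2;
        other.items_.insert(other.items_.end(),
                            std::make_move_iterator(halfway),
                            std::make_move_iterator(items_.end()));
        // The iterators belong to items_, so erase must be called on items_.
        items_.erase(halfway, items_.end());
    }
};
```

With five inserted elements, split leaves two of them in the source and moves three to the target, which is within the +/- 20% tolerance from the question.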

Answer 2

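I cannot think of a solution with all operations in O(1).

With a list you can have push and pop in O(1) and split in O(n) (because you need to find the middle of the list).

With a balanced binary tree (not a search tree) you can have all operations in O(log n).

Edit

There have been some suggestions that keeping track of the middle of the list would yield O(1). This is not the case: when you split, you have to compute the middle of the left list and the middle of the right list, which is O(n).

Other suggestions were that a vector is preferable simply because it is cache friendly. I totally agree with that.

For fun, I implemented a balanced binary tree container that performs all operations in O(log n). insert and pop are obviously O(log n). The actual split is O(1), but we are left with the root node, whose value has to be inserted into one of the halves, which makes split O(log n). However, the split itself copies no elements other than that root value.

Here is my attempt at such a container (I have not thoroughly tested it for correctness, and it can be optimized further, e.g. by turning the recursion into a loop).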
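As a rough illustration of the list variant, the sketch below splits a std::list by walking to the middle and relinking the back half with splice. The helper name splitList is an assumption made for this example and is not from the thread; push_back/pop_back (or push_front/pop_front) would provide the O(1) insert and pop.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical helper (not from the thread): moves the back half of `from`
// to the end of `to` without copying or moving any element.
// Assumption: `from` and `to` are different lists.
template <class T>
void splitList(std::list<T> &from, std::list<T> &to) {
    // Walking to the middle is the O(n) step mentioned above.
    auto middle = std::next(from.begin(), from.size() / 2);
    // splice relinks the nodes in [middle, end) into `to`.
    to.splice(to.end(), from, middle, from.end());
}
```

No element is copied here, but each node is a separate allocation, so traversing to the middle is much less cache friendly than the same walk over a vector.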

#include <cstddef>    // std::size_t
#include <iostream>
#include <memory>
#include <stdexcept>  // std::logic_error
#include <utility>

template <class T>
class BalancedBinaryTree {
  private:
    class Node;

    std::unique_ptr<Node> root_;

  public:
    void insert(const T &data) {
      if (!root_) {
        root_ = std::unique_ptr<Node>(new Node(data));
        return;
      }
      root_->insert(data);
    }

    std::size_t getSize() const {
      if (!root_) {
        return 0;
      }
      return 1 + root_->getLeftCount() + root_->getRightCount();
    }

    // Tree must not be empty!!
    T pop() {
      if (root_->isLeaf()) {
        T temp = root_->getData();
        root_ = nullptr;
        return temp;
      }
      return root_->pop()->getData();
    }

    BalancedBinaryTree split() {
      if (!root_) {
        return BalancedBinaryTree();
      }

      BalancedBinaryTree left_half;
      T root_data = root_->getData();
      bool left_is_bigger = root_->getLeftCount() > root_->getRightCount();

      left_half.root_ = std::move(root_->getLeftChild());
      root_ = std::move(root_->getRightChild());

      if (left_is_bigger) {
        insert(root_data);
      } else {
        left_half.insert(root_data);
      }

      return left_half;  // a returned local is moved implicitly; std::move would only inhibit copy elision
    }
};


template <class T>
class BalancedBinaryTree<T>::Node {
  private:
    T data_;
    std::unique_ptr<Node> left_child_, right_child_;
    std::size_t left_count_ = 0;
    std::size_t right_count_ = 0;

  public:
    Node() = default;
    Node(const T &data, std::unique_ptr<Node> left_child = nullptr,
         std::unique_ptr<Node> right_child = nullptr)
        : data_(data), left_child_(std::move(left_child)),
         right_child_(std::move(right_child)) {
    }

    bool isLeaf() const {
      return left_count_ + right_count_ == 0;
    }

    const T& getData() const {
      return data_;
    }
    T& getData() {
      return data_;
    }

    std::size_t getLeftCount() const {
      return left_count_;
    }

    std::size_t getRightCount() const {
      return right_count_;
    }

    std::unique_ptr<Node> &getLeftChild() {
      return left_child_;
    }
    const std::unique_ptr<Node> &getLeftChild() const {
      return left_child_;
    }
    std::unique_ptr<Node> &getRightChild() {
      return right_child_;
    }
    const std::unique_ptr<Node> &getRightChild() const {
      return right_child_;
    }

    void insert(const T &data) {
      if (left_count_ <= right_count_) {
        ++left_count_;
        if (left_child_) {
          left_child_->insert(data);
        } else {
          left_child_ = std::unique_ptr<Node>(new Node(data));
        }
      } else {
        ++right_count_;
        if (right_child_) {
          right_child_->insert(data);
        } else {
          right_child_ = std::unique_ptr<Node>(new Node(data));
        }
      }
    }

    std::unique_ptr<Node> pop() {
      if (isLeaf()) {
        throw std::logic_error("pop invalid path");
      }

      if (left_count_ > right_count_) {
        --left_count_;
        if (left_child_->isLeaf()) {
          return std::move(left_child_);
        }
        return left_child_->pop();
      }

      --right_count_;
      if (right_child_->isLeaf()) {
        return std::move(right_child_);
      }
      return right_child_->pop();
    }
};

Usage:

  BalancedBinaryTree<int> t;
  BalancedBinaryTree<int> t2;

  t.insert(3);
  t.insert(7);
  t.insert(17);
  t.insert(37);
  t.insert(1);

  t2 = t.split();

  while (t.getSize() != 0) {
    std::cout << t.pop() << " ";
  }
  std::cout << std::endl;

  while (t2.getSize() != 0) {
    std::cout << t2.pop() << " ";
  }
  std::cout << std::endl;

Output:

1 17
3 37 7

Answer 3

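If the number of elements/bytes stored in the container at any one time is large, Youda008's solution (using a list and keeping track of the middle) may not be as efficient as you hope.

Alternatively, you could have a list<vector<T>> or even a list<array<T,Capacity>> and keep track of the middle of the list, i.e. split only between two sub-containers, but never split a sub-container. This should give you O(1) for all operations and reasonable cache efficiency. Use array<T,Capacity> if a single value of Capacity serves your needs at all times (for Capacity=1 this reverts to an ordinary list). Otherwise, use vector<T> and adapt the capacity of new vectors according to demand.

bolov has correctly pointed out that finding the middle of the lists that emerge from splitting a list is not O(1). This implies that keeping track of the middle is not useful. However, using a list<sub_container> is still faster than a plain list, because split only costs O(n / Capacity) rather than O(n). The price you pay is that split has a granularity of Capacity rather than 1, so you have to compromise between the accuracy and the cost of a split.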
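A minimal sketch of the list<vector<T>> idea might look as follows. The class name ChunkedBag, the default Capacity of 64 and the element counter are assumptions made for this illustration, not part of the answer; split relinks whole chunks only, so its accuracy is limited to the chunk granularity described above.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <utility>
#include <vector>

// Hypothetical sketch; ChunkedBag and the default Capacity are illustrative.
template <class T, std::size_t Capacity = 64>
class ChunkedBag {
    std::list<std::vector<T>> chunks_;  // invariant: no chunk is ever empty
    std::size_t size_ = 0;

public:
    bool isEmpty() const { return size_ == 0; }
    std::size_t size() const { return size_; }

    // Amortized O(1): fill the last chunk, start a new one when it is full.
    void insert(T value) {
        if (chunks_.empty() || chunks_.back().size() == Capacity) {
            chunks_.emplace_back();
            chunks_.back().reserve(Capacity);
        }
        chunks_.back().push_back(std::move(value));
        ++size_;
    }

    // O(1): take from the last chunk and drop the chunk once it is empty.
    // Assumption: the caller has checked !isEmpty().
    T pop() {
        std::vector<T> &last = chunks_.back();
        T value = std::move(last.back());
        last.pop_back();
        if (last.empty()) chunks_.pop_back();
        --size_;
        return value;
    }

    // O(n / Capacity): relink the back half of the chunks into `other`.
    // Assumption: `other` is a different object than *this.
    void split(ChunkedBag &other) {
        auto middle = std::next(chunks_.begin(), chunks_.size() / 2);
        std::size_t moved = 0;
        for (auto it = middle; it != chunks_.end(); ++it) moved += it->size();
        other.chunks_.splice(other.chunks_.end(), chunks_, middle, chunks_.end());
        size_ -= moved;
        other.size_ += moved;
    }
};
```

Note that a container holding a single chunk hands that whole chunk to the target on split, which is the accuracy-versus-cost trade-off in its most extreme form; a smaller Capacity gives more even splits at a higher cost.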

Answer 4

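Another option is to implement your own container using a linked list plus a pointer to the middle element, which is where you want to split it. This pointer is updated on every modifying operation. That way you can achieve O(1) complexity for all operations.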

Summary: 4 solutions

Solution 1: score 3 (accepted), 2014-04-26 09:57:38
Solution 2: score 2, 2014-04-26 10:04:27
Solution 3: score 1, 2014-04-26 11:34:29
Solution 4: score 0, 2014-04-26 11:21:15
