I am having trouble getting my binary search tree remove function to work. No matter what I do, it behaves strangely instead of removing the node correctly.
The node struct and root_ are declared in another header file (nodes are set up in the usual configuration: left pointer, right pointer, and stored value).
void remove(int value){
node* n = root_;
node* nClone = nullptr;
while (n != nullptr) {//constant checker to ensure that a
if (value > n->value_) { // if value is larger than value stored within node n it will descend further down right
nClone = n; //stores n before continuing
n = n->rhs_;
} else if (value < n->value_) { // if value is less than value stored within node n it will descend further down left
nClone = n; //stores n before continuing
n = n->lhs_;
} else { //if it is equal to the value (there are no other possible outcomes so an else would work) check if there are any subsequent leaves attached
if (n->lhs_ == nullptr && n->rhs_ == nullptr) { //if both left and right are empty (I.E. no leaves attached to node) set n to nullptr and then delete using free();
nClone->lhs_ = nullptr; //stores both left
nClone->rhs_ = nullptr; // and right leaves as nullptr
free(n); //frees n
n = nullptr;
count_--;//decreases count_/size counter by 1
return; //exits from function as there is nothing more to do
} else if (n->lhs_ == nullptr || n->rhs_ == nullptr) { //if n has one connection whether it be on the left or right it stores itself in nClone and then deletes n
if (n->lhs_ != nullptr) { //if statement to check if left leaf of n exists
nClone->lhs_ = n->lhs_; //if it does it stores n's left leaf in nClone
} else { //if it doesn't have anything stored in the left then there is guaranteed to be one on the right
nClone->rhs_ = n->rhs_; //stores n's right leaf in nClone
}
free(n);
count_--; //decreases count_/size counter by 1
return; //exits from function as there is nothing more to do
} else {
//for preorder succession
node* nSuc = n->rhs_; //stores right leaf of n in nSuc
while (nSuc->lhs_ != nullptr) { //look for successor
nSuc = nSuc->lhs_;
}
n->value_ = nSuc->value_;
free(n);
count_--;
return;
}
}
}
}
Your code is written as if it were C. C++ is a different language, and you can and should leverage it to your advantage.
The following text is interspersed with a complete, compilable example that you can try online :) It is of course not the only way to implement a tree, and I had to gloss over various details that would be needed to make it a full-featured solution. A "real life" implementation may be much more complex, since the "textbook" approach usually doesn't mix well with cache hierarchies on modern CPUs, and a tree like this would be rather slow compared to state-of-the-art implementations. But it does help, I think, to bridge the gap between the pervasive "C-like" way of thinking about trees and what modern C++ brings.
Again: This example is minimal, it doesn't do many of the things that would be needed in common practice, but it at least points in the direction away from C, and towards C++, and that's what I intended.
First, let's have a Node type that uses owning pointers for the child nodes. These pointers automatically manage memory and essentially prevent you from making mistakes that would leak memory or allow use of dangling pointers. The term "owning pointer" means that there's always a well-defined owner: it is the pointer itself. Those pointers cannot be, for example, copied - since then you'd have two owners for the same object, and for that you need shared ownership. But shared ownership is hard to get right, since there must be some protocols in place to ensure that you don't get cyclic references. When building a tree, the "parent" node is the natural owner of a "child" node, and thus unique ownership is precisely what's needed.
// complete example begins
#include <cassert>
#include <memory>
using Value = int;
struct Node {
std::unique_ptr<Node> lhs_;
std::unique_ptr<Node> rhs_;
Value value_;
explicit Node(int value) : value_(value) {}
};
// cont'd.
We also should have a Tree that owns the root node, and keeps the node count:
// cont'd.
struct Tree {
std::unique_ptr<Node> root_;
int count_ = 0;
};
// cont'd.
When operating on such data structures, you frequently want to have access not only to the value of the node pointer, but also to the pointer itself so that you can modify it. So, we need some sort of a "node reference" that mostly behaves like Node* would, but which, internally, also carries the address of the pointer to the node, so that eg the node could be replaced:
// cont'd.
class NodeRef {
std::unique_ptr<Node> *owner;
public:
NodeRef() = delete;
NodeRef(std::unique_ptr<Node> &o) : owner(&o) {}
Node *get() const { return owner->get(); }
// Use -> or * to access the underlying node
Node *operator->() const { return get(); }
Node &operator*() const { return *get(); }
// In boolean contexts, it's true if the Node exists
explicit operator bool() const { return bool(*owner); }
// Replace the Node (if any) with some other one
void replace(std::unique_ptr<Node> &&newNode) {
*owner = std::move(newNode);
}
NodeRef &operator=(std::unique_ptr<Node> &val) {
owner = &val;
return *this;
}
};
// cont'd.
NodeRef holds a pointer to the owner of the node (the owner is the owning pointer type std::unique_ptr).
The following are the ways that you can use NodeRef as if it were Node*:
NodeRef node = ...;
node->value_ // access to pointed-to Node using ->
(*node).value_ // access to pointed-to Node using *
if (node) ... // null check
node = otherNode; // assignment from another node (whether owner or NodeRef)
And the following would be the way that NodeRef behaves similar to std::unique_ptr<Node> &, ie like a reference to the node owner, allowing you to alter the ownership:
Tree tree;
NodeRef root = tree.root_; // reference the root of the tree
root.replace(std::make_unique<Node>(2)); // replace the root with a new node
Note that this code performs all the necessary memory allocation and deallocation thanks to the power of std::unique_ptr and move semantics. There are no new, delete, malloc nor free statements anywhere. And, also, the performance is on par with manual allocations - this code does not use any sort of garbage collection or reference counting. std::unique_ptr is a tool that lets you leverage the compiler to write memory allocation and deallocation code for you, in a way that's guaranteed to be correct.
But NodeRef is not an "observing" pointer, ie if the owner of the node it points to suddenly disappears, then NodeRef becomes dangling. To do otherwise would have more overhead, and would require the use of some tracking pointers, eg shared_ptr and weak_ptr, or a bespoke solution - certainly out of scope here.
And thus NodeRef fulfills the typical requirements that make the actual tree management code much easier to write, understand, and maintain with reduced potential for errors. This approach facilitates code that is correct by design, ie where mistakes that would cause undefined behavior are mostly caught by the compiler, or impossible to write.
Let's see how a binary search for a node would look, using the types we introduced above:
// cont'd
// Finds the owner of a node that contains a given value,
// or the insertion point where the value would be
NodeRef find(Tree &tree, const Value &value)
{
NodeRef node = tree.root_;
while (node) {
if (value < node->value_)
node = node->lhs_;
else if (node->value_ < value)
node = node->rhs_;
else
break; // we found the value we need
}
return node;
}
// cont'd
First, let's note that while the returned node reference can be null, it doesn't mean that it's "useless". A NodeRef is never "completely" null, and must always refer to some node owner - that's why the default constructor is deleted, so you can't create an invalid NodeRef by mistake. It is the node that can be null, not the underlying reference to the owning pointer to the node.
Notice how similar the code is to a version that would use Node *, yet it is more powerful. Since this version of find returns a NodeRef, we can use this reference to replace the node (or set it for the first time if it was null), whereas the signature Node *find(Node *root, const Value &value) would only give us access to the node itself, but not to its owner. And, in case the node wasn't found, it would return a null pointer - not bringing us any closer to knowing where to insert the new node, and discarding the work done to find such an insertion point (!).
NodeRef gives us a circumspect access to the parent node: it doesn't expose the entire parent node, but just the owning pointer which owns the given node - and it's also more general than a "parent" node would be, since the owning pointer does not even need to be held by a Node type. And indeed, NodeRef works just fine when a node's owner is in the Tree class, or it could refer to a stand-alone pointer as well:
std::unique_ptr<Node> myNode;
NodeRef node = myNode;
// The two lines below are equivalent - both change the `myNode` owning pointer
node.replace(std::make_unique<Node>(42));
myNode = std::make_unique<Node>(42);
In principle, there could be a NodeRef &NodeRef::operator=(std::unique_ptr<Node> &&), ie a way to move-assign the node itself, but this would hide the important fact that NodeRef doesn't really own the node, but only refers to some owner, and the replace method makes this more explicit: we are replacing the node held by the owner.
Now we can implement the function you sought: node removal. This function takes a NodeRef, and modifies the subtree rooted at that node, so that the original node is removed:
// cont'd
// Removes the given node. Returns true if the node was removed, or false if
// there was nothing to remove
bool remove(NodeRef node)
{
for (;;) {
if (!node) return false; // the node is empty, nothing to do
if (!node->lhs_) {
// replace the node with its sole right child, if any
node.replace(std::move(node->rhs_));
return true;
}
else if (!node->rhs_) {
// replace the node with its sole left child, if any
node.replace(std::move(node->lhs_));
return true;
}
else {
// node has two children
// 1. take on the largest value in the left subtree
// oldValue is a *reference* to the value of the node being replaced
Value &oldValue = node->value_;
node = node->lhs_;
while (node->rhs_) node = node->rhs_;
// we found the node with a replacement value - substitute it for
// the old value
oldValue = std::move(node->value_);
// 2. remove that child - continue the removal loop
continue;
// instead of continue, we could also do
// return remove(node);
// but by continuing we don't have recursion, and we leverage
// the fact that `node` references the correct node to remove
}
}
}
// cont'd
We std::move the values - this is not important at all when dealing with "simple" value types like integers, but would be important if, for example, the Value was a type that can only be moved but not copied, eg using Value = std::unique_ptr<SomeType>;.
And now the helper that manages node removal in the Tree:
// cont'd
void remove(Tree &tree, const Value& value)
{
auto node = find(tree, value);
if (remove(node))
-- tree.count_;
}
// cont'd
Instead of const Value &value we could have had int value, but this way it's a more generic approach that would work with other Value types.
Node insertion is also fairly easy, since find already provides the insertion point where the value would be, were it to exist:
// cont'd
bool insert(Tree &tree, const Value& value)
{
auto node = find(tree, value);
if (node) {
// Such a value already exists
assert(node->value_ == value);
return false;
} else {
// Insert new value
node.replace(std::make_unique<Node>(value));
++ tree.count_;
return true;
}
}
// cont'd
If Value was a non-copyable type, then we'd need an insert signature that takes an rvalue reference, ie bool insert(Tree &tree, Value &&value).
Now you may ask: how would we "walk" the tree? In C++, the idiomatic way to deal with collections of items is via iterators, and then one can use the so-called range-for loop. The following example prints out the elements of a vector:
std::vector<int> values{1,2,3,4,5};
for (int val : values)
std::cout << val << "\n";
When iterating, or "walking" the tree, we need some "breadcrumbs" to leave behind us, so that we can find our way back up the tree. Those need to reference the node, as well as whether the node was visited or traversed:
// cont'd
#include <functional>
#include <stack>
#include <vector>
// An entry in the node stack used to iterate ("walk") the tree
struct BreadCrumb {
NodeRef node;
bool visited = false; // was this node visited?
bool traversedLeft = false; // was the left child descended into?
bool traversedRight = false; // was the right child descended into?
BreadCrumb(std::unique_ptr<Node> &owner) : node(owner) {}
BreadCrumb(NodeRef node) : node(node) {}
Node *operator->() const { return node.get(); }
explicit operator bool() const { return bool(node); }
};
// cont'd
The "path" that we walk down the tree is kept on a stack dedicated for this purpose:
// cont'd
// A stack holds the path to the current node
class NodeStack {
// Top of stack is the current node
std::stack<BreadCrumb, std::vector<BreadCrumb>> m_stack;
public:
NodeStack() = default;
NodeStack(NodeRef n) { if (n) m_stack.push(n); }
bool empty() const { return m_stack.empty(); }
// The breadcrumb that represents the top of stack, and thus the current node
BreadCrumb &crumb() { return m_stack.top(); }
const BreadCrumb &crumb() const { return m_stack.top(); }
NodeRef node() { return crumb().node; }
Node *node() const { return empty() ? nullptr : crumb().node.get(); }
void push(NodeRef n) { m_stack.push(n); }
// Visit and mark the node if not visited yet
bool visit() {
if (crumb().visited) return false;
crumb().visited = true;
return true;
}
// Descend one level via the left edge if not traversed left yet
bool descendLeft() {
if (crumb().traversedLeft) return false;
crumb().traversedLeft = true;
auto &n = crumb()->lhs_;
if (n) m_stack.push(n);
return bool(n);
}
// Descends one level via right edge if not traversed right yet
bool descendRight() {
if (crumb().traversedRight) return false;
crumb().traversedRight = true;
auto &n = crumb()->rhs_;
if (n) m_stack.push(n);
return bool(n);
}
// Ascends one level
bool ascend() {
m_stack.pop();
return !empty();
}
};
// cont'd
The tree traversal operations are abstracted away in the stack, so that the remaining code is higher level and devoid of such details.
Now we can implement a node iterator that uses the stack to keep its trail of breadcrumbs:
// cont'd
// Traversal order used by the iterator
enum class Order { In, Pre, Post };
// Node Forward Iterator - iterates the nodes in given order
class NodeIterator {
using Advancer = void (NodeIterator::*)();
NodeStack m_stack; // Breadcrumb path to the current node
Advancer m_advancer; // Method that advances to next node in chosen order
Order m_order = Order::In;
public:
NodeIterator() = default;
// Dereferencing operators
Node& operator*() { return *m_stack.node(); }
Node* operator->() { return m_stack.node().get(); }
// Do the iterators both point to the same node (or no node)?
bool operator==(const NodeIterator &other) const {
return m_stack.node() == other.m_stack.node();
}
bool operator==(decltype(nullptr)) const { return !bool(m_stack.node()); }
bool operator!=(const NodeIterator &other) const { return !(*this == other); }
bool operator!=(decltype(nullptr)) const { return bool(m_stack.node()); }
NodeIterator(NodeRef n, Order order = Order::In) : m_stack(n) {
setOrder(order);
if (n) operator++(); // Start the traversal
}
void setOrder(Order order) {
if (order == Order::In)
m_advancer = &NodeIterator::advanceInorder;
else if (order == Order::Pre)
m_advancer = &NodeIterator::advancePreorder;
else if (order == Order::Post)
m_advancer = &NodeIterator::advancePostorder;
m_order = order;
}
NodeIterator &operator++() { // Preincrement operator
assert(!m_stack.empty());
std::invoke(m_advancer, this);
return *this;
}
// No postincrement operator since it'd need to copy the stack and thus
// be way too expensive to casually expose via postincrement.
void advanceInorder();
void advancePreorder();
void advancePostorder();
bool goLeft() { return m_stack.descendLeft(); }
bool goRight() { return m_stack.descendRight(); }
};
// cont'd
Remember the stack? It lets us describe the in-, pre- and post-order traversal rather succinctly:
// cont'd
void NodeIterator::advanceInorder() {
for (;;) {
if (m_stack.descendLeft())
continue;
if (m_stack.visit())
break;
if (m_stack.descendRight())
continue;
if (m_stack.ascend())
continue;
assert(m_stack.empty());
break;
}
}
void NodeIterator::advancePreorder() {
for (;;) {
if (m_stack.visit())
break;
if (m_stack.descendLeft())
continue;
if (m_stack.descendRight())
continue;
if (m_stack.ascend())
continue;
assert(m_stack.empty());
break;
}
}
void NodeIterator::advancePostorder() {
for (;;) {
if (m_stack.descendLeft())
continue;
if (m_stack.descendRight())
continue;
if (m_stack.visit())
break;
if (m_stack.ascend())
continue;
assert(m_stack.empty());
break;
}
}
// cont'd
And now we'd want some easy way to use this iterator when we'd wish to iterate a tree rooted in some node:
// cont'd
class TreeRangeAdapter {
NodeRef m_root;
Order m_order;
public:
TreeRangeAdapter(NodeRef root, Order order) :
m_root(root), m_order(order) {}
NodeIterator begin() const { return {m_root, m_order}; }
constexpr auto end() const { return nullptr; }
};
auto inOrder(NodeRef node) { return TreeRangeAdapter(node, Order::In); }
auto preOrder(NodeRef node) { return TreeRangeAdapter(node, Order::Pre); }
auto postOrder(NodeRef node) { return TreeRangeAdapter(node, Order::Post); }
// cont'd
And how would all that work? This is but a simple example of filling up a tree, and in-order traversal:
// cont'd
#include <iostream>
#include <cstdlib>
int main() {
Tree tree;
for (int i = 0; i < 10; ++i) insert(tree, rand() / (RAND_MAX/100));
for (auto &node : inOrder(tree.root_)) {
std::cout << node.value_ << " ";
}
std::cout << "\n";
}
// complete example ends
Output:
19 27 33 39 55 76 78 79 84 91