
C++: Binary search tree end() iterator

I have a basic implementation of a BST (no randomization, no rebalancing, etc.). I want to add iterators and make the BST usable in a range-based for loop. So I need begin() and end() member functions and iterator incrementing.

I understand what begin() should do: return an iterator to the bottom-left-most node. This thread discusses different possibilities for traversing the BST (that is, for incrementing the iterator).

But end() is supposed to return an iterator to the one-past-the-last element. That is my actual question: what does "one past the last element" mean in the context of a BST?

The end iterator doesn't necessarily have to be one past the last element (that makes sense for vectors, but less so for trees, for example). It just has to be an iterator that can clearly be identified as not referring to any element, and that is used to indicate that the end of the data structure has been reached.

Practically, this can be done in several ways, depending on how your iterator refers to what it's pointing to. If it holds a pointer to a tree node, for example, then a null pointer can be used for the end iterator.
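As a minimal sketch of the null-pointer approach, assuming each node also stores a parent pointer so the iterator can climb back up the tree: the names Bst, Node and insert below are hypothetical and not taken from the question's code. Incrementing past the largest element walks off the root and leaves the iterator holding nullptr, which is exactly what end() returns.

```cpp
#include <cassert>

// Hypothetical minimal BST of ints; all names here are illustrative.
class Bst {
    struct Node {
        int value;
        Node* left = nullptr;
        Node* right = nullptr;
        Node* parent = nullptr;  // assumption: nodes know their parent
        Node(int v, Node* p) : value(v), parent(p) {}
    };
    Node* root_ = nullptr;

    static void destroy(Node* n) {
        if (!n) return;
        destroy(n->left);
        destroy(n->right);
        delete n;
    }

public:
    class iterator {
        Node* n_;  // nullptr means "end"
    public:
        explicit iterator(Node* n = nullptr) : n_(n) {}
        const int& operator*() const {
            assert(n_ && "cannot dereference end()");
            return n_->value;
        }
        // Move to the in-order successor; becomes nullptr after the last node.
        iterator& operator++() {
            assert(n_ && "cannot increment end()");
            if (n_->right) {
                n_ = n_->right;
                while (n_->left) n_ = n_->left;
            } else {
                Node* p = n_->parent;
                while (p && n_ == p->right) { n_ = p; p = p->parent; }
                n_ = p;
            }
            return *this;
        }
        bool operator==(const iterator& o) const { return n_ == o.n_; }
        bool operator!=(const iterator& o) const { return n_ != o.n_; }
    };

    Bst() = default;
    Bst(const Bst&) = delete;
    Bst& operator=(const Bst&) = delete;
    ~Bst() { destroy(root_); }

    void insert(int v) {
        Node** link = &root_;
        Node* parent = nullptr;
        while (*link) {
            parent = *link;
            if (v == parent->value) return;  // duplicates are ignored
            link = v < parent->value ? &parent->left : &parent->right;
        }
        *link = new Node(v, parent);
    }

    iterator begin() const {
        Node* n = root_;
        while (n && n->left) n = n->left;  // bottom-left-most node
        return iterator(n);
    }
    iterator end() const { return iterator(nullptr); }
};
```

With this in place, `for (int v : tree)` visits the values in ascending order. One limitation of a bare null pointer is that `--end()` cannot work, because the end iterator has no way to find the last node.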

A very simple scheme, which costs two extra pointers per node, is to overlay a doubly-linked, circular list on top of the BST. Your end() iterator then simply points to a sentinel node. It also makes your iterator increment/decrement very simple.

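// Assumes each node carries a `next` link to its in-order successor;
// `n` is the node the iterator currently refers to.
BST::iterator &
BST::iterator::operator++() {
  n = n->next;  // the last element's next is the sentinel, i.e. end()
  return *this;
}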

etc. Note that using a sentinel like this means that the end iterator requires no special treatment. You can decrement it and get exactly the correct behavior.
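A fuller sketch of the overlay idea might look like the following. It is only one possible layout, and the names ThreadedBst, Link and link_before are made up for illustration. It relies on the fact that a freshly inserted leaf sits immediately before its parent in sorted order if it is a left child, and immediately after if it is a right child, so splicing it into the list takes constant time.

```cpp
#include <cassert>

// Hypothetical BST of ints with a circular doubly-linked list threaded
// through the nodes in sorted order; the sentinel doubles as end().
class ThreadedBst {
    struct Link {
        Link* prev;
        Link* next;
        Link() : prev(this), next(this) {}  // a lone link points at itself
    };
    struct Node : Link {
        int value;
        Node* left = nullptr;
        Node* right = nullptr;
        explicit Node(int v) : value(v) {}
    };

    Node* root_ = nullptr;
    Link sentinel_;  // empty tree: sentinel_.next == &sentinel_

    // Splice n into the list immediately before pos.
    static void link_before(Link* pos, Link* n) {
        assert(pos->prev->next == pos && "list invariant broken");
        n->prev = pos->prev;
        n->next = pos;
        pos->prev->next = n;
        pos->prev = n;
    }

public:
    class iterator {
        Link* n_;
    public:
        explicit iterator(Link* n) : n_(n) {}
        // Precondition: not the end iterator (the sentinel holds no value).
        const int& operator*() const { return static_cast<Node*>(n_)->value; }
        iterator& operator++() { n_ = n_->next; return *this; }
        iterator& operator--() { n_ = n_->prev; return *this; }
        bool operator==(const iterator& o) const { return n_ == o.n_; }
        bool operator!=(const iterator& o) const { return n_ != o.n_; }
    };

    ThreadedBst() = default;
    ThreadedBst(const ThreadedBst&) = delete;
    ThreadedBst& operator=(const ThreadedBst&) = delete;
    ~ThreadedBst() {
        Link* n = sentinel_.next;
        while (n != &sentinel_) {
            Link* next = n->next;
            delete static_cast<Node*>(n);
            n = next;
        }
    }

    void insert(int v) {
        if (!root_) {
            root_ = new Node(v);
            link_before(&sentinel_, root_);
            return;
        }
        Node* p = root_;
        for (;;) {
            if (v == p->value) return;  // duplicates are ignored
            if (v < p->value) {
                if (!p->left) {
                    p->left = new Node(v);
                    link_before(p, p->left);  // new left leaf precedes p
                    return;
                }
                p = p->left;
            } else {
                if (!p->right) {
                    p->right = new Node(v);
                    link_before(p->next, p->right);  // new right leaf follows p
                    return;
                }
                p = p->right;
            }
        }
    }

    iterator begin() { return iterator(sentinel_.next); }
    iterator end() { return iterator(&sentinel_); }
};
```

Because the list is circular, `--end()` lands on the largest element and an empty tree has `begin() == end()` without any special cases.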

Despite my comment, Sander De Dycker has the right idea. I have another way to think about it.

All containers that support iterators have a logical ordering. For vector the ordering is based on how the inserts were done - the index/subscript ordering. For map and set it's based on the key ordering. For multimap and multiset it's a bit of both. For unordered_map etc the claim is very tenuous, but I can still argue about hash algorithms and collision handling.

In a logical ordering, you can refer to ordered elements, but sometimes it makes sense to refer to the boundaries between each element. Logically (and in some cases even for the implementation) this works out fairly conveniently...

|     |     |     |     |     |     |     |     |
| +-+ | +-+ | +-+ | +-+ | +-+ | +-+ | +-+ | +-+ |
| |0| | |1| | |2| | |3| | |4| | |5| | |6| | |7| |
| +-+ | +-+ | +-+ | +-+ | +-+ | +-+ | +-+ | +-+ |
|     |     |     |     |     |     |     |     |
0     1     2     3     4     5     6     7     8

You decide where the zero "bound" goes independently of where the zero item goes, but you always get a simple addition/subtraction relationship. If the least bound is numbered the same as the least element, the last bound is numbered one more than the last element. Hence end as one past the final element.

In a binary tree implementation, each node can be considered to have two bounds - one either side of the element. In this scheme, every bound except begin and end occurs twice. You can represent bound 1 using the RHS of element 0 or the LHS of element 1. So in principle you can use a node pointer and a flag. Rather than have two representations for most bounds, though, you'll probably choose the most convenient one where possible - the one where you're not just referring to the right bound but also referring to the element you want to see when you dereference. That means the flag will only be set when referring to end, in which case you shouldn't support dereference anyway.

IOW, following through this logic tells you that you don't really need to follow through this logic, though I think it's still a useful mental model. All you really need is an identifiable representation for end. Perhaps it's useful for that representation to include a pointer to the final node (as a starting point for, e.g., decrementing that iterator). Perhaps there are situations where it's convenient to have pseudo-iterators internally that recognize the two equivalent bounds as distinct.
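To make the pointer-plus-flag idea concrete, here is a rough sketch under a few assumptions: nodes carry parent pointers, and the names Node, BoundIterator, begin_of and end_of are invented for this example. The flag is set only for end, which is stored as "the right-hand bound of the final node", so decrementing end just clears the flag.

```cpp
#include <cassert>

// Hypothetical node type; parent pointers are an assumption of this sketch.
struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(int v) : value(v) {}
};

inline Node* leftmost(Node* n)  { while (n && n->left)  n = n->left;  return n; }
inline Node* rightmost(Node* n) { while (n && n->right) n = n->right; return n; }

inline Node* successor(Node* n) {
    if (n->right) return leftmost(n->right);
    Node* p = n->parent;
    while (p && n == p->right) { n = p; p = p->parent; }
    return p;  // nullptr if n was the last node
}

inline Node* predecessor(Node* n) {
    if (n->left) return rightmost(n->left);
    Node* p = n->parent;
    while (p && n == p->left) { n = p; p = p->parent; }
    return p;  // nullptr if n was the first node
}

class BoundIterator {
    Node* n_;
    bool past_;  // true only for end: the bound on the right of n_
public:
    BoundIterator(Node* n, bool past) : n_(n), past_(past) {}

    const int& operator*() const {
        assert(!past_ && "cannot dereference end");
        return n_->value;
    }
    BoundIterator& operator++() {
        assert(!past_ && "cannot increment end");
        Node* next = successor(n_);
        if (next) n_ = next;
        else past_ = true;  // stay on the final node, but flag the right bound
        return *this;
    }
    BoundIterator& operator--() {
        if (past_) {
            assert(n_ && "cannot decrement end of an empty tree");
            past_ = false;  // end steps back onto the final node
        } else {
            n_ = predecessor(n_);
            assert(n_ && "cannot decrement begin");
        }
        return *this;
    }
    bool operator==(const BoundIterator& o) const {
        return n_ == o.n_ && past_ == o.past_;
    }
    bool operator!=(const BoundIterator& o) const { return !(*this == o); }
};

// For an empty tree both functions return (nullptr, true), so begin == end.
inline BoundIterator begin_of(Node* root) {
    Node* n = leftmost(root);
    return BoundIterator(n, n == nullptr);
}
inline BoundIterator end_of(Node* root) {
    return BoundIterator(rightmost(root), true);
}
```

The trade-off against a plain null pointer is one extra bool per iterator in exchange for a decrementable end. Note that end_of walks to the rightmost node each time it is called, so a real container would probably cache that pointer.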

Similar but slightly different models and choices arise thinking about eg multiway trees where each node contains an array of elements.

Basically, I think it's useful to mentally recognise bound positions as distinct from but related to item positions, but that mental model shouldn't constrain your implementation choices - it may inspire alternatives, but it's just a mental model.
