Why is the number of array accesses during the union and find operations said to be of the order lg(N) in Weighted QuickUnion?

Question

Here is the general code for the Quick Union algorithm:

public class QuickUnionUF
{
 private int[] id;
 public QuickUnionUF(int N)
 {
   id = new int[N];
   for (int i = 0; i < N; i++) id[i] = i;
 }
 private int root(int i)
 {
   while (i != id[i]) i = id[i];
   return i;
 }
 public boolean connected(int p, int q)
 {
   return root(p) == root(q);
 }
 public void union(int p, int q)
 {
   int i = root(p);
   int j = root(q);
   id[i] = j;
 }
}

Here, for union-find operations like "initialize", "union" and "find if connected", the number of array accesses is of the order of N in the worst case (which is quite clear from the code).

However, my book claims that if we modify QuickUnion into Weighted QuickUnion, the number of array accesses becomes of the order of lg(N). But I can't see how from the code.

The only change that has been made for Weighted QuickUnion is the part within the union() function, as follows:

 int i = root(p);
 int j = root(q);
 if (i == j) return;
 if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
 else { id[j] = i; sz[i] += sz[j]; }

Here we maintain an extra array sz[], where sz[i] counts the number of objects in the tree rooted at i.
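For reference, the fragment can be dropped into a complete class along the following lines. This is only a minimal sketch: the class name WeightedQuickUnionUF and the initialization of every sz[i] to 1 are assumptions, since only the body of union() is shown above.

```java
// Sketch: the weighted union() fragment placed in a complete class.
// Assumption: sz[] starts at 1 for every element (each tree has one node).
class WeightedQuickUnionUF
{
  private final int[] id;  // id[i] = parent of i; a root is its own parent
  private final int[] sz;  // sz[r] = number of nodes in the tree rooted at r

  WeightedQuickUnionUF(int N)
  {
    id = new int[N];
    sz = new int[N];
    for (int i = 0; i < N; i++) { id[i] = i; sz[i] = 1; }
  }

  int root(int i)
  {
    while (i != id[i]) i = id[i];
    return i;
  }

  boolean connected(int p, int q)
  {
    return root(p) == root(q);
  }

  void union(int p, int q)
  {
    int i = root(p);
    int j = root(q);
    if (i == j) return;
    // Link the root of the smaller tree below the root of the larger one.
    if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
    else               { id[j] = i; sz[i] += sz[j]; }
  }
}
```

With this class, calling union(0, 1), union(2, 3) and union(1, 3) leaves 0 and 2 connected, while 0 and 4 stay in separate trees.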

But here, I don't see how the number of array accesses for union is of the order lg(N). The array accesses must be of the order N, as we have to call the root() method twice. Also, how is it of the order lg(N) even for the "find if connected" operation?

I'm confused as to how they are getting lg(N). Could someone please explain?

Answer 1

Sure. It is clear that if the complexity of the root method is of the order of lg(n), then union will be of the order of lg(n) as well.

The weighted quick union guarantees that lg(n) complexity. Here is how:

WQU (the weighted quick union algorithm) stores the elements as several tree structures. The root method finds the root of the tree that contains the element i, so its complexity is bounded by the maximum height of such a tree.

Now let h(i) be the height of the tree which contains element i and w(i) the size (weight) of that tree. We claim that h(i) <= lg(w(i)) always holds; it holds initially, since a single node has height 0 and weight 1. Let's see what happens when we make the union of 2 trees (let them be i and j).

Since we are binding one tree to the root of another, the height of the new tree can be at most max(h(i), h(j)) + 1.

Let's say that w(i) <= w(j), so we bind i to the root of j. If h(j) > h(i), we have nothing to worry about: the height stays h(j) <= lg(w(j)) <= lg(w(i) + w(j)). If not, the new height is h(i) + 1, and:

w(i) + w(j) >= 2 * w(i) => lg(w(i) + w(j)) >= lg(2 * w(i)) = 1 + lg(w(i)) >= 1 + h(i)

So h(i) + 1 <= lg(new size), and the constraint is preserved (the height of the new tree is at most lg of the new weight). This means that, in the worst case, with a tree of size N, the root method needs at most lg(N) steps.
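The invariant can also be checked empirically. The sketch below is not a proof, only a sanity check: it performs random weighted unions and verifies after each one that no element sits deeper than lg of its tree's size. The class name, the method name and the parameters (n, unions, seed) are made up for illustration.

```java
// Sketch: checks depth(x) <= lg(size of x's tree) after every weighted union.
class HeightInvariantCheck
{
  // Returns the largest depth seen, or -1 if the invariant was ever violated.
  static int maxDepthAfterRandomUnions(int n, int unions, long seed)
  {
    int[] id = new int[n];
    int[] sz = new int[n];
    for (int i = 0; i < n; i++) { id[i] = i; sz[i] = 1; }
    java.util.Random rnd = new java.util.Random(seed);
    int maxDepth = 0;
    for (int u = 0; u < unions; u++)
    {
      int i = root(id, rnd.nextInt(n));
      int j = root(id, rnd.nextInt(n));
      if (i != j)
      {
        if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
        else               { id[j] = i; sz[i] += sz[j]; }
      }
      for (int x = 0; x < n; x++)
      {
        int d = depth(id, x);
        int w = sz[root(id, x)];
        // d <= lg(w) is equivalent to 2^d <= w for an integer d
        if ((1L << d) > w) return -1;
        maxDepth = Math.max(maxDepth, d);
      }
    }
    return maxDepth;
  }

  static int root(int[] id, int i)
  {
    while (i != id[i]) i = id[i];
    return i;
  }

  // Number of links between i and the root of its tree.
  static int depth(int[] id, int i)
  {
    int d = 0;
    while (i != id[i]) { i = id[i]; d++; }
    return d;
  }
}
```

For n = 64 the returned depth can never exceed lg(64) = 6, whatever the seed, which is what the argument above predicts.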

Answer 2

In the non-modified version, you get the linear dependency because the links to the parent can be arbitrary. So, in the worst case, you might end up with a single long list (i.e., you have to traverse all the other elements if you start at the end of the list).
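One concrete sequence that produces such a list is union(0, 1), union(0, 2), ..., union(0, n-1) on the unweighted code from the question. The sketch below (class and method names are hypothetical) replays that sequence and measures how far element 0 ends up from its root.

```java
// Sketch: a union sequence that drives plain quick-union into a single chain.
class QuickUnionChainDemo
{
  // Depth of element 0 after union(0,1), union(0,2), ..., union(0,n-1)
  // using the unweighted rule id[root(p)] = root(q).
  static int depthOfZeroAfterChain(int n)
  {
    int[] id = new int[n];
    for (int i = 0; i < n; i++) id[i] = i;
    for (int q = 1; q < n; q++)
    {
      int i = root(id, 0);
      int j = root(id, q);
      id[i] = j;  // the whole existing tree goes under the fresh single node
    }
    int depth = 0;
    for (int x = 0; x != id[x]; x = id[x]) depth++;
    return depth;
  }

  static int root(int[] id, int i)
  {
    while (i != id[i]) i = id[i];
    return i;
  }
}
```

For n = 8 this builds the chain 0 → 1 → ... → 7, so element 0 is 7 links away from the root, and each further root(0) call walks the whole chain.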

The modification (weighting, also known as union by size and closely related to union by rank) aims at producing shallower trees by making the smaller subtree a child of the root of the larger subtree. This heuristic makes the trees much more balanced, i.e., the length of the path from any leaf to its root becomes O(log n). Remember that the height of a perfectly balanced binary tree with k nodes is O(log k).
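To see the effect of the size rule on a small case, the sketch below replays the sequence union(0, 1), union(0, 2), ..., union(0, n-1) with weighting and reports the depth of the deepest element. The class and method names are illustrative assumptions, and sz[] is assumed to start at 1 for every element.

```java
// Sketch: the same kind of union sequence, but with the smaller tree
// always linked below the root of the larger one.
class WeightedChainDemo
{
  // Largest depth of any element after union(0,1), ..., union(0,n-1).
  static int maxDepthAfterChain(int n)
  {
    int[] id = new int[n];
    int[] sz = new int[n];
    for (int i = 0; i < n; i++) { id[i] = i; sz[i] = 1; }
    for (int q = 1; q < n; q++)
    {
      int i = root(id, 0);
      int j = root(id, q);
      if (i == j) continue;
      if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
      else               { id[j] = i; sz[i] += sz[j]; }
    }
    int max = 0;
    for (int x = 0; x < n; x++)
    {
      int d = 0;
      for (int y = x; y != id[y]; y = id[y]) d++;
      max = Math.max(max, d);
    }
    return max;
  }

  static int root(int[] id, int i)
  {
    while (i != id[i]) i = id[i];
    return i;
  }
}
```

Every new single-node tree is hung directly under the root of the larger tree, so the result is a star: for any n >= 2 the deepest element is one link away from the root.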

For a more formal proof, please refer to existing literature.

Additional note: I called this rule a heuristic. Ideally, you would want to make the decision based on the height of both subtrees. However, keeping track of the exact height is more awkward (for instance, once paths are also compressed), so the size of the subtree, which correlates with its height, is usually used instead.

Answer 3

Weighted union preserves the invariant that for every tree with height h and size n, h ≤ log(n) + 1 (log is base 2, and height is counted in nodes).

This is trivially true for the initial set of trees, with n = 1 and h = 1.

Let's say we merge two trees with heights h₁, h₂ and sizes n₁, n₂, where n₁ ≥ n₂.

Weighted union ensures that the new height is max(h₁, h₂ + 1), i.e. either h₁ or h₂ + 1, and that the new size is n₁ + n₂. In both cases, the invariant is preserved:

h₁ ≤ log(n₁) + 1 ⇒ h₁ ≤ log(n₁+n₂) + 1

and

h₂ ≤ log(n₂) + 1 ⇒ h₂ + 1 ≤ log(n₂) + 2 ⇒ h₂ + 1 ≤ log(2n₂) + 1 ⇒ h₂ + 1 ≤ log(n₁+n₂) + 1

because n₁ ≥ n₂.
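The bound is actually reached, so it cannot be improved in general. The sketch below (names are illustrative; height is counted in nodes, as in this answer) starts from 2^k single-node trees and always merges two trees of equal size, which is the case where the height grows on every merge.

```java
// Sketch: merging equal-size trees pairwise makes the height hit log(n) + 1.
class TightBoundDemo
{
  // Builds 2^k elements, merges equal-size trees pairwise, and returns the
  // height of the final tree counted in nodes (a single node has height 1).
  static int heightAfterPairwiseMerges(int k)
  {
    int n = 1 << k;
    int[] id = new int[n];
    int[] sz = new int[n];
    for (int i = 0; i < n; i++) { id[i] = i; sz[i] = 1; }
    // Round s joins the tree starting at p with the tree starting at p + s.
    for (int s = 1; s < n; s *= 2)
      for (int p = 0; p + s < n; p += 2 * s)
        union(id, sz, p, p + s);
    int height = 0;
    for (int x = 0; x < n; x++)
    {
      int nodes = 1;
      for (int y = x; y != id[y]; y = id[y]) nodes++;
      height = Math.max(height, nodes);
    }
    return height;
  }

  static int root(int[] id, int i)
  {
    while (i != id[i]) i = id[i];
    return i;
  }

  static void union(int[] id, int[] sz, int p, int q)
  {
    int i = root(id, p);
    int j = root(id, q);
    if (i == j) return;
    if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
    else               { id[j] = i; sz[i] += sz[j]; }
  }
}
```

For k = 3 (n = 8) the final tree has height 4 = log(8) + 1: element 7 reaches the root through 6 and 4.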
