Why does this code for checking whether a binary tree is balanced take O(n log n) time when it recomputes depths repeatedly?

Question

This code is meant to check whether a binary tree is balanced (balanced being defined as a tree in which the heights of the two subtrees of any node never differ by more than one).

I understand the N part of the runtime O(NlogN). The N is because every node in the tree is visited at least once.

int getHeight(TreeNode root){
    if(root==null) return -1; //Base case
    return Math.max(getHeight(root.left), getHeight(root.right))+1; 
}

boolean isBalanced(TreeNode root){
    if(root == null) return true; //Base case

    int heightDiff = getHeight(root.left) - getHeight(root.right);

    if(Math.abs(heightDiff) > 1){
        return false;
    } else{ //Recurse
        return isBalanced(root.left) && isBalanced(root.right);
    }
}

What I don't understand is the logN part of the runtime O(NlogN). The code will trace every possible path from a node to the bottom of the tree. Shouldn't the runtime therefore be more like N2^N or something? How does one come to the conclusion, step by step, that the runtime is O(NlogN)?

Answer 1

I agree with you that the runtime of this code is not necessarily O(n log n). However, I don't believe that it will always trace out every path from a node to the bottom of the tree. For example, consider this tree:

Here, computing the depths of the left and right subtrees will indeed visit every node once. However, because an imbalance is found between the left and right subtrees, the recursion stops without recursively exploring the left subtree. In other words, finding an example where the recursion has to do a lot of work is going to require some creativity.

You are correct that the baseline check for the height difference will take time Θ(n) because every node must be scanned. The concern with this code is that it might rescan nodes many, many times as it recomputes the height differences during the recursion. If we want this function to run for a really long time (not necessarily as long as possible, but for a long time), we'd want to make it so that

the left and right subtrees have roughly the same height, so that the recursion proceeds to the left subtree, but
the tree is extremely imbalanced, placing most of the nodes into the left subtree.

One way to do this is to create trees where the right subtree is just a long spine that happens to have the same height as the left subtree, but with way fewer nodes. Here's one possible sequence of trees that has this property:

                              *
                             / \
                *           *   *
               / \         / \   \
      *       *   *       *   *   *
     / \     / \   \     / \   \   \
*   *   *   *   *   *   *   *   *   *

Mechanically, each tree is formed by taking the previous tree and putting a rightward spine on top of it. Operationally, these trees are defined recursively as follows:

An order-0 tree is a single node.
An order-(k+1) tree is a node whose left child is an order-k tree and whose right child is a linked list of height k.

Notice that the number of nodes in an order-k tree is Θ(k²). You can see this by noticing that the trees have a nice triangular shape, where each layer has one more node in it than the previous one. Sums of the form 1 + 2 + 3 + ... + k work out to Θ(k²), and while we can be more precise than this, there really isn't a need to do so.
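To make the construction concrete, here is a minimal Java sketch that builds an order-k tree from the recursive definition and counts its nodes. The class name `SpineTrees` and the helpers `spine`, `order`, and `countNodes` are names I made up for illustration, and the bare `TreeNode` is an assumed stand-in for the one in the question.

```java
class SpineTrees {
    // Assumed minimal node type; the question's TreeNode may carry more fields.
    static class TreeNode {
        TreeNode left, right;
    }

    // Right-leaning chain ("linked list") of the given height, i.e. height + 1 nodes.
    static TreeNode spine(int height) {
        TreeNode head = null;
        for (int i = 0; i <= height; i++) {
            TreeNode node = new TreeNode();
            node.right = head;
            head = node;
        }
        return head;
    }

    // Order-0 is a single node; order-(i+1) puts an order-i tree on the left
    // and a spine of height i on the right.
    static TreeNode order(int k) {
        TreeNode root = new TreeNode();
        for (int i = 0; i < k; i++) {
            TreeNode parent = new TreeNode();
            parent.left = root;
            parent.right = spine(i);
            root = parent;
        }
        return root;
    }

    static int countNodes(TreeNode node) {
        if (node == null) return 0;
        return 1 + countNodes(node.left) + countNodes(node.right);
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 5; k++) {
            System.out.println("order " + k + ": " + countNodes(order(k)) + " nodes");
        }
    }
}
```

The counts follow the triangular numbers (k+1)(k+2)/2, which is where the Θ(k²) figure comes from.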

Now, what happens if we fire off this recursion on the root of any one of these trees? Well, the recursion will begin by computing the heights of the left and right subtrees, which will report that they have the same height as one another. It will then recursively explore the left subtree to see whether it's balanced. After doing some (large) amount of work, it'll find that the left subtree is not balanced, at which point the recursion won't branch to the right subtree. In other words, the amount of work done on an order-k tree is lower-bounded by

W(0) = 1 (there's a single node visited once), and
W(k+1) = W(k) + Θ(k²).

To see where the recurrence for W(k+1) comes from, notice that we begin by scanning every node in the tree, of which there are Θ(k²), and then recursively apply the procedure to the left subtree. Expanding this recurrence, we see that in an order-k tree, the total work done is

W(k) = Θ(k²) + W(k - 1)

     = Θ(k² + (k - 1)²) + W(k - 2)

     = Θ(k² + (k - 1)² + (k - 2)²) + W(k - 3)

     ...

     = Θ(k² + (k - 1)² + ... + 2² + 1²)

     = Θ(k³).

This last step follows from the fact that the sum of the first k squares, 1² + 2² + ... + k² = k(k + 1)(2k + 1)/6, works out to Θ(k³).

To finish things off, we have one more step. We've shown that order-k trees require Θ(k³) total work to process with this recursive algorithm. However, we'd like a runtime bound in terms of n, the total number of nodes in the tree, not k, the order of the tree. Using the fact that the number of nodes in a tree of order k is Θ(k²), we see that a tree with n nodes has order Θ(n^(1/2)). Plugging this in, we see that for arbitrarily large n, we can make the total work done equal to Θ((n^(1/2))³) = Θ(n^(3/2)), which exceeds the proposed O(n log n) bound you mentioned. I'm not sure whether this is the worst-case input for this algorithm, but it's certainly not a good one.
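If you'd like to check the growth rate empirically, the sketch below runs the question's two methods on order-k trees and counts how many non-null nodes getHeight touches. The `visits` counter, the class name `RecomputeCost`, and the helpers `spine`, `order`, and `countVisits` are my own additions for illustration; only getHeight and isBalanced come from the question (with the counter added).

```java
class RecomputeCost {
    // Assumed minimal node type standing in for the question's TreeNode.
    static class TreeNode {
        TreeNode left, right;
    }

    static long visits; // number of non-null nodes touched by getHeight

    static int getHeight(TreeNode root) {
        if (root == null) return -1;
        visits++;
        return Math.max(getHeight(root.left), getHeight(root.right)) + 1;
    }

    static boolean isBalanced(TreeNode root) {
        if (root == null) return true;
        int heightDiff = getHeight(root.left) - getHeight(root.right);
        if (Math.abs(heightDiff) > 1) return false;
        return isBalanced(root.left) && isBalanced(root.right);
    }

    // Right-leaning chain of the given height (height + 1 nodes).
    static TreeNode spine(int height) {
        TreeNode head = null;
        for (int i = 0; i <= height; i++) {
            TreeNode node = new TreeNode();
            node.right = head;
            head = node;
        }
        return head;
    }

    // Order-k tree as defined in the answer.
    static TreeNode order(int k) {
        TreeNode root = new TreeNode();
        for (int i = 0; i < k; i++) {
            TreeNode parent = new TreeNode();
            parent.left = root;
            parent.right = spine(i);
            root = parent;
        }
        return root;
    }

    // Resets the counter, runs the check, and reports the visit count.
    static long countVisits(TreeNode root) {
        visits = 0;
        isBalanced(root);
        return visits;
    }

    public static void main(String[] args) {
        for (int k = 50; k <= 400; k *= 2) {
            long n = (long) (k + 1) * (k + 2) / 2;
            long v = countVisits(order(k));
            System.out.printf("k=%d n=%d visits=%d visits/n^1.5=%.3f%n",
                    k, n, v, v / Math.pow(n, 1.5));
        }
    }
}
```

If the analysis above is right, the last column should settle toward a constant as k grows, whereas visits divided by n log n would keep increasing.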

So yes, you are correct: the runtime is not O(n log n) in general. However, if the tree is perfectly balanced, the runtime is indeed O(n log n). To see why, notice that if the tree is perfectly balanced, each recursive call will

do O(n) work scanning each node in the tree, then
make two recursive calls on smaller trees, each of which is approximately half as large as the previous one.

That gives the recurrence T(n) = 2T(n / 2) + O(n), which solves to O(n log n). But that's just one specific case, not the general case.
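For the perfectly balanced case, here is a small Java sketch that counts getHeight visits on perfect binary trees and compares the count against n log₂ n. As before, the `visits` counter, the class name `PerfectTreeCost`, and the helpers `perfect` and `countVisits` are hypothetical names of mine, not part of the original code.

```java
class PerfectTreeCost {
    // Assumed minimal node type standing in for the question's TreeNode.
    static class TreeNode {
        TreeNode left, right;
    }

    static long visits; // number of non-null nodes touched by getHeight

    static int getHeight(TreeNode root) {
        if (root == null) return -1;
        visits++;
        return Math.max(getHeight(root.left), getHeight(root.right)) + 1;
    }

    static boolean isBalanced(TreeNode root) {
        if (root == null) return true;
        int heightDiff = getHeight(root.left) - getHeight(root.right);
        if (Math.abs(heightDiff) > 1) return false;
        return isBalanced(root.left) && isBalanced(root.right);
    }

    // Perfect binary tree of the given height: 2^(height + 1) - 1 nodes.
    static TreeNode perfect(int height) {
        if (height < 0) return null;
        TreeNode node = new TreeNode();
        node.left = perfect(height - 1);
        node.right = perfect(height - 1);
        return node;
    }

    // Resets the counter, runs the check, and reports the visit count.
    static long countVisits(TreeNode root) {
        visits = 0;
        isBalanced(root);
        return visits;
    }

    public static void main(String[] args) {
        for (int h = 4; h <= 16; h += 4) {
            long n = (1L << (h + 1)) - 1;
            long v = countVisits(perfect(h));
            double nLogN = n * (Math.log(n) / Math.log(2));
            System.out.printf("height=%d n=%d visits=%d visits/(n log2 n)=%.3f%n",
                    h, n, v, v / nLogN);
        }
    }
}
```

Here each call scans its whole subtree and then recurses into both halves, so the visit count satisfies V(h) = (n - 1) + 2V(h - 1), the same shape as T(n) = 2T(n / 2) + O(n).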

A concluding note: with a minor modification, this code can be made to run in time O(n) in all cases. Instead of recomputing the height of each subtree, make an initial pass over the tree and annotate each node with its height, that is, the depth of the subtree rooted at it (either by setting some internal field equal to the height or by having an auxiliary HashMap mapping each node to its height). This can be done in time O(n). From there, recursively walking the tree and checking whether the left and right subtrees have heights that differ by at most one requires O(1) work per node across n total nodes, for a total runtime of O(n).
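A minimal sketch of that two-pass idea, assuming you're free to add a `height` field to the node class (the field, the class name `LinearBalanceCheck`, and the helpers `annotate`, `heightOf`, and `checkAnnotated` are all hypothetical names for this example; a map from node to height would work the same way):

```java
class LinearBalanceCheck {
    static class TreeNode {
        TreeNode left, right;
        int height; // hypothetical extra field, filled in by annotate()
    }

    // Pass 1: post-order walk that stores each node's height.
    // An empty subtree has height -1, matching getHeight in the question.
    static int annotate(TreeNode node) {
        if (node == null) return -1;
        node.height = Math.max(annotate(node.left), annotate(node.right)) + 1;
        return node.height;
    }

    static int heightOf(TreeNode node) {
        return node == null ? -1 : node.height;
    }

    // Pass 2: the same check as the original isBalanced, but each height
    // lookup is O(1) instead of a fresh traversal.
    static boolean checkAnnotated(TreeNode node) {
        if (node == null) return true;
        if (Math.abs(heightOf(node.left) - heightOf(node.right)) > 1) return false;
        return checkAnnotated(node.left) && checkAnnotated(node.right);
    }

    static boolean isBalanced(TreeNode root) {
        annotate(root);
        return checkAnnotated(root);
    }

    public static void main(String[] args) {
        TreeNode root = new TreeNode();
        root.left = new TreeNode();
        root.right = new TreeNode();
        System.out.println(isBalanced(root));  // three-node balanced tree

        TreeNode chain = new TreeNode();
        chain.right = new TreeNode();
        chain.right.right = new TreeNode();
        System.out.println(isBalanced(chain)); // right-leaning chain of three nodes
    }
}
```

Each pass touches every node once, so the total is O(n) regardless of the tree's shape.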

Hope this helps!
