Java TreeSet 奇怪的行为

Question

我正在尝试使用平衡 BST 而不是优先级队列来实现 Prim 的最小生成树算法。 我的实现在 Java 中。 由于 Java 已经以 TreeSet 的形式实现了红黑树的库，我想使用它而不是拥有自己的自定义实现。

Prim 使用 Min Priority Queue 的典型实现需要 O(ElogE)，而我在此实现背后的意图是将时间复杂度降低到 O(ElogV)。 我相信这也可以使用索引优先级队列（IPQ）来完成，但我选择了 BST 版本，因为有可用的自平衡 BST 库实现（在 Java 和 C++ 中）。

对于我测试过这个实现的例子，它似乎工作正常，并且产生了正确的结果。 但是当我进行更深入的检查以确保 TC 实际上是 O(ElogV) 时，我发现 Java TreeSet 出于某种原因对我来说表现得很奇怪。

这是我的实现：

    package graph;

    import java.util.Comparator;
    import java.util.TreeSet;

    /**
     * Implementation of Prim's algorithm (eager version) to
     * find Minimum Spanning Tree using a self-balancing BST
     * Time Complexity: O(ElogV)
     * 
     * This implementation uses a self-balancing BST (specifically Red-Black Tree)
     * 
     * We can do eager Prim's implementation using an Indexed Priority Queue (IPQ) as well
     * 
     * Comparison: IPQ vs BST
     * To get next best edge in IPQ, we pop the min element from root, and 
     * then heapify the tree, which overall takes O(lgn). To get next best edge in 
     * BST, it would take O(lgn) as well, and then we’ll have to delete that entry 
     * which would take another O(lgn), but overall it is still O(lgn)
     * 
     * Insertion into both BST and IPQ takes O(lgn) anyway
     * 
     * Update in IPQ takes O(lgn). Update in BST as well can be done in 
     * O(lgn) [search the element in O(lgn) then delete that entry in another 
     * O(lgn) and then insert new entry with updated edge weight (and source node) 
     * in yet another O(lgn). Intotal, it takes 3*logn but overall still O(lgn)]
     *
     */
    public class PrimsMstUsingSelfBalancingBST extends Graph {

        private int n, m, edgeCount;
        private boolean[] visited;
        private Edge[] mst;
        private double cost;
        private TreeSet<Edge> sortedSet;

        public PrimsMstUsingSelfBalancingBST(int numberOfVertices) {
            super(numberOfVertices);
            n = numberOfVertices;
        }

        public Double mst(int s) {
            m = n - 1; // number of expected edges in mst
            edgeCount = 0;
            mst = new Edge[m];
            visited = new boolean[n];
            sortedSet = new TreeSet<>(getComparator());

            relaxEdgesAtNode(s);

            while (!sortedSet.isEmpty() && edgeCount != m) {
                System.out.println(sortedSet);
                // pollFirst() retrieves and removes smallest element from TreeSet
                Edge edge = sortedSet.pollFirst();
                int nodeIndex = edge.to;

                // skip edges pointing to already visited nodes
                if (visited[nodeIndex]) continue;

                mst[edgeCount] = edge;
                edgeCount++;
                cost += edge.wt;

                relaxEdgesAtNode(nodeIndex);
            }

            return (edgeCount == m) ? cost : null;
        }

        private void relaxEdgesAtNode(int index) {
            visited[index] = true;

            for (Edge edge : adjList.get(index))  {
                int to = edge.to;

                if (visited[to]) continue;

                if (!sortedSet.contains(edge)) {
                    sortedSet.add(edge);
                }
                else {
                    Edge existingEdge = search(edge);
                    if (existingEdge.wt > edge.wt) {
                        sortedSet.remove(existingEdge);
                        sortedSet.add(edge);
                    }
                }
            }
        }

        private Comparator<Edge> getComparator() {
            return new Comparator<Edge>() {
                @Override
                public int compare(Edge e1, Edge e2) {
                    // Java TreeSet is implemented in a way that it uses 
                    // Comparable/Comparator logics for all comparisons.

                    // i.e., it will use this comparator to do comparison 
                    // in contains() method.

                    // It will use this same comparator to do comparison 
                    // during remove() method.

                    // It will also use this same comparator, to perform 
                    // sorting during add() method.

                    // While looking up an edge from contains() or remove(), 
                    // we need to perform check based on destinationNodeIndex,

                    // But to keep elements in sorted order during add() operation
                    // we need to compare elements based on their edge weights

                    // For correct behavior of contains() and remove()
                    if (e1.to == e2.to) return 0;

                    // For correct sorting behavior
                    if (e1.wt > e2.wt) return 1;
                    else if (e1.wt < e2.wt) return -1;

                    // Return -1 or 1 to make sure that different edges with equal 
                    // weights are not ignored by TreeSet.add()
                    // this check can be included in either 'if' or 'else' part 
                    // above. Keeping this separate for readability.
                    return -1;
                }
            };
        }

        // O(log n) search in TreeSet
        private Edge search(Edge e) {
            Edge ceil  = sortedSet.ceiling(e); // smallest element >= e
            Edge floor = sortedSet.floor(e);   // largest element <= e

            return ceil.equals(floor) ? ceil : null; 
        }

        public static void main(String[] args) {
            example1();
        }

        private static void example1() {
            int n = 8;
            PrimsMstUsingSelfBalancingBST graph = 
                    new PrimsMstUsingSelfBalancingBST(n);

            graph.addEdge(0, 1, true, 10);
            graph.addEdge(0, 2, true, 1);
            graph.addEdge(0, 3, true, 4);
            graph.addEdge(2, 1, true, 3);
            graph.addEdge(2, 5, true, 8);
            graph.addEdge(2, 3, true, 2);
            graph.addEdge(3, 5, true, 2);
            graph.addEdge(3, 6, true, 7);
            graph.addEdge(5, 4, true, 1);
            graph.addEdge(5, 7, true, 9);
            graph.addEdge(5, 6, true, 6);
            graph.addEdge(4, 1, true, 0);
            graph.addEdge(4, 7, true, 8);
            graph.addEdge(6, 7, true, 12);

            int s = 0;
            Double cost = graph.mst(s);
            if (cost != null) {
                System.out.println(cost); // 20.0
                for (Edge e : graph.mst)
                    System.out.println(String.format("Used edge (%d, %d) with cost: %.2f", e.from, e.to, e.wt));
                /*
                 * Used edge (0, 2) with cost: 1.00
                 * Used edge (2, 3) with cost: 2.00
                 * Used edge (3, 5) with cost: 2.00
                 * Used edge (5, 4) with cost: 1.00
                 * Used edge (4, 1) with cost: 0.00
                 * Used edge (5, 6) with cost: 6.00
                 * Used edge (4, 7) with cost: 8.00
                 */
            }
            else {
                System.out.println("MST not found!");
            }
        }
    }

下面是我正在测试的无向加权图（代码中也使用了相同的示例）

我面临的问题是 TreeSet 似乎正在添加重复条目，因为它的 contains() 方法有时会返回 false，即使具有相同键的相应条目（在这种情况下为边的目标节点）已经存在。

下面是上述程序的output：

[{from=0, to=2, weight=1.00}, {from=0, to=3, weight=4.00}, {from=0, to=1, weight=10.00}]
[{from=2, to=3, weight=2.00}, {from=2, to=1, weight=3.00}, {from=2, to=5, weight=8.00}, {from=0, to=1, weight=10.00}]
[{from=3, to=5, weight=2.00}, {from=2, to=1, weight=3.00}, {from=3, to=6, weight=7.00}, {from=0, to=1, weight=10.00}]
[{from=5, to=4, weight=1.00}, {from=2, to=1, weight=3.00}, {from=5, to=6, weight=6.00}, {from=5, to=7, weight=9.00}, {from=0, to=1, weight=10.00}]
[{from=4, to=1, weight=0.00}, {from=5, to=6, weight=6.00}, {from=4, to=7, weight=8.00}, {from=0, to=1, weight=10.00}]
[{from=5, to=6, weight=6.00}, {from=4, to=7, weight=8.00}, {from=0, to=1, weight=10.00}]
[{from=4, to=7, weight=8.00}, {from=0, to=1, weight=10.00}]
20.0
Used edge (0, 2) with cost: 1.00
Used edge (2, 3) with cost: 2.00
Used edge (3, 5) with cost: 2.00
Used edge (5, 4) with cost: 1.00
Used edge (4, 1) with cost: 0.00
Used edge (5, 6) with cost: 6.00
Used edge (4, 7) with cost: 8.00

可以清楚地看到，即使目标节点 1 已经有一个条目为 {from=0, to=1, weight=10.00}) [line1 of output]，它也会为其添加另一个条目为 {from=2 , to=1, weight=3.00} [line2 of output] 而不是更新现有条目。

当我通过在自定义比较器中添加断点来调试它时，我注意到从未为现有条目调用比较器，因此没有发生与现有条目的比较。 例如在这种情况下，在处理边 {from=2, to=1, weight=3.00} Comparator.compare() 被调用条目 {from=2, to=3, weight=2.00} 和 {from=2, to=5, weight=8.00} 但没有为入口 {from=0, to=1, weight=10.00} 调用，因此它得出结论没有目标节点 1 的入口，因此它添加了一个新的条目，因此我得到目标节点 1 [line2 of output] 的两个条目

我怀疑这与 Java Collections 框架中的对象的不变性和并发修改限制有关。 但我无法理解问题的根本原因。

任何帮助表示赞赏。

Answer 1

You Comparator违反了其合同，例如

实现者必须确保所有x和y的sgn(compare(x, y)) == -sgn(compare(y, x)) 。 （这意味着当且仅当compare(y, x)抛出异常时compare(x, y)必须抛出异常。）

这是compare方法，没有所有注释：

public int compare(Edge e1, Edge e2) {
    if (e1.to == e2.to) return 0;

    if (e1.wt > e2.wt) return 1;
    else if (e1.wt < e2.wt) return -1;

    return -1;
}

例如，您有两个权重为 1 的边：

a = {from=0, to=2, weight=1.00}
b = {from=5, to=4, weight=1.00}

由于它们的to值不同，但wt值相同，因此无论参数顺序如何，该方法都会返回-1 ，即compare(a, b) = -1和compare(b, a) = -1 。

这违反了上面列出的规则，并将导致不可预知的行为。

Java TreeSet 奇怪的行为

问题描述

1 个解决方案

解决方案1
2 已采纳 2020-05-10 10:36:58

Java TreeSet 奇怪的行为

问题描述

1 个解决方案

解决方案1 2 已采纳 2020-05-10 10:36:58

解决方案1
2 已采纳 2020-05-10 10:36:58