
Minimum Spanning Tree (MST) algorithm variation

I was asked the following question in an interview and I am unable to find an efficient solution.

Here is the problem:

  • We want to build a network and we are given c nodes/cities and D possible edges/connections made by roads. Edges are bidirectional and we know the cost of each edge. The cost of an edge can be represented as d[i,j], which denotes the cost of the edge ij. Note that not all c nodes can be directly connected to each other (D is the set of possible edges).

  • Now we are given a list of k potential edges/connections that have no cost. However, you can only choose one edge in the list of k edges to use (like getting free funding to build an airport between two cities).

So the question is: find the set of roads (and the one free airport) that minimizes the total cost required to build the network connecting all cities, with an efficient runtime.

So in short, solve a minimum spanning tree problem, but one where you can choose 1 edge from a list of k potential edges to be free of cost. I'm unsure how to solve it. I've tried finding all the spanning trees in order of increasing cost and choosing the lowest cost, but I'm still stuck on how to account for the one free edge from the list of k potential free edges. I've also tried finding the MST of the D potential connections and then adjusting it according to the options in k to get a result.

Thank you for any help!

First generate an MST. Now, if you add a free edge, you will create exactly one cycle. You could then remove the heaviest edge in the cycle to get a cheaper tree.

To find the best tree you can make by adding one free edge, you need to find the heaviest edge in the MST that you could replace with a free one.

You can do that by testing one free edge at a time:

  1. Pick a free edge
  2. Find the lowest common ancestor, in the tree (rooted at an arbitrary node), of the free edge's two endpoints
  3. Remember the heaviest edge on the tree path between the free edge's endpoints

When you're done, you know which free edge to use -- it's the one associated with the heaviest tree edge, and you know which edge it replaces.

In order to make steps (2) and (3) faster, you can remember the depth of each node and connect it to multiple ancestors like a skip list. You can then do those steps in O(log |V|) time per free edge, leading to a total complexity of O( (|E|+k) log |V| ), which is pretty good.
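One way to sketch the "multiple ancestors" idea is binary lifting: each node stores its 1st, 2nd, 4th, ... ancestor together with the heaviest edge cost on that jump. The code below is only an illustration under some assumptions not stated in the answer: nodes are numbered 0..n-1, costs are non-negative, the MST has already been computed and is passed in as (u, v, cost) triples, and the names `build_lifting`, `path_max` and `best_free_edge` are made up for this example.

```python
def build_lifting(n, tree_edges, root=0):
    """Preprocess a tree on nodes 0..n-1 given as (u, v, cost) triples."""
    log = max(1, (n - 1).bit_length())
    adj = [[] for _ in range(n)]
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    depth = [0] * n
    # up[j][v]: the 2**j-th ancestor of v (the root acts as its own ancestor)
    up = [[root] * n for _ in range(log)]
    # heavy[j][v]: heaviest edge cost on that jump (0 once the root is reached)
    heavy = [[0] * n for _ in range(log)]
    seen = [False] * n
    seen[root] = True
    stack = [root]
    while stack:  # iterative DFS to set depth, parent and parent-edge cost
        u = stack.pop()
        for v, w in adj[u]:
            if not seen[v]:
                seen[v] = True
                depth[v] = depth[u] + 1
                up[0][v], heavy[0][v] = u, w
                stack.append(v)
    for j in range(1, log):
        for v in range(n):
            mid = up[j - 1][v]
            up[j][v] = up[j - 1][mid]
            heavy[j][v] = max(heavy[j - 1][v], heavy[j - 1][mid])
    return depth, up, heavy


def path_max(u, v, depth, up, heavy):
    """Heaviest edge cost on the tree path between u and v (steps 2 and 3)."""
    best = 0
    if depth[u] < depth[v]:
        u, v = v, u
    diff, j = depth[u] - depth[v], 0
    while diff:  # lift the deeper endpoint to the depth of the other one
        if diff & 1:
            best = max(best, heavy[j][u])
            u = up[j][u]
        diff >>= 1
        j += 1
    if u == v:
        return best
    for j in reversed(range(len(up))):  # lift both to just below the LCA
        if up[j][u] != up[j][v]:
            best = max(best, heavy[j][u], heavy[j][v])
            u, v = up[j][u], up[j][v]
    return max(best, heavy[0][u], heavy[0][v])


def best_free_edge(n, tree_edges, free_edges):
    """Return (new total cost, chosen free edge) for an already-built MST."""
    depth, up, heavy = build_lifting(n, tree_edges)
    total = sum(w for _, _, w in tree_edges)
    saving, pick = max((path_max(u, v, depth, up, heavy), (u, v))
                       for u, v in free_edges)
    return total - saving, pick
```

For example, with the MST edges (0, 1, 1), (2, 3, 2), (1, 2, 4) and free candidates (0, 1) and (0, 3), `best_free_edge` picks (0, 3), because it replaces the cost-4 edge and brings the total from 7 down to 3.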

EDIT: Even Easier Way

After thinking about this a bit, it seems there's a super easy way to figure out which free edge to use and which MST edge to replace.

Disregarding the k possible free edges, you build the MST from the other edges using Kruskal's algorithm, but you modify the usual disjoint set data structure as follows:

  • Use union by size or rank, but not path compression. Every union operation will then establish exactly one link and take O(log N) time, and all paths will be at most O(log N) long.
  • For each link, remember the index (in Kruskal's sorted-by-cost order) of the edge that caused it to be created.

For each possible free edge, then, you can walk up the links in the disjoint set structure to find out exactly at which point its endpoints were connected into the same connected component. You get the index of the last required edge, i.e., the one it would replace, and the free edge with the greatest replacement target index is the one you should use.
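A minimal sketch of this variant follows, assuming nodes are numbered 0..n-1, edges are (u, v, cost) triples, and the roads alone connect all cities. The function names are hypothetical. The walk relies on link indices increasing toward the root, so always stepping up from the endpoint with the smaller link index stops exactly where the two endpoints meet.

```python
INF = float("inf")


def kruskal_with_links(n, edges):
    """Kruskal with union by size and NO path compression.

    link[x] is the index (in sorted order) of the edge whose union attached
    x to parent[x]; roots keep link[x] == INF.
    """
    order = sorted(edges, key=lambda e: e[2])
    parent = list(range(n))
    size = [1] * n
    link = [INF] * n

    def root(x):
        while parent[x] != x:
            x = parent[x]
        return x

    total = 0
    for idx, (u, v, w) in enumerate(order):
        ru, rv = root(u), root(v)
        if ru == rv:
            continue
        if size[ru] > size[rv]:
            ru, rv = rv, ru
        parent[ru] = rv  # exactly one new link per successful union
        size[rv] += size[ru]
        link[ru] = idx
        total += w
    return order, parent, link, total


def replaced_edge_index(u, v, parent, link):
    """Index of the edge that first put u and v in the same component."""
    last = -1
    while u != v:
        if link[u] > link[v]:
            u, v = v, u
        if link[u] == INF:
            raise ValueError("endpoints are not connected by roads")
        last = link[u]  # indices only grow along the walk, so this is the max
        u = parent[u]
    return last


def pick_free_edge(n, edges, free_edges):
    """Return (new total cost, chosen free edge, MST edge it replaces)."""
    order, parent, link, total = kruskal_with_links(n, edges)
    idx, pick = max((replaced_edge_index(u, v, parent, link), (u, v))
                    for u, v in free_edges)
    return total - order[idx][2], pick, order[idx]
```

Because the edges are sorted by cost, the greatest replacement index also means the greatest cost, which is why comparing indices is enough.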

One idea would be to treat your favorite MST algorithm as a black box and to think about changing the edges in the graph before asking for the MST. For example, you could try something like this:

for each edge in the list of possible free edges:
    make the graph G' formed by setting that edge cost to 0.
    compute the MST of G'

return the cheapest MST out of all the ones generated this way

The runtime of this approach is O(kT(m, n)), where k is the number of edges to test and T(m, n) is the cost of computing an MST using your favorite black-box algorithm.
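As a rough illustration of the black-box approach, the sketch below uses a plain Kruskal as the "favorite" MST routine. It assumes nodes 0..n-1 and (u, v, cost) triples. A free edge is modeled by appending a zero-cost copy of it, which also covers free edges that are not among the roads. The helper names are invented for this example.

```python
def mst_cost(n, edges):
    """Black-box MST cost via Kruskal; None if the graph is disconnected."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total, used = 0, 0
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
            used += 1
    return total if used == n - 1 else None


def cheapest_with_one_free_edge(n, edges, free_edges):
    """Try every free edge in turn; return (best cost, free edge) or None."""
    best = None
    for u, v in free_edges:
        cost = mst_cost(n, list(edges) + [(u, v, 0)])
        if cost is not None and (best is None or cost < best[0]):
            best = (cost, (u, v))
    return best
```

This is slow for large k, but it makes a handy reference implementation for checking faster solutions on small inputs.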

We can do better than this. There's a well-known problem of the following form:

Suppose you have an MST T for a graph G. You then reduce the cost of some edge {u, v}. Find an MST T' in the new graph G'.

There are many algorithms for solving this problem efficiently. Here's one:

Run a DFS in T starting at u until you find v.
If the heaviest edge on the path found this way costs more than {u, v}:
   Delete that edge.
   Add {u, v} to the spanning tree.
Return the resulting tree T'.

(Proving that this works is tedious but doable.) This would give an algorithm of cost O(T(m, n) + kn), since you would be building an initial MST (time T(m, n)), then doing k runs of DFS in a tree with n nodes.
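The update step above could look roughly like this in Python. It assumes the MST is given as a list of (u, v, cost) triples, and the function names are hypothetical. The DFS is iterative and carries the heaviest edge seen so far along each branch.

```python
from collections import defaultdict


def heaviest_on_path(tree_edges, u, v):
    """DFS from u until v is found; return the heaviest edge on that path."""
    adj = defaultdict(list)
    for edge in tree_edges:
        a, b, _ = edge
        adj[a].append((b, edge))
        adj[b].append((a, edge))
    stack = [(u, None, None)]  # (node, parent, heaviest edge so far)
    while stack:
        node, par, heavy = stack.pop()
        if node == v:
            return heavy
        for nxt, edge in adj[node]:
            if nxt != par:
                keep = edge if heavy is None or edge[2] > heavy[2] else heavy
                stack.append((nxt, node, keep))
    return None  # v is not reachable from u in this tree


def reduce_edge(tree_edges, u, v, new_cost):
    """Return an MST after the cost of {u, v} has dropped to new_cost."""
    heavy = heaviest_on_path(tree_edges, u, v)
    if heavy is not None and heavy[2] > new_cost:
        # swap the heaviest path edge for the cheaper {u, v}
        return [e for e in tree_edges if e != heavy] + [(u, v, new_cost)]
    return list(tree_edges)
```

To solve the original problem, you would call `reduce_edge` with `new_cost=0` once per free edge, always starting from the same initial MST, and keep the cheapest result.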

However, this can potentially be improved even further if you're okay using some more advanced algorithms. The paper "On Cartesian Trees and Range Minimum Queries" by Demaine et al. shows that in O(n) time, it is possible to preprocess a minimum spanning tree so that queries of the form "what is the lowest-cost edge on the path in this tree between nodes u and v?" can be answered in time O(1). (Here you need the highest-cost edge, which is the same query with the comparison reversed.) You could therefore build this structure instead of doing a DFS to find the bottleneck edge between u and v, reducing the overall runtime to O(T(m, n) + n + k). Given that T(m, n) is very low (the best known bound is O(m α(m)), where α(m) is the inverse Ackermann function and is less than five for all inputs in the feasible universe), this is asymptotically a very quick algorithm!
