Given a number k and a graph, is there a DFS run that will make a forest bigger than k?

Question

I was given a question that I can't seem to solve.

Given a directed graph G=(V,E) and a natural number k, k>0.

Find an algorithm that returns "YES" if there is a DFS run for which the number of trees in the DFS forest is >= k.

The algorithm should be linear in the size of the graph G: O(|V| + |E|).

I will explain:

So for the following run, where s is the starting node, we get 2 trees:

But if we started at node u we would get only 1 tree. So I need to return yes for k = 1 or 2, and no otherwise.

So how can I know the number of trees?

Thanks for the help!

Answer 1

After the edit on the question:

Use Kosaraju's algorithm to get the strongly connected components in O(|V| + |E|) time. That gives the maximum k.

Here the maximum k is the same as the number of strongly connected components.

Why?

Let us prove it by contradiction. Say the graph shown in the question has 4 strongly connected components, and assume some DFS run produces an extra tree rooted at some node v. That would mean either that v was not covered while counting the strongly connected components, or that v was missed by the DFS that visited the rest of its component. Neither is possible: a DFS that enters a strongly connected component reaches every unvisited node in it, and a well-proven algorithm such as Kosaraju's assigns every node to a component. Hence the assumption is false, which completes the proof by contradiction. The bound is also attained: always start the next DFS in a component whose outgoing edges lead only to components that have already been visited, and every component becomes its own tree.
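The idea above can be sketched as follows. This is only an illustration, assuming the graph is given as a list `adj` in which `adj[v]` holds the out-neighbors of vertex `v`, numbered from 0 (the numbering s=0, ..., z=7 is the one Answer 2 uses). The names `count_sccs` and `has_dfs_forest_with_k_trees` are made up for this sketch, and both passes are written iteratively to stay clear of Python's recursion limit.

```python
def count_sccs(adj):
    """Count strongly connected components with Kosaraju's two-pass algorithm."""
    n = len(adj)

    # Pass 1: iterative DFS on the graph, recording vertices as they finish.
    visited = [False] * n
    order = []
    for root in range(n):
        if visited[root]:
            continue
        visited[root] = True
        stack = [(root, iter(adj[root]))]
        while stack:
            node, neighbors = stack[-1]
            for nxt in neighbors:
                if not visited[nxt]:
                    visited[nxt] = True
                    stack.append((nxt, iter(adj[nxt])))
                    break  # descend; the iterator resumes here later
            else:
                order.append(node)  # all neighbors handled: node is finished
                stack.pop()

    # Build the reversed graph.
    radj = [[] for _ in range(n)]
    for u in range(n):
        for v in adj[u]:
            radj[v].append(u)

    # Pass 2: sweep the reversed graph in decreasing finish time;
    # every new root opens exactly one component.
    visited = [False] * n
    components = 0
    for root in reversed(order):
        if visited[root]:
            continue
        components += 1
        visited[root] = True
        stack = [root]
        while stack:
            node = stack.pop()
            for nxt in radj[node]:
                if not visited[nxt]:
                    visited[nxt] = True
                    stack.append(nxt)
    return components


def has_dfs_forest_with_k_trees(adj, k):
    # The maximum number of DFS trees equals the number of components.
    return "YES" if count_sccs(adj) >= k else "NO"


# Graph from the question with s=0, t=1, u=2, v=3, w=4, x=5, y=6, z=7
adj = [[4, 7], [2, 3], [3, 1], [0, 4], [5], [7], [5], [6, 4]]
print(count_sccs(adj))                      # 4
print(has_dfs_forest_with_k_trees(adj, 4))  # YES
print(has_dfs_forest_with_k_trees(adj, 5))  # NO
```

Both passes touch each vertex and each edge a constant number of times, so the whole check stays within O(|V| + |E|).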

Answer before the edit on the question:

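def dfs(v, adj, visited):
    # Mark v as visited, then recurse into every unvisited neighbor.
    visited[v] = True
    for neighbor in adj[v]:
        if not visited[neighbor]:
            dfs(neighbor, adj, visited)

def count_trees(adj):  # adj[v] lists the out-neighbors of v; V vertices and E edges in total
    visited = [False] * len(adj)
    trees = 0
    for v in range(len(adj)):
        if not visited[v]:
            dfs(v, adj, visited)  # every new root starts one more tree
            trees += 1
    return trees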

The steps above are self-explanatory. Keeping track of whether a vertex has been visited is trivial.

The approach above simply runs DFS from every node that has not been visited before. Hence, the time complexity is the same as that of DFS, which is O(|V| + |E|).
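The count this produces depends on the order in which new roots are picked, which is the point of the question. A small sketch makes that visible, assuming the same adjacency-list representation and the numbering s=0, t=1, u=2, v=3, w=4, x=5, y=6, z=7 from Answer 2. The function name `count_trees_in_order` and its `order` parameter are illustrative additions, not part of the original answer.

```python
def count_trees_in_order(adj, order):
    """Count the trees of the forest obtained by taking new roots in `order`."""
    visited = [False] * len(adj)
    trees = 0
    for root in order:
        if visited[root]:
            continue
        trees += 1
        visited[root] = True
        # The set of vertices reached from a root does not depend on the
        # order in which neighbors are explored, so an explicit stack is
        # enough when only the number of trees matters.
        stack = [root]
        while stack:
            node = stack.pop()
            for neighbor in adj[node]:
                if not visited[neighbor]:
                    visited[neighbor] = True
                    stack.append(neighbor)
    return trees


# Graph from the question
adj = [[4, 7], [2, 3], [3, 1], [0, 4], [5], [7], [5], [6, 4]]

print(count_trees_in_order(adj, range(8)))                  # 2 (start at s)
print(count_trees_in_order(adj, [2, 0, 1, 3, 4, 5, 6, 7]))  # 1 (start at u)
print(count_trees_in_order(adj, [4, 0, 3, 1, 2, 5, 6, 7]))  # 4 (w, then s, v, t)
```

So a single run only reports the number of trees for one particular order; deciding whether some order reaches k needs the maximum over all orders.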

Answer 2

Imagine that the given graph is actually a tree. If you start a DFS at the root of the tree, you find the whole graph in one DFS search. At the other extreme, if you start a DFS in a leaf, then in the next leaf, and keep starting every new DFS in the nodes in the order you would get them bottom-up, then each DFS finds only one node and then quits (because the children have already been visited by a previous DFS). So you can launch as many DFS searches as there are nodes in the tree.

The same remains true if the graph has some extra edges but remains acyclic.
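For the acyclic case, the bottom-up order can be sketched with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The helper names `dfs_forest` and `sinks_first` and the small example `dag` are assumptions made for this illustration. `TopologicalSorter` expects a mapping from each node to its predecessors and emits predecessors first, so feeding it the successor lists yields every node after all of its successors.

```python
from graphlib import TopologicalSorter  # standard library since Python 3.9


def dfs_forest(adj, order):
    """Return the DFS forest as a list of trees, each a list of vertices."""
    visited = [False] * len(adj)
    forest = []

    def visit(node, tree):
        visited[node] = True
        tree.append(node)
        for neighbor in adj[node]:
            if not visited[neighbor]:
                visit(neighbor, tree)

    for root in order:
        if not visited[root]:
            tree = []
            visit(root, tree)
            forest.append(tree)
    return forest


def sinks_first(adj):
    # Passing successors where predecessors are expected gives an order in
    # which every node comes after all of its successors (leaves first).
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    graph = {v: adj[v] for v in range(len(adj))}
    return list(TopologicalSorter(graph).static_order())


# 0 -> 1 -> 3 and 0 -> 2 -> 3: acyclic, with one edge more than a tree
dag = [[1, 2], [3], [3], []]

print(dfs_forest(dag, [0, 1, 2, 3]))           # [[0, 1, 3, 2]]
print(len(dfs_forest(dag, sinks_first(dag))))  # 4
```

Starting at the root gives a single tree, while the leaves-first order gives one single-node tree per vertex.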

It becomes different when the graph has cycles. In that case a DFS that starts in any member of a cycle will find all other members of that cycle. So a cycle can never be split over different DFS searches. This cycle, combined with every other cycle that intersects it, is a so-called strongly connected component.

An algorithm would thus have to find the strongly connected components and count each of them as 1 DFS, while all other nodes, which do not partake in any cycle, can each be counted as a separate DFS (since you would start in the leaves of those subtrees).

Here is an algorithm that uses DFS (which is confusing, since it is a DFS that counts possible DFSs) to identify cycles and update the count accordingly. I've used recursion for this algorithm, so there must be some fast backtracking once the required k has been reached: further searching is not necessary in that case.

All edges are visited only once, and the main loop also visits all nodes exactly once, so the required time complexity is attained.

def k_forests(adj, k):
    # pathindex[node] == 0: means node has not been visited
    # pathindex[node] == -1: means node's strongly connected component is complete
    # pathindex[node] > 0: discovery number of a node whose component is still open
    pathindex = [0] * len(adj) # none of the nodes has been visited
    stack = [] # visited nodes whose component is still open
    counter = 0 # number of nodes discovered so far

    def recur(node):
        nonlocal k, counter  # k is decreased once per component found

        counter += 1
        pathindex[node] = counter # being visited
        stack.append(node)
        cycle = counter # lowest discovery number reachable via open nodes
        for neighbor in adj[node]:
            if pathindex[neighbor] == 0: # not visited yet: go deeper
                cycle = min(cycle, recur(neighbor))
                if k == 0: # success
                    return -1 # backtrack completely...
            elif pathindex[neighbor] > 0: # cycle back into an open component
                cycle = min(cycle, pathindex[neighbor])
        if cycle == pathindex[node]: # node was the first one visited in its component
            while True: # close the component: none of its nodes may be counted again
                member = stack.pop()
                pathindex[member] = -1
                if member == node:
                    break
            k -= 1
            if k == 0:
                return -1 # success
        return cycle

    # main loop over the nodes
    for node in range(len(adj)):
        if pathindex[node] == 0:
            recur(node)
            if k == 0:
                return "YES"
    return "NO"

This function should be called with an adjacency list for every node, where nodes are identified by sequential numbers starting from 0. For example, the graph in the question can be represented as follows, where s=0, t=1, u=2, v=3, w=4, x=5, y=6, and z=7:

adj = [
    [4, 7],
    [2, 3],
    [3, 1],
    [0, 4],
    [5],
    [7],
    [5],
    [6, 4]
]

print(k_forests(adj, 4)) #  YES
print(k_forests(adj, 5)) #  NO

Answer 1
1 2020-06-28 06:21:55

Answer 2
1 Accepted 2020-06-28 10:30:25
