
Graph reachability in limited space

I have a directed graph G with N vertices, k of which are labeled "terminal". I want to label each vertex v with the set of terminal vertices that are reachable from v. Can I do this in space (R+r)N, where R is the average number of terminal vertices reachable from the nodes of G, and r is a small constant?

To make this more concrete, the data structure would look roughly like this:

struct Node {
  bool isTerminal();            // true if this is a terminal node
  vector<Node*> successors();   // computes the successors of this node on the fly
  set<Node*> visit();           // DFS helper used in the edit below
  set<Node*> reachable_terminals; // the value to compute
  bool done;                    // initially false
};

We want a function

void set_reachables(vector<Node> &); // the "&" means "pass by reference" in C++

that takes a vector of Node representing the vertices in G and sets the "reachable_terminals" member of each Node to the set of terminals reachable from that node.

To make it concrete, N is about 100,000,00 and k is about 150. The average branching factor is about 3 and only about 1000 vertices at the very most are reachable from any particular vertex. (At most ten terminal vertices are typically reachable from any v).

Now, if G were acyclic, a simple depth-first search would work. It's the cycles that cause issues. Also, if space were not a problem I could compute and store the predecessors of each node and then work backward from the terminal nodes, but this takes too much space (note that the successors of a node v are not stored with v but are computed on the fly as necessary), and I would prefer not to have to compute successors() more than once per node.

I am using C++, but any algorithm description is fine.

Edit: Note that DFS for the acyclic case works using an algorithm like this:

void set_reachables(vector<Node>& v) {
  for (auto& node : v) node.visit();
}

// Memoized DFS: returns the terminals reachable from this node.
set<Node*> Node::visit() {
  if (done) return reachable_terminals;
  if (isTerminal()) reachable_terminals.insert(this);
  for (Node* succ : successors())
    for (Node* t : succ->visit())   // merge the successor's terminals into ours
      reachable_terminals.insert(t);
  done = true;
  return reachable_terminals;
}

This algorithm fails if the graph is cyclic: a node on a cycle is reached again before its done flag has been set, so the recursion never terminates.

Since the average branching factor is a constant, E = O(V), where E is the number of edges and V is the number of vertices. Reverse all the edges in your graph. Then run a DFS from each terminal vertex t over the reversed edges, and add t to the set of every vertex the search reaches. Cycles are not a problem, because each search simply skips vertices it has already marked. This solves your problem in O(kE) time. It does not meet the space bound you asked for, since the reversed edges have to be stored, but it might suffice given the other strict sparsity conditions on your graph. (Not in general, of course, but I am guessing your case has more structure than a general graph.)
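A minimal sketch of the reversed-edge search, in case it helps. It assumes the graph fits in memory as an adjacency list of vertex indices; the names Graph and reachable_terminals are made up for this example and are not from the question. It stores the reversed edges explicitly, which is exactly the extra space the question wanted to avoid, so treat it as an illustration of the idea rather than a drop-in solution.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical adjacency list: succ[v] holds the successors of vertex v.
using Graph = std::vector<std::vector<std::size_t>>;

// For every vertex v, returns the set of terminals reachable from v.
// A terminal counts as reachable from itself.
std::vector<std::set<std::size_t>> reachable_terminals(
    const Graph& succ, const std::vector<std::size_t>& terminals) {
  const std::size_t n = succ.size();

  // Reverse every edge: pred[w] lists the vertices with an edge into w.
  Graph pred(n);
  for (std::size_t v = 0; v < n; ++v)
    for (std::size_t w : succ[v]) pred[w].push_back(v);

  std::vector<std::set<std::size_t>> result(n);
  std::vector<std::size_t> stack;
  for (std::size_t t : terminals) {
    assert(t < n);
    // Iterative DFS over reversed edges. Membership of t in result[u]
    // doubles as the "visited" mark, so cycles terminate.
    result[t].insert(t);
    stack.push_back(t);
    while (!stack.empty()) {
      const std::size_t v = stack.back();
      stack.pop_back();
      for (std::size_t u : pred[v])
        if (result[u].insert(t).second) stack.push_back(u);
    }
  }
  return result;
}
```

Each of the k searches touches every reversed edge at most once, which is where the O(kE) bound comes from.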

Your problem is a variation on a problem known as the transitive closure of the graph. Boost has an implementation that you can use:

http://www.boost.org/doc/libs/1_55_0/libs/graph/doc/transitive_closure.html

There is a high level outline of the algorithm in the implementation notes.
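As far as I can tell from those notes, the approach is built on strongly connected components: all vertices on a common cycle reach exactly the same terminals, so they can share one set. Below is a rough sketch of that idea restricted to terminals. It is not the Boost implementation; it uses Tarjan's algorithm, and the names Graph, Tarjan, SccTerminals and reachable_terminals_by_scc are invented for the example. It also assumes an in-memory adjacency list and uses plain recursion, which should be tolerable only because the question says at most about 1000 vertices are reachable from any vertex.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical adjacency list: succ[v] holds the successors of vertex v.
using Graph = std::vector<std::vector<std::size_t>>;

constexpr std::size_t kUnvisited = static_cast<std::size_t>(-1);

// Tarjan's strongly connected components algorithm (recursive form).
struct Tarjan {
  const Graph& succ;
  std::vector<std::size_t> index, low, comp, stack;
  std::vector<bool> on_stack;
  std::size_t next_index = 0, num_comps = 0;

  explicit Tarjan(const Graph& g)
      : succ(g), index(g.size(), kUnvisited), low(g.size()),
        comp(g.size(), kUnvisited), on_stack(g.size(), false) {}

  void visit(std::size_t v) {
    index[v] = low[v] = next_index++;
    stack.push_back(v);
    on_stack[v] = true;
    for (std::size_t w : succ[v]) {
      if (index[w] == kUnvisited) {
        visit(w);
        low[v] = std::min(low[v], low[w]);
      } else if (on_stack[w]) {
        low[v] = std::min(low[v], index[w]);
      }
    }
    if (low[v] == index[v]) {  // v is the root of a component
      std::size_t w;
      do {
        w = stack.back();
        stack.pop_back();
        on_stack[w] = false;
        comp[w] = num_comps;
      } while (w != v);
      ++num_comps;
    }
  }
};

struct SccTerminals {
  std::vector<std::size_t> comp;                 // comp[v] = component of v
  std::vector<std::set<std::size_t>> terminals;  // one shared set per component
};

SccTerminals reachable_terminals_by_scc(const Graph& succ,
                                        const std::vector<bool>& is_terminal) {
  assert(is_terminal.size() == succ.size());
  Tarjan t(succ);
  for (std::size_t v = 0; v < succ.size(); ++v)
    if (t.index[v] == kUnvisited) t.visit(v);

  std::vector<std::vector<std::size_t>> members(t.num_comps);
  for (std::size_t v = 0; v < succ.size(); ++v) members[t.comp[v]].push_back(v);

  // Tarjan finishes components in reverse topological order, so every edge
  // leaving component c ends in a component with a smaller id, whose set is
  // already complete when c is processed.
  std::vector<std::set<std::size_t>> terms(t.num_comps);
  for (std::size_t c = 0; c < t.num_comps; ++c)
    for (std::size_t v : members[c]) {
      if (is_terminal[v]) terms[c].insert(v);
      for (std::size_t w : succ[v]) {
        const std::size_t d = t.comp[w];
        if (d != c) terms[c].insert(terms[d].begin(), terms[d].end());
      }
    }
  return {std::move(t.comp), std::move(terms)};
}
```

The terminals reachable from vertex v are then terminals[comp[v]]. Storing one set per component rather than one per vertex saves space whenever cycles are common. This sketch walks the edges twice; if successors() is expensive to compute, the merging step could probably be folded into the DFS at the point where a component is popped.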
