
How to find shortest paths between specific set of nodes using networkx

I have an undirected graph G with hundreds of nodes.

A sample network of G can be created as:

import networkx as nx

G = nx.Graph()
G.add_nodes_from([0, 1, 2, 3, 4, 5])
G.add_edges_from([(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)])

I want to find the shortest paths between nodes that have a degree of 1. In the sample above these nodes are 0, 2, 4 and 5.

I have the following function that returns nodes with degree = 1:

def find_leaf_nodes(g):
  # Collect every node whose degree is exactly 1 (a leaf).
  leaf_nodes = ()
  for node, degree in g.degree():
    if degree == 1:
      leaf_nodes = leaf_nodes + (node,)
  return leaf_nodes  # a tuple of node labels

l_nodes = find_leaf_nodes(G)
print(l_nodes)

Now I want to get the shortest paths between the nodes in l_nodes only, not between all nodes in G. For example, from node 0 to 2, 0 to 4, 2 to 5, etc. However, when I use:

paths = nx.shortest_path(G, source=l_nodes, target=l_nodes, weight='cost')

I get back the same tuple that the find_leaf_nodes function returned, not the shortest paths.

I would expect something like:

{0: {0: [0],
  1: [0, 3, 1],
  2: [0, 3, 2],
  3: [0, 3], ...}

Instead I get a tuple that looks something like:

[(0, 2, 1, 3, ...)]

Is there a function other than shortest_path that can take a set of same nodes as source and target to find the shortest paths among them?

If there is no such function, how can I find the shortest paths among certain nodes (specifically nodes with degree of 1)?

First, how to get what you're after. I don't see an incredibly efficient algorithm. Looking at the methods in networkx that do something like this, there is all_pairs_shortest_path, which turns out to just do:

for n in G:
    yield (n, single_source_shortest_path(G, n, cutoff=cutoff))

So for each node it just computes the shortest paths from that node to every node it can reach. If you aren't going to be doing this a lot, I would adapt this code and replace "for n in G" with "for n in l_nodes". The drawback is that this still finds the shortest path from every leaf node to every other node in the graph, so afterwards I would discard all the paths that don't end in another leaf.
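The adaptation described above can be sketched roughly as follows. This is only an illustration on the sample graph from the question: the names leaves, leaf_set and leaf_paths are made up here, and the edges are treated as unweighted.

```python
import networkx as nx

# Sample graph from the question (adding the edges also adds the nodes).
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)])

# Leaf nodes: every node with degree 1.
leaves = [n for n, d in G.degree() if d == 1]
leaf_set = set(leaves)

# Run the single-source search from each leaf only, then keep just
# the paths that end in a different leaf.
leaf_paths = {}
for source in leaves:
    paths = nx.single_source_shortest_path(G, source)
    leaf_paths[source] = {
        target: path
        for target, path in paths.items()
        if target in leaf_set and target != source
    }

print(leaf_paths[0])
```

The result is a dict of dicts keyed as leaf_paths[source][target], which is close to the shape the question asks for.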

If you're going to do this more often, then I would get into the guts of single_source_shortest_path (in the networkx source) and add a test that stops the calculation once all the other leaf nodes have been found. Note that you'll still have to traverse much of the graph before you find all the leaf nodes, so I'm not sure this will substantially improve matters.
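As a rough idea of what such an early-stopping search could look like, here is a hand-written breadth-first search. It is not the networkx implementation; bfs_paths_to_targets is a hypothetical helper, and it assumes unweighted edges and an adjacency mapping where adj[node] iterates over the neighbours of node (indexing a networkx Graph as G[node] should behave the same way).

```python
from collections import deque


def bfs_paths_to_targets(adj, source, targets):
    """Shortest unweighted paths from source to each node in targets.

    Hypothetical helper: stops as soon as every target has been reached.
    """
    remaining = set(targets) - {source}
    parent = {source: None}   # also serves as the "visited" set
    queue = deque([source])
    found = {}
    while queue and remaining:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr in parent:
                continue
            parent[nbr] = node
            if nbr in remaining:
                remaining.discard(nbr)
                # Walk the parent links back to the source.
                path = [nbr]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                found[nbr] = path[::-1]
                if not remaining:
                    break
            queue.append(nbr)
    return found


# Adjacency of the sample graph from the question, as a plain dict.
adj = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1, 4, 5], 4: [3], 5: [3]}
result = bfs_paths_to_targets(adj, 0, [0, 2, 4, 5])
print(result)
```

On a small tree like the sample the early exit saves nothing, since the leaves are the last nodes reached; any gain would come from graphs where the leaves sit close to each other.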

If there is a cost associated with the edges, then you should instead look at the Dijkstra shortest path methods. The all_pairs... equivalent is similarly simple: it just calls the Dijkstra algorithm for each node. So you can do this for the leaf nodes only. Again, you can probably speed it up a bit if you get it to stop once it has found all of the leaf nodes.
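For the weighted case, a minimal sketch might call nx.single_source_dijkstra once per leaf. The 'cost' attribute name comes from the question, but the cost values below are invented purely for illustration, and weighted_leaf_paths is a made-up name.

```python
import networkx as nx

# Sample graph from the question with made-up 'cost' values on the edges.
G = nx.Graph()
G.add_weighted_edges_from(
    [(0, 1, 1.0), (1, 2, 2.0), (1, 3, 1.5), (3, 4, 0.5), (3, 5, 3.0)],
    weight="cost",
)

leaves = [n for n, d in G.degree() if d == 1]
leaf_set = set(leaves)

# For each leaf, run Dijkstra once and keep (distance, path) for the
# other leaves only.
weighted_leaf_paths = {}
for source in leaves:
    dist, paths = nx.single_source_dijkstra(G, source, weight="cost")
    weighted_leaf_paths[source] = {
        t: (dist[t], paths[t])
        for t in paths
        if t in leaf_set and t != source
    }

print(weighted_leaf_paths[0])
```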

I think there is probably the potential to build a significantly faster algorithm by modifying the bidirectional Dijkstra algorithm to somehow deal with many nodes at once. However, this would be a significant bit of coding, it would be subtle, and I'm not convinced it can work. I would need to spend a lot of time to figure out whether it's doable and a real improvement. So for that you're on your own.


Second, you might be wondering: why is my current implementation just returning the tuple back to me?

I think you've run into a strange bug in the shortest path algorithms. I think it's interpreting your tuple as a single node (tuples are allowed as nodes, but lists are not, and if I pass in the equivalent list, it breaks). So it understands that you want the shortest path from that node to itself, which is just the node itself. Because you've passed a weight (even if the edges don't carry it), it's using the Dijkstra algorithms, and as I look into the details, it looks like the Dijkstra code checks whether source == target before it checks whether source is in the graph. So it's telling you that the shortest path from that tuple to itself is just the single node (the tuple), even though the tuple is not a node of the graph.

I've submitted a bug report.
