
Create a Python tree from a list of strings

I'm using the newest version of Python on a Windows 10 machine. I have a list of 100k routes that semi trucks take. Each route has only a single stop: it originates at cityA and travels to cityB. The list is structured similarly to this:

list_single_stop_routes = [
'cityA to cityB',
'ohio to cali',
'penn to texas',
'cali to tenn',
'tenn to ohio']

What I want to do is create a list of 'extended routes', where each extended route has an arbitrary number of stops and goes from city to city.

I started by collecting every cityA from the routes and using each one as an originating location. For each one I took the corresponding cityB, then iterated through the single-stop routes looking for those whose cityA is my current cityB. So let's say my originating location is ohio and its cityB is indiana. Now I want to iterate through every route to find those where indiana is the originating location (cityA), and I might find a route that says 'indiana to texas'. My structure so far would then say 'ohio to indiana to texas', and so on until there are (potentially) no more connections to be made.

I tried creating dictionaries and lists to help me structure the output, but I can't figure out exactly what will work. Please keep in mind that it is a requirement to preserve the correct ordering of every single route. I then started to consider some sort of data structure, maybe something like a tree? Ultimately I want a list like so:

list_extended_routes = [
    "ohio to cali to tenn to texas to flor to nevada to wisconsin to newyork",
    "missou to texas to wisconsin to texas to ohio",]

Hopefully someone can point me in the right direction. Thank you!

This is inherently a graph problem, so the best approach might be to use a graph library such as networkx.

The graph is the following:

[image: graph of the routes]

You can construct it using:

list_single_stop_routes = [
'cityA to cityB',
'ohio to cali',
'penn to texas',
'cali to tenn',
'tenn to ohio',
'ohio to indiana'
]

import networkx as nx

G = nx.from_edgelist(s.split(' to ') for s in list_single_stop_routes)

Then it's easy to find all the routes (paths) between two cities (nodes):

list(nx.all_simple_paths(G, source='indiana', target='tenn'))

output:

[['indiana', 'ohio', 'cali', 'tenn'], ['indiana', 'ohio', 'tenn']]
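To get back to the 'x to y to z' string format asked for in the question, each path can be joined with ' to '. This is only a small sketch: the `paths` literal is copied from the output above rather than recomputed, and the name `extended_routes` is just illustrative.

```python
# Paths as returned by nx.all_simple_paths in the example above
# (copied here as a literal so the snippet runs on its own).
paths = [['indiana', 'ohio', 'cali', 'tenn'], ['indiana', 'ohio', 'tenn']]

# Turn each list of cities back into a single route string.
extended_routes = [' to '.join(path) for path in paths]

print(extended_routes)
# ['indiana to ohio to cali to tenn', 'indiana to ohio to tenn']
```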

Directed graph (routes have a specific direction)

If you want a directed graph, use:

G = nx.from_edgelist((s.split(' to ') for s in list_single_stop_routes),
                     create_using=nx.DiGraph)

[image: directed graph of the routes]

Now there is no route from indiana to tenn, because the only edge touching indiana points into it ('ohio to indiana').
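If the goal is the full list of extended routes rather than paths between two chosen cities, one possible approach is a depth-first walk over the directed routes. The sketch below does not use networkx and rests on assumptions that are not in the question: an extended route never visits the same city twice (the question's second example does revisit texas, so this may need relaxing), and only routes that cannot be extended any further are reported. The function names `build_adjacency` and `extended_routes` are hypothetical. With 100k routes the number of such paths can grow very quickly, and the recursion may need to be made iterative for very long chains.

```python
from collections import defaultdict


def build_adjacency(routes):
    """Map each origin city to its destinations, keeping input order."""
    adjacency = defaultdict(list)
    for route in routes:
        origin, destination = route.split(' to ')
        adjacency[origin].append(destination)
    return adjacency


def extended_routes(routes):
    """Return every maximal route that follows route direction
    and never revisits a city (assumption, see text above)."""
    adjacency = build_adjacency(routes)
    results = []

    def extend(path):
        # Candidate next stops: destinations of the last city not yet on the path.
        next_stops = [c for c in adjacency.get(path[-1], []) if c not in path]
        if not next_stops:
            # Dead end: record the route if it contains at least one leg.
            if len(path) > 1:
                results.append(' to '.join(path))
            return
        for city in next_stops:
            extend(path + [city])

    # Start one walk from every city that is the origin of some route.
    for origin in list(adjacency):
        extend([origin])
    return results


list_single_stop_routes = [
    'cityA to cityB',
    'ohio to cali',
    'penn to texas',
    'cali to tenn',
    'tenn to ohio',
    'ohio to indiana',
]

for route in extended_routes(list_single_stop_routes):
    print(route)
```

On the sample data no printed route starts at indiana, which matches the directed graph above: indiana has no outgoing route.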

