Find overlaps between many LineStrings

Question

I have a long list of shapely.geometry.LineString objects that partially overlap (they are driving routes on the German highway network, to be precise). Each LineString has associated values (in my case the delay and the number of cars that took that route). My goal is to find the overlaps between these LineStrings and sum their corresponding values for each overlapping part. The final result should look something like the visualization below:

My current methodology goes roughly as follows (code at the end of the post):

1. Maintain a list of already checked routes.
2. For each new route, check whether it overlaps any of the already checked routes.
3. If it does, assign the combined values to the overlapping part and continue the check with only the remaining non-overlapping part.

This approach works and produces the desired result. The problem, however, is that with a very large number of routes the runtime becomes excessively long, because every new route has to be checked against a progressively growing set of LineStrings. Does anybody know of a preexisting algorithm, package, etc. that could be used to address this issue? Or do you have good suggestions on how to accelerate the algorithm?

In code, the implementation is as below (admittedly not prettified yet):

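import pickle
from typing import Dict, List, Union

from shapely.geometry import LineString

# BufferFileHierarchy and progress_bar are project-specific helpers that are not shown here

def intersects_rectangles(bounds, other_bounds):
    # bounds are (minx, miny, maxx, maxy) tuples; returns True if the two boxes overlap or touch
    return not (bounds[2] < other_bounds[0] or bounds[0] > other_bounds[2] or bounds[3] < other_bounds[1] or bounds[1] > other_bounds[3])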

def main_function(route_df) -> Dict[tuple, List[Union[LineString, float]]]:
    # The index of route_df contains ids that lead to files that contain the LineStrings
    # The columns contain the relevant values associated with each route
    table_merged: Dict[tuple, List[Union[LineString, float]]] = {}
    routes_covered = 0
    tot_routes = len(route_df)
    print(f'Creating a merged view of all routes. Nr. routes: {tot_routes}')
    for (route_name, car_name), line in route_df.iterrows():
        with open('buffer_data/' + route_name + '/' + BufferFileHierarchy.route.value, 'rb') as f:
            linestring: LineString
            linestring, _ = pickle.load(f)
        current_cum_delay = line['sum']
        current_count = line['count']
        if len(table_merged) == 0:
            table_merged[(route_name, )] = [linestring, linestring.buffer(0.001).buffer(0), current_cum_delay, current_count] # The buffer is necessary to also include LineStrings describing the opposite side of the road
            continue
        for other_contained_routes in list(table_merged.keys()):
            [other_linestring, other_linestring_buffer, other_cum_delay, other_cum_count] = table_merged[other_contained_routes]
            if route_name in other_contained_routes:
                continue
            # Cheap bounding-box rejection first, so the costly intersection only runs on candidates
            if not intersects_rectangles(linestring.bounds, other_linestring.bounds):
                continue
            seg_intersects = linestring.intersection(other_linestring_buffer)
            if seg_intersects.is_empty:
                continue
            if seg_intersects.length > 0.0002:
                seg_intersects_buffer = seg_intersects.buffer(0.001).buffer(0)
                linestring_remainder = linestring.difference(seg_intersects_buffer)
                other_linestring_remainder = other_linestring.difference(seg_intersects_buffer)
                table_merged[(*other_contained_routes, route_name)] = [seg_intersects, seg_intersects_buffer, current_cum_delay + other_cum_delay, current_count + other_cum_count]
                if other_linestring_remainder.is_empty:
                    del table_merged[other_contained_routes]
                else:
                    table_merged[other_contained_routes] = [other_linestring_remainder, other_linestring_remainder.buffer(0.001).buffer(0), other_cum_delay, other_cum_count]
                linestring = linestring_remainder
                if linestring.is_empty:
                    break
        if not linestring.is_empty:
            table_merged[(route_name, )] = [linestring, linestring.buffer(0.001).buffer(0), current_cum_delay, current_count]
        routes_covered += 1
        if routes_covered % 100 == 0:
            progress_bar.print_progress_bar(routes_covered, tot_routes)
    progress_bar.print_progress_bar(routes_covered, tot_routes)
    return table_merged

Answer 1

I had the same problem.

I don't know if you can speed up the algorithm. But you can obtain the German highway system as a graph with osm2po. With that you can do the same job way faster by counting the intersections of each route with each graph edge.

For example:

import geopandas as gpd

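graph.geometry = graph.geometry.buffer(0.001)  # Add some leeway for intersections
# Recent geopandas versions use `predicate=`; older versions call the same keyword `op=`.
# The assignment aligns on graph's index, so this assumes the index holds the edge_id values.
graph['nCars'] = gpd.sjoin(graph, routes, predicate='intersects').value_counts('edge_id')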

graph is a geopandas dataframe filled with the edges of the highway.

routes is a geopandas dataframe with all the driven routes.
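
The snippet above only counts cars per edge. To show how the delay and count values from the question could be summed per edge as well, here is a minimal pure-Python sketch of the same idea. It is an illustration under assumptions, not the geopandas implementation: the names `point_segment_distance`, `route_uses_edge` and `accumulate`, the toy data, and the midpoint test (an edge counts as used if its midpoint lies within `tol` of the route) are all hypothetical simplifications of the buffered `intersects` join.

```python
from collections import defaultdict
from math import hypot


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def route_uses_edge(route, edge, tol):
    """Assumed matching rule: the edge midpoint lies within tol of the route."""
    (ax, ay), (bx, by) = edge
    mid = ((ax + bx) / 2, (ay + by) / 2)
    return any(point_segment_distance(mid, p, q) <= tol
               for p, q in zip(route, route[1:]))


def accumulate(edges, routes, tol=0.001):
    """Sum delay and count of every route onto each graph edge it uses."""
    totals = defaultdict(lambda: {"delay": 0.0, "count": 0})
    for coords, delay, count in routes:
        for edge_id, edge in edges.items():
            if route_uses_edge(coords, edge, tol):
                totals[edge_id]["delay"] += delay
                totals[edge_id]["count"] += count
    return dict(totals)


# Toy network: three consecutive edges along the x axis (hypothetical data)
edges = {
    "e1": ((0.0, 0.0), (1.0, 0.0)),
    "e2": ((1.0, 0.0), (2.0, 0.0)),
    "e3": ((2.0, 0.0), (3.0, 0.0)),
}
# Each route: (coordinates, summed delay, number of cars)
routes = [
    ([(0.0, 0.0), (2.0, 0.0)], 5.0, 10),
    ([(1.0, 0.0), (3.0, 0.0)], 2.0, 4),
]
result = accumulate(edges, routes)
print(result)
```

Because the set of edges is fixed, the work per route does not grow as more routes are processed, unlike the pairwise comparison in the question. The midpoint test also avoids counting an edge that merely touches a route at one end point, which a buffered intersects test would include. For a real network, a spatial index (as used by the spatial join above) would replace the inner loop over all edges.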
