I have a long list of shapely.geometry.LineString objects which are partially overlapping (they are driving routes on the German highway network, to be precise). Each LineString has associated values (in my case, the delay and the number of cars that took that route). My goal is to find the overlaps between these LineStrings and sum their corresponding values for each overlapping part. The final result should look something like the visualization below:
My current methodology goes roughly as follows (code at the end of the post):
This approach works and produces the desired result. The problem, however, is that with a very large number of routes the runtime becomes excessively long, because the number of stored LineStrings that each new route has to be checked against keeps growing. Does anybody know of a preexisting algorithm, package, etc. that could address this issue? Or do you have suggestions on how to accelerate the algorithm?
In code, the implementation is as follows (admittedly not prettified yet):
import pickle
from typing import Dict, List, Union

from shapely.geometry import LineString

# BufferFileHierarchy and progress_bar are project-specific helpers (not shown here)


def intersects_rectangles(bounds, other_bounds):
    # bounds are (minx, miny, maxx, maxy) tuples as returned by shapely's .bounds
    return not (bounds[2] < other_bounds[0] or bounds[0] > other_bounds[2] or bounds[3] < other_bounds[1] or bounds[1] > other_bounds[3])


def main_function(route_df) -> Dict[tuple, List[Union[LineString, float]]]:
    # The index of route_df contains ids that lead to files that contain the LineStrings
    # The columns contain the relevant values associated with each route
    table_merged: Dict[tuple, List[Union[LineString, float]]] = {}
    routes_covered = 0
    tot_routes = len(route_df)
    print(f'Creating a merged view of all routes. Nr. routes: {tot_routes}')
    for (route_name, car_name), line in route_df.iterrows():
        with open('buffer_data/' + route_name + '/' + BufferFileHierarchy.route.value, 'rb') as f:
            linestring: LineString
            linestring, _ = pickle.load(f)
        current_cum_delay = line['sum']
        current_count = line['count']
        if len(table_merged) == 0:
            # The buffer is necessary to also include LineStrings describing the opposite side of the road
            table_merged[(route_name, )] = [linestring, linestring.buffer(0.001).buffer(0), current_cum_delay, current_count]
            continue
        for other_contained_routes in list(table_merged.keys()):
            [other_linestring, other_linestring_buffer, other_cum_delay, other_cum_count] = table_merged[other_contained_routes]
            if route_name in other_contained_routes:
                continue
            seg_intersects = linestring.intersection(other_linestring_buffer)
            if seg_intersects.is_empty:
                continue
            if not intersects_rectangles(linestring.bounds, other_linestring.bounds):
                continue
            if seg_intersects.length > 0.0002:
                seg_intersects_buffer = seg_intersects.buffer(0.001).buffer(0)
                linestring_remainder = linestring.difference(seg_intersects_buffer)
                other_linestring_remainder = other_linestring.difference(seg_intersects_buffer)
                # Store the shared part under the combined key with the summed values
                table_merged[(*other_contained_routes, route_name)] = [seg_intersects, seg_intersects_buffer, current_cum_delay + other_cum_delay, current_count + other_cum_count]
                if other_linestring_remainder.is_empty:
                    del table_merged[other_contained_routes]
                else:
                    table_merged[other_contained_routes] = [other_linestring_remainder, other_linestring_remainder.buffer(0.001).buffer(0), other_cum_delay, other_cum_count]
                linestring = linestring_remainder
                if linestring.is_empty:
                    break
        if not linestring.is_empty:
            table_merged[(route_name, )] = [linestring, linestring.buffer(0.001).buffer(0), current_cum_delay, current_count]
        routes_covered += 1
        if routes_covered % 100 == 0:
            progress_bar.print_progress_bar(routes_covered, tot_routes)
    progress_bar.print_progress_bar(routes_covered, tot_routes)
    return table_merged
I had the same problem.
I don't know whether you can speed up the algorithm itself, but you can obtain the German highway system as a graph with osm2po. With that graph you can do the same job much faster by counting the intersections of each route with each graph edge.
For example:
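import geopandas as gpd

graph.geometry = graph.geometry.buffer(0.001)  # Add some leeway for intersections
# Newer geopandas versions use predicate= instead of the older op= keyword
joined = gpd.sjoin(graph, routes, predicate='intersects')
# Count the matching routes per edge and map the counts back onto the edges
counts = joined['edge_id'].value_counts()
graph['nCars'] = graph['edge_id'].map(counts).fillna(0).astype(int)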
graph is a geopandas dataframe filled with the edges of the highway.
routes is a geopandas dataframe with all the driven routes.
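If you need the summed delay and car count per stretch of road rather than only the number of routes, the same edge-based idea carries over. The following is a minimal sketch in plain Python, assuming each route has already been matched to the ids of the graph edges it covers (for instance via the spatial join above). The route names, edge ids, and the aggregate_per_edge helper are made up for illustration; the 'sum' and 'count' keys mirror the columns used in the question.

```python
from collections import defaultdict

# Hypothetical input: every route is already matched to the highway edges it
# uses, together with its summed delay and the number of cars that drove it.
routes = {
    "route_a": {"edges": ["e1", "e2", "e3"], "sum": 120.0, "count": 4},
    "route_b": {"edges": ["e2", "e3", "e4"], "sum": 30.0, "count": 1},
    "route_c": {"edges": ["e3"], "sum": 50.0, "count": 2},
}


def aggregate_per_edge(routes):
    """Sum delay and car count over every edge that a route covers."""
    totals = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for route in routes.values():
        # set() guards against a route listing the same edge twice
        for edge_id in set(route["edges"]):
            totals[edge_id]["sum"] += route["sum"]
            totals[edge_id]["count"] += route["count"]
    return dict(totals)


edge_totals = aggregate_per_edge(routes)
for edge_id in sorted(edge_totals):
    print(edge_id, edge_totals[edge_id])
```

Because every route only touches its own edges, the work grows with the total number of route-edge pairs instead of with the number of route pairs, which is what makes the pairwise LineString comparison slow.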