Nearest neighbor join with distance condition
In this question I'm referring to this project:
https://automating-gis-processes.github.io/site/master/notebooks/L3/nearest-neighbor-faster.html
We have two GeoDataFrames:
Buildings:
name geometry
0 None POINT (24.85584 60.20727)
1 Uimastadion POINT (24.93045 60.18882)
2 None POINT (24.95113 60.16994)
3 Hartwall Arena POINT (24.92918 60.20570)
and Bus stops:
stop_name stop_lat stop_lon stop_id geometry
0 Ritarihuone 60.169460 24.956670 1010102 POINT (24.95667 60.16946)
1 Kirkkokatu 60.171270 24.956570 1010103 POINT (24.95657 60.17127)
2 Kirkkokatu 60.170293 24.956721 1010104 POINT (24.95672 60.17029)
3 Vironkatu 60.172580 24.956554 1010105 POINT (24.95655 60.17258)
After applying BallTree from sklearn.neighbors:
from sklearn.neighbors import BallTree
import numpy as np

def get_nearest(src_points, candidates, k_neighbors=1):
    """Find nearest neighbors for all source points from a set of candidate points"""
    # Create tree from the candidate points
    tree = BallTree(candidates, leaf_size=15, metric='haversine')
    # Find closest points and distances
    distances, indices = tree.query(src_points, k=k_neighbors)
    # Transpose to get distances and indices into arrays
    distances = distances.transpose()
    indices = indices.transpose()
    # Get closest indices and distances (i.e. array at index 0)
    # note: for the second closest points, you would take index 1, etc.
    closest = indices[0]
    closest_dist = distances[0]
    # Return indices and distances
    return (closest, closest_dist)

def nearest_neighbor(left_gdf, right_gdf, return_dist=False):
    """
    For each point in left_gdf, find closest point in right GeoDataFrame and return them.
    NOTICE: Assumes that the input Points are in WGS84 projection (lat/lon).
    """
    left_geom_col = left_gdf.geometry.name
    right_geom_col = right_gdf.geometry.name
    # Ensure that index in right gdf is formed of sequential numbers
    right = right_gdf.copy().reset_index(drop=True)
    # Parse coordinates from points and insert them into a numpy array as RADIANS
    left_radians = np.array(left_gdf[left_geom_col].apply(lambda geom: (geom.x * np.pi / 180, geom.y * np.pi / 180)).to_list())
    right_radians = np.array(right[right_geom_col].apply(lambda geom: (geom.x * np.pi / 180, geom.y * np.pi / 180)).to_list())
    # Find the nearest points
    # -----------------------
    # closest ==> index in right_gdf that corresponds to the closest point
    # dist ==> distance between the nearest neighbors (in radians, converted to meters below)
    closest, dist = get_nearest(src_points=left_radians, candidates=right_radians)
    # Return points from right GeoDataFrame that are closest to points in left GeoDataFrame
    closest_points = right.loc[closest]
    # Ensure that the index corresponds to the one in left_gdf
    closest_points = closest_points.reset_index(drop=True)
    # Add distance if requested
    if return_dist:
        # Convert to meters from radians
        earth_radius = 6371000 # meters
        closest_points['distance'] = dist * earth_radius
    return closest_points

closest_stops = nearest_neighbor(buildings, stops, return_dist=True)
For each building index we get the distance to the closest bus stop:
stop_name stop_lat stop_lon stop_id geometry distance
0 Muusantori 60.207490 24.857450 1304138 POINT (24.85745 60.20749) 180.521584
1 Eläintarha 60.192490 24.930840 1171120 POINT (24.93084 60.19249) 372.665221
2 Senaatintori 60.169010 24.950460 1020450 POINT (24.95046 60.16901) 119.425777
3 Veturitie 60.206610 24.929680 1174112 POINT (24.92968 60.20661) 106.762619
I'm looking for a solution to get every bus stop (there can be more than one) within 250 meters of each building.
Thank you for your help.
Here is a way that reuses the BallTree approach from the question, but with query_radius instead of query. It is not wrapped in a function, but you can easily change that.
from sklearn.neighbors import BallTree
import numpy as np
import pandas as pd
## here I start with buildings and stops as loaded in the link provided
# variable in meter you can change
radius_max = 250 # meters
# another parameter, in case you want to do with Mars radius ^^
earth_radius = 6371000 # meters
# similar to the method with apply in the tutorial
# to create left_radians and right_radians, but faster
candidates = np.vstack([stops['geometry'].x.to_numpy(),
                        stops['geometry'].y.to_numpy()]).T*np.pi/180
src_points = np.vstack([buildings['geometry'].x.to_numpy(),
                        buildings['geometry'].y.to_numpy()]).T*np.pi/180
# Create tree from the candidate points
tree = BallTree(candidates, leaf_size=15, metric='haversine')
# use query_radius instead
ind_radius, dist_radius = tree.query_radius(src_points,
                                            r=radius_max/earth_radius,
                                            return_distance=True)
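One thing that may be worth double-checking: scikit-learn's documentation for the haversine metric describes the first coordinate as latitude and the second as longitude, both in radians, whereas the code above passes (x, y), i.e. (lon, lat). The sketch below is a plain-Python haversine (not part of the tutorial; `haversine_m` is a name made up here) applied to building 0 and the Muusantori stop from the question, once in each order. Under these assumptions the (lat, lon) order gives roughly 92 m, while the swapped order gives roughly 180 m, which is close to the 180.52 shown in the question's output.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6_371_000  # same mean Earth radius as used above

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

# (lat, lon) in degrees, copied from the tables in the question
building = (60.20727, 24.85584)   # building 0
stop = (60.20749, 24.85745)       # Muusantori

# Coordinates passed in the documented (lat, lon) order
d_latlon = haversine_m(building[0], building[1], stop[0], stop[1])

# Coordinates passed as (lon, lat), i.e. what (geom.x, geom.y) produces
d_swapped = haversine_m(building[1], building[0], stop[1], stop[0])

print(round(d_latlon, 1), round(d_swapped, 1))
```

If that matters for your data, stacking `.y` before `.x` when building `candidates` and `src_points` should be enough; note that the distances and the set of stops within 250 m would then differ from the outputs shown here.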
Now you can manipulate the results to get what you want:
# create a dataframe built with
# index based on row position of the building in buildings
# column row_stop is the row position of the stop
# dist is the distance in meters
closest_dist = pd.concat([pd.Series(ind_radius).explode().rename('row_stop'),
                          pd.Series(dist_radius).explode().rename('dist')*earth_radius],
                         axis=1)
print(closest_dist.head())
# row_stop dist
#0 1131 180.522
#1 NaN NaN
#2 64 174.744
#2 61 119.426
#3 532 106.763
# merge the dataframe created above with the original stops data
# to get names, ids, ... note: the index of stops must be reset
# because row_stop in closest_dist is position based
closest_stop = closest_dist.merge(stops.reset_index(drop=True),
                                  left_on='row_stop', right_index=True, how='left')
print(closest_stop.head())
# row_stop dist stop_name stop_lat stop_lon stop_id \
#0 1131 180.522 Muusantori 60.20749 24.85745 1304138.0
#1 NaN NaN NaN NaN NaN NaN
#2 64 174.744 Senaatintori 60.16896 24.94983 1020455.0
#2 61 119.426 Senaatintori 60.16901 24.95046 1020450.0
#3 532 106.763 Veturitie 60.20661 24.92968 1174112.0
#
# geometry
#0 POINT (24.85745 60.20749)
#1 None
#2 POINT (24.94983 60.16896)
#2 POINT (24.95046 60.16901)
#3 POINT (24.92968 60.20661)
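To see what the explode step does in isolation, here is a minimal sketch on made-up data. The `ind` array below is hypothetical and only mimics the ragged shape that query_radius returns (one array of stop positions per building, possibly empty). Buildings with no stop in range end up as a single NaN row, which is why building 1 shows NaN above.

```python
import numpy as np
import pandas as pd

# Hypothetical ragged result: one array of stop row positions per building
ind = np.empty(3, dtype=object)
ind[0] = np.array([5])              # building 0: one stop in range
ind[1] = np.array([], dtype=int)    # building 1: no stop in range
ind[2] = np.array([7, 9])           # building 2: two stops in range

# explode() turns each array element into its own row and repeats the index;
# an empty array becomes a single NaN row
long = pd.Series(ind).explode().rename('row_stop')
print(long)

# count() ignores NaN, so this gives the number of stops per building
stops_per_building = long.groupby(level=0).count()
print(stops_per_building)
```

Calling `dropna()` on the exploded result would keep only the buildings that have at least one stop within the radius.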
Finally, join back to buildings:
# join buildings (with its index reset) to closest_stop,
# because the index of closest_stop is position based
final_df = buildings.reset_index(drop=True).join(closest_stop, rsuffix='_stop')
print(final_df.head(10))
# name geometry row_stop dist stop_name \
# 0 None POINT (24.85584 60.20727) 1131 180.522 Muusantori
# 1 Uimastadion POINT (24.93045 60.18882) NaN NaN NaN
# 2 None POINT (24.95113 60.16994) 64 174.744 Senaatintori
# 2 None POINT (24.95113 60.16994) 61 119.426 Senaatintori
# 3 Hartwall Arena POINT (24.92918 60.20570) 532 106.763 Veturitie
# stop_lat stop_lon stop_id geometry_stop
# 0 60.20749 24.85745 1304138.0 POINT (24.85745 60.20749)
# 1 NaN NaN NaN None
# 2 60.16896 24.94983 1020455.0 POINT (24.94983 60.16896)
# 2 60.16901 24.95046 1020450.0 POINT (24.95046 60.16901)
# 3 60.20661 24.92968 1174112.0 POINT (24.92968 60.20661)
To get the closest bus stop within 250m:
filtered_by_distance = closest_stops[closest_stops.distance < 250]
# both frames have a 'geometry' column, so the join needs a suffix
result = buildings.join(filtered_by_distance, rsuffix='_stop')
For all stops within the radius you need to use BallTree.query_radius. With the haversine metric the radius is expressed in radians, so you will need to convert meters to radians (divide by the Earth's radius).
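A minimal sketch of that conversion, using a handful of coordinates copied from the tables in this thread rather than the full GeoDataFrames. The array names are made up for the example, and the points are given as (lat, lon), which is the order scikit-learn documents for the haversine metric. `sort_results=True` is optional and only orders each result by distance.

```python
import numpy as np
from sklearn.neighbors import BallTree

EARTH_RADIUS_M = 6_371_000   # mean Earth radius in meters
radius_m = 250               # search radius in meters

# (lat, lon) in degrees
stops_deg = np.array([
    [60.16901, 24.95046],    # Senaatintori (1020450)
    [60.16896, 24.94983],    # Senaatintori (1020455)
    [60.16946, 24.95667],    # Ritarihuone
    [60.20661, 24.92968],    # Veturitie
])
buildings_deg = np.array([
    [60.16994, 24.95113],    # building 2
    [60.20570, 24.92918],    # Hartwall Arena
])

tree = BallTree(np.radians(stops_deg), metric='haversine')

# meters -> radians: divide by the Earth's radius
ind, dist = tree.query_radius(np.radians(buildings_deg),
                              r=radius_m / EARTH_RADIUS_M,
                              return_distance=True,
                              sort_results=True)

# radians -> meters for the returned distances
dist_m = [d * EARTH_RADIUS_M for d in dist]

for i, (stop_rows, meters) in enumerate(zip(ind, dist_m)):
    print(i, stop_rows.tolist(), np.round(meters, 1).tolist())
```

Each entry of `ind` is an array of row positions into `stops_deg`, so its length can differ from building to building.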