

Nearest neighbor join with distance condition

In this question I'm referring to this project:

https://automating-gis-processes.github.io/site/master/notebooks/L3/nearest-neighbor-faster.html

We have two GeoDataFrames, buildings:


             name                   geometry
0            None  POINT (24.85584 60.20727)
1     Uimastadion  POINT (24.93045 60.18882)
2            None  POINT (24.95113 60.16994)
3  Hartwall Arena  POINT (24.92918 60.20570)

and bus stops:

     stop_name   stop_lat   stop_lon  stop_id                   geometry
0  Ritarihuone  60.169460  24.956670  1010102  POINT (24.95667 60.16946)
1   Kirkkokatu  60.171270  24.956570  1010103  POINT (24.95657 60.17127)
2   Kirkkokatu  60.170293  24.956721  1010104  POINT (24.95672 60.17029)
3    Vironkatu  60.172580  24.956554  1010105  POINT (24.95655 60.17258)

After applying the nearest-neighbor code from the tutorial, which uses BallTree from sklearn.neighbors:

from sklearn.neighbors import BallTree
import numpy as np

def get_nearest(src_points, candidates, k_neighbors=1):
    """Find nearest neighbors for all source points from a set of candidate points"""

    # Create tree from the candidate points
    tree = BallTree(candidates, leaf_size=15, metric='haversine')

    # Find closest points and distances
    distances, indices = tree.query(src_points, k=k_neighbors)

    # Transpose to get distances and indices into arrays
    distances = distances.transpose()
    indices = indices.transpose()

    # Get closest indices and distances (i.e. array at index 0)
    # note: for the second closest points, you would take index 1, etc.
    closest = indices[0]
    closest_dist = distances[0]

    # Return indices and distances
    return (closest, closest_dist)

def nearest_neighbor(left_gdf, right_gdf, return_dist=False):
    """
    For each point in left_gdf, find closest point in right GeoDataFrame and return them.

    NOTICE: Assumes that the input Points are in WGS84 projection (lat/lon).
    """

    left_geom_col = left_gdf.geometry.name
    right_geom_col = right_gdf.geometry.name

    # Ensure that index in right gdf is formed of sequential numbers
    right = right_gdf.copy().reset_index(drop=True)

    # Parse coordinates from points and insert them into a numpy array as RADIANS
    left_radians = np.array(left_gdf[left_geom_col].apply(lambda geom: (geom.x * np.pi / 180, geom.y * np.pi / 180)).to_list())
    right_radians = np.array(right[right_geom_col].apply(lambda geom: (geom.x * np.pi / 180, geom.y * np.pi / 180)).to_list())

    # Find the nearest points
    # -----------------------
    # closest ==> index in right_gdf that corresponds to the closest point
    # dist ==> distance between the nearest neighbors (in radians; converted to meters below)

    closest, dist = get_nearest(src_points=left_radians, candidates=right_radians)

    # Return points from right GeoDataFrame that are closest to points in left GeoDataFrame
    closest_points = right.loc[closest]

    # Ensure that the index corresponds the one in left_gdf
    closest_points = closest_points.reset_index(drop=True)

    # Add distance if requested
    if return_dist:
        # Convert to meters from radians
        earth_radius = 6371000  # meters
        closest_points['distance'] = dist * earth_radius

    return closest_points

closest_stops = nearest_neighbor(buildings, stops, return_dist=True)

For each building index, we get the distance to the closest bus stop:

    stop_name    stop_lat   stop_lon    stop_id                 geometry      distance
0   Muusantori   60.207490  24.857450   1304138 POINT (24.85745 60.20749)   180.521584
1   Eläintarha   60.192490  24.930840   1171120 POINT (24.93084 60.19249)   372.665221
2   Senaatintori 60.169010  24.950460   1020450 POINT (24.95046 60.16901)   119.425777
3   Veturitie    60.206610  24.929680   1174112 POINT (24.92968 60.20661)   106.762619
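
The distance column is a haversine angle in radians multiplied by the Earth's radius. As a minimal sketch of that conversion using only the standard library (the helper name haversine_m is hypothetical, not part of the tutorial), the function below computes it for building 2 and stop 1020450. scikit-learn documents its haversine metric as taking (latitude, longitude) pairs in radians, while the code above passes (geom.x, geom.y), which is (longitude, latitude), so the sketch evaluates both orders to show how much the choice matters:

```python
import math

EARTH_RADIUS_M = 6371000  # same mean Earth radius as in the question


def haversine_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg):
    """Great-circle distance in meters between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(
        math.radians, (lat1_deg, lon1_deg, lat2_deg, lon2_deg)
    )
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    central_angle = 2 * math.asin(math.sqrt(a))  # angle in radians
    return central_angle * EARTH_RADIUS_M        # radians -> meters


# (lat, lon) of building 2 and of stop 1020450, copied from the tables above
building = (60.16994, 24.95113)
stop = (60.16901, 24.95046)

d_latlon = haversine_m(building[0], building[1], stop[0], stop[1])
# Same points with latitude and longitude swapped, i.e. (x, y) order
d_swapped = haversine_m(building[1], building[0], stop[1], stop[0])

print(round(d_latlon, 1), round(d_swapped, 1))
```

With these rounded coordinates, the swapped call should land close to the 119.43 m shown in the table, and the (lat, lon) call roughly 10 m below it, so it is worth checking the coordinate order before applying a hard 250 m cutoff.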

I'm looking for a solution that returns every bus stop (there can be more than one) within 250 meters of each building.

Thank you for your help.

Here is a way to reuse what was done with BallTree in the question, but with query_radius instead of query. It is not written as a function, but you can easily change that.

from sklearn.neighbors import BallTree
import numpy as np
import pandas as pd
## here I start with buildings and stops as loaded in the link provided

# variable in meter you can change
radius_max = 250 # meters
# another parameter, in case you want to do with Mars radius ^^
earth_radius = 6371000  # meters

# similar to the method with apply in the tutorial 
# to create left_radians and right_radians, but faster
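candidates = np.vstack([stops['geometry'].x.to_numpy(),
                        stops['geometry'].y.to_numpy()]).T * np.pi / 180
src_points = np.vstack([buildings['geometry'].x.to_numpy(),
                        buildings['geometry'].y.to_numpy()]).T * np.pi / 180

# Create tree from the candidate points
tree = BallTree(candidates, leaf_size=15, metric='haversine')
# use query_radius instead; the radius must be expressed in radians
ind_radius, dist_radius = tree.query_radius(src_points,
                                            r=radius_max / earth_radius,
                                            return_distance=True)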

Now you can manipulate the results to get what you want:

# create a dataframe build with
# index based on row position of the building in buildings
# column row_stop is the row position of the stop
# dist is the distance
closest_dist = pd.concat([pd.Series(ind_radius).explode().rename('row_stop'),
                          pd.Series(dist_radius).explode().rename('dist') * earth_radius],
                         axis=1)
print (closest_dist.head())
#  row_stop     dist
#0     1131  180.522
#1      NaN      NaN
#2       64  174.744
#2       61  119.426
#3      532  106.763

# merge the dataframe created above with the original data stops
# to get names, id, ... note: the index must be reset as in closest_dist
# it is position based
closest_stop = closest_dist.merge(stops.reset_index(drop=True), 
                                  left_on='row_stop', right_index=True, how='left')
print (closest_stop.head())
#  row_stop     dist     stop_name  stop_lat  stop_lon    stop_id  \
#0     1131  180.522    Muusantori  60.20749  24.85745  1304138.0   
#1      NaN      NaN           NaN       NaN       NaN        NaN   
#2       64  174.744  Senaatintori  60.16896  24.94983  1020455.0   
#2       61  119.426  Senaatintori  60.16901  24.95046  1020450.0   
#3      532  106.763     Veturitie  60.20661  24.92968  1174112.0   
#                    geometry  
#0  POINT (24.85745 60.20749)  
#1                       None  
#2  POINT (24.94983 60.16896)  
#2  POINT (24.95046 60.16901)  
#3  POINT (24.92968 60.20661) 
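
The explode step used to build closest_dist is what turns one array of matches per building into one row per (building, stop) pair. A minimal sketch with made-up arrays (the values are illustrative only, shaped like what query_radius returns):

```python
import numpy as np
import pandas as pd

# One array of stop positions per building: building 0 has one stop in range,
# building 1 has none, building 2 has two.
ind_radius = np.empty(3, dtype=object)
ind_radius[0] = np.array([1131])
ind_radius[1] = np.array([], dtype=int)
ind_radius[2] = np.array([64, 61])

# explode repeats the building's index once per matched stop
# and keeps an empty array as a single NaN row
exploded = pd.Series(ind_radius).explode().rename('row_stop')
print(exploded)
```

The repeated index (2 appears twice) is what lets the later merge and join line each stop up with its building, and the NaN row keeps buildings without any stop in range visible in the result.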

Finally, join back to buildings:

# join buildings with reset_index with 
# closest_stop as index in closest_stop are position based
final_df = buildings.reset_index(drop=True).join(closest_stop, rsuffix='_stop')
print (final_df.head(10))
#              name                   geometry row_stop     dist     stop_name  \
# 0            None  POINT (24.85584 60.20727)     1131  180.522    Muusantori   
# 1     Uimastadion  POINT (24.93045 60.18882)      NaN      NaN           NaN   
# 2            None  POINT (24.95113 60.16994)       64  174.744  Senaatintori   
# 2            None  POINT (24.95113 60.16994)       61  119.426  Senaatintori   
# 3  Hartwall Arena  POINT (24.92918 60.20570)      532  106.763     Veturitie   

#    stop_lat  stop_lon    stop_id              geometry_stop  
# 0  60.20749  24.85745  1304138.0  POINT (24.85745 60.20749)  
# 1       NaN       NaN        NaN                       None  
# 2  60.16896  24.94983  1020455.0  POINT (24.94983 60.16896)  
# 2  60.16901  24.95046  1020450.0  POINT (24.95046 60.16901)  
# 3  60.20661  24.92968  1174112.0  POINT (24.92968 60.20661)  

To get the closest bus stop within 250 m:

filtered_by_distance = closest_stops[closest_stops.distance < 250]
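# both frames have a 'geometry' column, so a suffix is needed to avoid a name clash
result = buildings.join(filtered_by_distance, rsuffix='_stop')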
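
To make the effect of this filter-then-join visible, here is a small sketch that uses plain pandas DataFrames as stand-ins for the GeoDataFrames (an assumption made to keep it self-contained; the names and distances are copied from the question's tables):

```python
import pandas as pd

# Stand-ins for the GeoDataFrames, without geometry columns
buildings = pd.DataFrame({'name': [None, 'Uimastadion', None, 'Hartwall Arena']})
closest_stops = pd.DataFrame({
    'stop_name': ['Muusantori', 'Eläintarha', 'Senaatintori', 'Veturitie'],
    'distance': [180.521584, 372.665221, 119.425777, 106.762619],
})

# Keep only nearest stops that are closer than 250 m
filtered_by_distance = closest_stops[closest_stops.distance < 250]

# join aligns on the index, so buildings whose nearest stop was
# filtered out stay in the result with NaN in the stop columns
result = buildings.join(filtered_by_distance)
print(result)
```

Each building keeps at most one stop this way, which is why this only covers the "closest stop within 250 m" case.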

For all stops within the radius, you need to use BallTree.query_radius. Note that you will need to convert the radius from meters to radians.
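
A minimal sketch of that conversion, assuming plain NumPy arrays of (lat, lon) pairs in degrees rather than GeoDataFrames; the function name stops_within_radius and the five sample stops are illustrative choices, with coordinates copied from the tables in this thread. Because the haversine metric works on the unit sphere, a radius in meters becomes radians by dividing by the Earth's radius:

```python
import numpy as np
from sklearn.neighbors import BallTree

EARTH_RADIUS_M = 6371000  # same mean Earth radius as in the question


def stops_within_radius(buildings_latlon_deg, stops_latlon_deg, radius_m=250):
    """For each building, return positions of stops within radius_m
    and the matching distances in meters."""
    # scikit-learn's haversine metric expects [lat, lon] in radians
    buildings_rad = np.radians(buildings_latlon_deg)
    stops_rad = np.radians(stops_latlon_deg)
    tree = BallTree(stops_rad, metric='haversine')
    # meters -> radians on the unit sphere
    radius_rad = radius_m / EARTH_RADIUS_M
    ind, dist = tree.query_radius(buildings_rad, r=radius_rad,
                                  return_distance=True)
    # radians -> meters, one array per building
    return ind, [d * EARTH_RADIUS_M for d in dist]


# (lat, lon) in degrees
buildings_latlon = np.array([[60.20727, 24.85584],   # building 0
                             [60.18882, 24.93045],   # Uimastadion
                             [60.16994, 24.95113]])  # building 2
stops_latlon = np.array([[60.20749, 24.85745],   # Muusantori
                         [60.19249, 24.93084],   # Eläintarha
                         [60.16901, 24.95046],   # Senaatintori (1020450)
                         [60.16896, 24.94983],   # Senaatintori (1020455)
                         [60.16946, 24.95667]])  # Ritarihuone

ind, dist_m = stops_within_radius(buildings_latlon, stops_latlon)
for building_pos, (stop_pos, d) in enumerate(zip(ind, dist_m)):
    print(building_pos, stop_pos.tolist(), np.round(d, 1).tolist())
```

The positions in ind refer to rows of the stops array, so they can be mapped back to stop names or ids by position. Because this sketch feeds (lat, lon) to the metric, its distances will not match the ones printed earlier in the thread, which were computed from (x, y) pairs.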

