
How to measure pairwise distances between two sets of points?

I have two datasets (CSV files). Each contains the latitudes and longitudes of a set of points (220 in one, 4400 in the other). I want to compute the pairwise distances in miles between the two sets (a 220 x 4400 matrix). How can I do that in Python? It is similar to this problem: https://gist.github.com/rochacbruno/2883505

An example of one of the datasets

The simplest option is scikit-learn (sklearn), which provides exactly this.

Say we have some sample data:

import pandas as pd

towns = pd.DataFrame({
    "name" : ["Merry Hill", "Spring Valley", "Nesconset"],
    "lat" : [36.01, 41.32, 40.84],
    "long" : [-76.7, -89.20, -73.15]
})

museum = pd.DataFrame({
    "name" : ["Motte Historical Car Museum, Menifee", "Crocker Art Museum, Sacramento", "World Chess Hall Of Fame, St.Louis", "National Atomic Testing Museum, Las", "National Air and Space Museum, Washington", "The Metropolitan Museum of Art", "Museum of the American Military Family & Learning Center"],
    "lat" : [33.743511, 38.576942, 38.644302, 36.114269, 38.887806, 40.778965, 35.083359],
    "long" : [-117.165161, -121.504997, -90.261154, -115.148315, -77.019844, -73.962311, -106.381531]
})

You can use sklearn's distance metrics, which include a haversine implementation:

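# Newer scikit-learn releases (1.0 and later) expose this class as
# sklearn.metrics.DistanceMetric; the sklearn.neighbors path may be unavailable there.
from sklearn.neighbors import DistanceMetric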

dist = DistanceMetric.get_metric('haversine')

After you extract the numpy array values with

places_gps = towns[["lat", "long"]].values
museum_gps = museum[["lat", "long"]].values

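you simply compute

import numpy as np

EARTH_RADIUS = 6371.009  # mean Earth radius in km

# The haversine metric expects [lat, long] pairs in radians and returns
# distances on the unit sphere, so scale by the Earth's radius.
haversine_distances = dist.pairwise(np.radians(places_gps), np.radians(museum_gps))
haversine_distances *= EARTH_RADIUS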

to get the distances in kilometers. If you need miles, multiply by 0.621371 (or use an Earth radius of about 3958.8 miles instead).
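As a cross-check that does not depend on sklearn, the haversine formula can also be written directly with NumPy broadcasting. This is only a minimal sketch: the function name `haversine_miles` and the constant `EARTH_RADIUS_MILES` are made up for illustration, and the radius is an approximate mean value.

```python
import numpy as np

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)

def haversine_miles(a_deg, b_deg):
    """Pairwise great-circle distances in miles.

    a_deg: array-like of shape (n, 2) holding [lat, long] in degrees
    b_deg: array-like of shape (m, 2) holding [lat, long] in degrees
    Returns an array of shape (n, m).
    """
    a = np.radians(np.asarray(a_deg, dtype=float))
    b = np.radians(np.asarray(b_deg, dtype=float))
    # Column vectors for the first set, row vectors for the second,
    # so that broadcasting produces the full (n, m) grid.
    lat1, lon1 = a[:, 0][:, None], a[:, 1][:, None]
    lat2, lon2 = b[:, 0][None, :], b[:, 1][None, :]
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    # Clip guards against tiny floating-point overshoot before the sqrt/arcsin.
    return 2 * EARTH_RADIUS_MILES * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))

towns_gps = np.array([[36.01, -76.7], [41.32, -89.20], [40.84, -73.15]])
museum_gps = np.array([[38.887806, -77.019844], [40.778965, -73.962311]])

miles = haversine_miles(towns_gps, museum_gps)
print(miles.shape)
print(miles.round(1))
```

For the sizes in the question (220 x 4400), the intermediate arrays hold under a million floats each, so memory should not be a concern.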

If you are only interested in the closest few points, or in all points within a given radius, look at sklearn's BallTree, which also supports the haversine metric. It is much faster for such queries than computing the full distance matrix.
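A rough sketch of the BallTree approach, reusing the sample coordinates from above. The variable names and the 500 km radius are arbitrary choices for illustration; as with the pairwise metric, inputs are [lat, long] in radians and results come back in radians.

```python
import numpy as np
from sklearn.neighbors import BallTree

EARTH_RADIUS_KM = 6371.009

# [lat, long] in degrees, in the same order as the sample data frames.
towns_gps = np.array([[36.01, -76.7], [41.32, -89.20], [40.84, -73.15]])
museum_gps = np.array([
    [33.743511, -117.165161],
    [38.576942, -121.504997],
    [38.644302, -90.261154],
    [36.114269, -115.148315],
    [38.887806, -77.019844],
    [40.778965, -73.962311],
    [35.083359, -106.381531],
])

# Build the tree on the set you want to search in (here, the museums).
tree = BallTree(np.radians(museum_gps), metric="haversine")

# Two nearest museums for each town; distances are in radians.
dist_rad, idx = tree.query(np.radians(towns_gps), k=2)
dist_km = dist_rad * EARTH_RADIUS_KM

# All museums within 500 km of each town; the radius must also be in radians.
within = tree.query_radius(np.radians(towns_gps), r=500 / EARTH_RADIUS_KM)

print(idx)
print(dist_km.round(1))
print(within)
```

`idx` holds row positions into `museum_gps`, so they can be mapped back to names with something like `museum.name.iloc[idx[:, 0]]`.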


Edit: To convert the output to a DataFrame, use for instance

pd_distances = pd.DataFrame(haversine_distances, columns=museum.name, index=towns.name)
pd_distances
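Putting the pieces together for the CSV case in the question might look like the sketch below. The column names (`name`, `lat`, `long`) are assumptions about the files, and `io.StringIO` stands in for the real file paths. It uses `sklearn.metrics.pairwise.haversine_distances`, a function-style alternative to `DistanceMetric` that likewise expects [lat, long] in radians.

```python
import io

import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import haversine_distances

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)

# Stand-ins for the two CSV files; with real files, pass the path to
# pd.read_csv instead of a StringIO object.
towns_csv = io.StringIO(
    "name,lat,long\n"
    "Merry Hill,36.01,-76.7\n"
    "Spring Valley,41.32,-89.20\n"
    "Nesconset,40.84,-73.15\n"
)
museums_csv = io.StringIO(
    "name,lat,long\n"
    '"Crocker Art Museum, Sacramento",38.576942,-121.504997\n'
    '"National Air and Space Museum, Washington",38.887806,-77.019844\n'
    "The Metropolitan Museum of Art,40.778965,-73.962311\n"
)

towns = pd.read_csv(towns_csv)
museums = pd.read_csv(museums_csv)

# Unit-sphere distances (radians) scaled to miles, labelled by name.
miles = pd.DataFrame(
    haversine_distances(
        np.radians(towns[["lat", "long"]].to_numpy()),
        np.radians(museums[["lat", "long"]].to_numpy()),
    ) * EARTH_RADIUS_MILES,
    index=towns["name"],
    columns=museums["name"],
)

# Name of the closest museum for each town.
nearest = miles.idxmin(axis=1)

print(miles.round(1))
print(nearest)
```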
