I'm trying to compare each row of df1 against every row of df2, then create a new column in df1 that stores the minimum of all the resulting values.
lat_sc = shopping_centers['lat']
long_sc = shopping_centers['lng']
short_dist = []
for i, j in zip(lat_sc, long_sc):
    euclid_dist = []  # reset for each shopping center
    for lat_real, long_real in zip(real_estate['lat'], real_estate['lng']):
        euclid_dist.append(lat_real - i)
    short_dist.append(min(euclid_dist))
Desired result:
df1['shortest'] = min(df1['lat'] - each lat of df2)
df1['nearest sc'] = the sc_id of the df2 row that gives that minimum
Edit to include sc_id in df1
This could get computationally intensive as df2 gets big, but you can compute the distance from each df1 row to all the df2 rows and keep the smallest one like this (it's possible to do this more efficiently):
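import numpy as np

# Coordinates of every df2 row, extracted once as NumPy arrays
ref_lats = df2["lat"].values
ref_longs = df2["lng"].values

def find_euclid_dist(row):
    # Distance from this df1 row to every df2 row; keep the smallest
    dist_arr = np.sqrt((ref_lats - row["lat"])**2 + (ref_longs - row["lng"])**2)
    return np.min(dist_arr)

df1["shortest"] = df1.apply(find_euclid_dist, axis=1)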
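The question also asks for the sc_id of the nearest row, which the minimum alone does not give. A minimal sketch of one way to get both, using NumPy broadcasting and argmin instead of apply, might look like the following. The sample data and the helper name add_nearest are hypothetical; the column names lat, lng and sc_id are taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df1 = pd.DataFrame({"lat": [0.0, 5.0, 9.0], "lng": [0.0, 5.0, 9.0]})
df2 = pd.DataFrame({"sc_id": ["SC_A", "SC_B"], "lat": [1.0, 8.0], "lng": [0.0, 9.0]})

def add_nearest(df1, df2):
    """Return a copy of df1 with the shortest distance and the nearest sc_id."""
    a = df1[["lat", "lng"]].to_numpy()[:, None, :]   # shape (len(df1), 1, 2)
    b = df2[["lat", "lng"]].to_numpy()[None, :, :]   # shape (1, len(df2), 2)
    dist = np.sqrt(((a - b) ** 2).sum(axis=2))       # all pairwise distances
    nearest = dist.argmin(axis=1)                    # position of closest df2 row
    out = df1.copy()
    out["shortest"] = dist[np.arange(len(df1)), nearest]
    out["nearest sc"] = df2["sc_id"].to_numpy()[nearest]
    return out

result = add_nearest(df1, df2)
print(result)
```

This builds the full len(df1) by len(df2) distance matrix in memory, so it trades memory for speed and may not suit very large inputs.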
How about using cdist from scipy?
from scipy.spatial.distance import cdist
df1['shortest'] = cdist(df1[['lat','lng']], df2[['lat','lng']], metric='euclidean').min(1)
print(df1)
returns:
          lat         lng          addr_street    shortest
0  -37.980523  -37.980523     37 Scarlet Drive  183.022436
1  -37.776161  -37.776161  999 Heidelberg Road  182.817951
2  -37.926238  -37.926238        47 New Street  182.968096
3  -37.800056  -37.800056  3/113 Normanby Road  182.841849
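cdist computes every pairwise distance, which can become expensive when both frames are large. As a possible alternative, here is a sketch using scipy's cKDTree, which returns the nearest neighbour's distance and position directly, so the sc_id can be looked up as well. The sample data is hypothetical, and the sc_id column is assumed from the question.

```python
import pandas as pd
from scipy.spatial import cKDTree

# Hypothetical sample data; only the column names come from the question.
df1 = pd.DataFrame({"lat": [0.0, 5.0, 9.0], "lng": [0.0, 5.0, 9.0]})
df2 = pd.DataFrame({"sc_id": ["SC_A", "SC_B"], "lat": [1.0, 8.0], "lng": [0.0, 9.0]})

# Build the tree once on the df2 coordinates.
tree = cKDTree(df2[["lat", "lng"]].to_numpy())

# For every df1 row, query the single nearest df2 point (Euclidean by default).
dist, idx = tree.query(df1[["lat", "lng"]].to_numpy(), k=1)

df1["shortest"] = dist
df1["nearest sc"] = df2["sc_id"].to_numpy()[idx]
print(df1)
```

Like the cdist version, this treats lat and lng as plain planar coordinates.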