I have two DataFrames, as shown below.
import math
import pandas as pd

df1 = {'Aberdeen Tunnel':[22.2620666,114.1779123]
, 'Lion Rock Tunnel':[22.35134,114.1753917]
, 'Shing Mun Tunnels':[22.3773149,114.1513125]
, 'Tseung Kwan O Tunnel':[22.3191321,114.2440963]
, 'Tsing Sha Highway':[22.343242,114.141755]
, 'Cross Harbour Tunnel':[22.2922422,114.1796539]
, 'Eastern Harbour Crossing':[22.2951813,114.220724]
, 'Western Harbour Crossing':[22.2973088,114.1508622]
, 'Tate\'s Cairn Tunnel':[22.3588556,114.2079283]
, 'Tai Lam Tunnel':[22.3917362,114.0598441]}
df1 = pd.DataFrame(data = df1, index = ['lat','lon'])
df1 = pd.DataFrame.transpose(df1)
print(df1)
# a list keeps the points in order (pandas rejects a set as DataFrame data)
df2 = [(22.250559,114.170959),(22.281769,114.180153),(22.336325,114.178978)]
df2 = pd.DataFrame(data = df2, columns = ['lat','lon'])
print(df2)
I want to write a for loop that finds, for each coordinate pair in df2, which tunnel in df1 is the nearest.
I have tried the below to first calculate the respective distance, but it doesn't seem to produce the right output.
for i in df1:
    for j in df1:
        for h in df2:
            for k in df2:
                dist = math.hypot(i-h , j-k)
                print (dist)
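A likely reason the loop above misbehaves: iterating over a DataFrame directly yields its column labels (here 'lat' and 'lon'), not the values in each row. The following is a minimal sketch on a made-up two-row frame (the name df and its rounded values are only for illustration) that contrasts plain iteration with itertuples():

```python
import pandas as pd

# Small illustrative frame; the values are rounded and only for demonstration
df = pd.DataFrame(
    {"lat": [22.26, 22.35], "lon": [114.18, 114.17]},
    index=["Aberdeen Tunnel", "Lion Rock Tunnel"],
)

# Iterating a DataFrame directly yields its column labels, not row values
labels = [col for col in df]

# itertuples() yields one row at a time, with the index label as row.Index
rows = [(row.Index, row.lat, row.lon) for row in df.itertuples()]

print(labels)
print(rows)
```

So in the nested loops, i, j, h and k would all be strings such as 'lat', which is not what math.hypot expects.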
If your two DataFrames are not too large, you can use a cross join (how="cross" requires pandas 1.2 or later):
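import pandas as pd
# assumption: distance.distance below comes from geopy (the "km" in the output suggests it)
from geopy import distance

dat1 = {'Aberdeen Tunnel':[22.2620666,114.1779123]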
, 'Lion Rock Tunnel':[22.35134,114.1753917]
, 'Shing Mun Tunnels':[22.3773149,114.1513125]
, 'Tseung Kwan O Tunnel':[22.3191321,114.2440963]
, 'Tsing Sha Highway':[22.343242,114.141755]
, 'Cross Harbour Tunnel':[22.2922422,114.1796539]
, 'Eastern Harbour Crossing':[22.2951813,114.220724]
, 'Western Harbour Crossing':[22.2973088,114.1508622]
, 'Tate\'s Cairn Tunnel':[22.3588556,114.2079283]
, 'Tai Lam Tunnel':[22.3917362,114.0598441]}
df1 = pd.DataFrame(data = dat1, index = ['lat','lon'])
df1 = pd.DataFrame.transpose(df1)
df1['coord'] = list(zip(df1.lat, df1.lon))
df1 = df1[["coord"]].copy()  # copy so the column assignment below does not hit a view
# use tunnel as a column
df1['tunnel'] = df1.index
dat2 = [(22.250559,114.170959),
        (22.281769,114.180153),
        (22.336325,114.178978)]
df2 = pd.DataFrame({"coord": dat2})
# there must be some sort of identification for each coordinate
df2["id"] = ["a", "b", "c"]
df2
# Cross join between both data frames
df_merge = df1.merge(df2, how="cross")
# calculate the distance between each pair of coordinates
df_merge["distance"] = df_merge.apply(lambda row: distance.distance(row["coord_x"], row["coord_y"]), axis=1)
# find the minimum distance for each point
minimums = df_merge.groupby("id").distance.transform("min")
# return the tunnel - id pair with the minimum distance for each id
df_merge.loc[minimums == df_merge["distance"], ["id", "tunnel", "distance"]]
#    id                tunnel                distance
# 0   a       Aberdeen Tunnel  1.4620087109472515 km
# 5   c      Lion Rock Tunnel  1.7032322759652727 km
# 16  b  Cross Harbour Tunnel  1.1608810339311775 km
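If the frames do get large, building the cross-joined frame and calling a Python function per row may become slow. One possible alternative, sketched here under assumptions, is to compute all pairwise distances at once with NumPy broadcasting. Note that this swaps the distance function used above for a haversine (great-circle) formula on a sphere of radius 6371 km, so the values can differ slightly from the ones shown above. The names tunnels, points, haversine_km and nearest are made up for this sketch.

```python
import numpy as np
import pandas as pd

# Same coordinates as in the question, kept as separate lat/lon columns
tunnels = pd.DataFrame(
    {"lat": [22.2620666, 22.35134, 22.3773149, 22.3191321, 22.343242,
             22.2922422, 22.2951813, 22.2973088, 22.3588556, 22.3917362],
     "lon": [114.1779123, 114.1753917, 114.1513125, 114.2440963, 114.141755,
             114.1796539, 114.220724, 114.1508622, 114.2079283, 114.0598441]},
    index=["Aberdeen Tunnel", "Lion Rock Tunnel", "Shing Mun Tunnels",
           "Tseung Kwan O Tunnel", "Tsing Sha Highway", "Cross Harbour Tunnel",
           "Eastern Harbour Crossing", "Western Harbour Crossing",
           "Tate's Cairn Tunnel", "Tai Lam Tunnel"],
)
points = pd.DataFrame(
    {"lat": [22.250559, 22.281769, 22.336325],
     "lon": [114.170959, 114.180153, 114.178978]},
    index=["a", "b", "c"],
)

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km; inputs in degrees, broadcastable arrays."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(h))

# Shape (n_points, n_tunnels): each row holds one point's distance to every tunnel
dist = haversine_km(
    points["lat"].to_numpy()[:, None], points["lon"].to_numpy()[:, None],
    tunnels["lat"].to_numpy()[None, :], tunnels["lon"].to_numpy()[None, :],
)

# For each point, pick the tunnel with the smallest distance
nearest = pd.DataFrame(
    {"tunnel": tunnels.index.to_numpy()[dist.argmin(axis=1)],
     "distance_km": dist.min(axis=1)},
    index=points.index,
)
print(nearest)
```

The distance matrix still has one entry per point-tunnel pair, so memory grows with the product of the two frame sizes; for very large inputs a spatial index would be a better fit.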