Iterate over each row of one DataFrame against every value of a column in another
I'm trying to compare each row of df1 with every row of df2, create a new column in df1, and store the minimum of all the resulting values in it.
lat_sc = shopping_centers['lat']
long_sc = shopping_centers['lng']
short_dist = []
euclid_dist = []
for i, j in zip(lat_sc, long_sc):
    for lat_real, long_real in zip(real_estate['lat'], real_estate['lng']):
        euclid_dist.append(lat_real - i)
    short_dist.append(min(euclid_dist))
    euclid_dist = []
Expected result:
df1['shortest'] = min(df1['lat'] - each lat of df2)
df1['nearest sc'] = the corresponding sc_id
Edit: include sc_id in df1.
This could get computationally intensive as df2 gets big, but you can compute the distance from each df1 row to every df2 row and keep the smallest like this (it's possible to do this more efficiently):
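import numpy as np

# Reference coordinates from df2, read by the function below.
ref_lats = df2["lat"].values
ref_longs = df2["lng"].values

def find_euclid_dist(row):
    # Distance from this df1 row to every row of df2; keep the smallest.
    dist_arr = np.sqrt((ref_lats - row["lat"])**2 + (ref_longs - row["lng"])**2)
    return np.min(dist_arr)

df1["shortest"] = df1.apply(find_euclid_dist, axis=1)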
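The code above stores only the minimum distance, while the question also asks for the matching sc_id. Here is a minimal sketch of one way to get both, using NumPy broadcasting instead of a row-wise apply. It assumes df2 has an sc_id column (the name is taken from the question), and the sample data is made up purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df1 = pd.DataFrame({"lat": [0.0, 5.0], "lng": [0.0, 5.0]})
df2 = pd.DataFrame({
    "sc_id": ["A", "B", "C"],
    "lat": [3.0, 5.0, 0.0],
    "lng": [4.0, 6.0, 1.0],
})

# Pairwise differences, shape (len(df1), len(df2)).
dlat = df1["lat"].to_numpy()[:, None] - df2["lat"].to_numpy()[None, :]
dlng = df1["lng"].to_numpy()[:, None] - df2["lng"].to_numpy()[None, :]
dist = np.sqrt(dlat**2 + dlng**2)

# Position of the closest df2 row for each df1 row.
nearest = dist.argmin(axis=1)

df1["shortest"] = dist[np.arange(len(df1)), nearest]
df1["nearest sc"] = df2["sc_id"].to_numpy()[nearest]
print(df1)
```

This builds the full len(df1) by len(df2) distance matrix in memory, so it trades memory for speed and may not suit very large frames.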
How about using cdist from scipy?
from scipy.spatial.distance import cdist
df1['shortest'] = cdist(df1[['lat','lng']], df2[['lat','lng']], metric='euclidean').min(1)
print(df1)
returns:
         lat         lng          addr_street    shortest
0  -37.980523  -37.980523     37 Scarlet Drive  183.022436
1  -37.776161  -37.776161  999 Heidelberg Road  182.817951
2  -37.926238  -37.926238        47 New Street  182.968096
3  -37.800056  -37.800056  3/113 Normanby Road  182.841849
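The cdist call returns the full distance matrix, so the nearest shopping centre's id can be read off with argmin along the same axis. This is a sketch under assumptions: the sc_id column name comes from the question, and the sample frames are invented for illustration.

```python
import pandas as pd
from scipy.spatial.distance import cdist

# Hypothetical sample data; only the column names follow the question.
df1 = pd.DataFrame({"lat": [0.0, 10.0], "lng": [0.0, 10.0]})
df2 = pd.DataFrame({
    "sc_id": ["X", "Y"],
    "lat": [0.0, 10.0],
    "lng": [2.0, 7.0],
})

# Rows correspond to df1, columns to df2.
dist = cdist(df1[["lat", "lng"]].to_numpy(),
             df2[["lat", "lng"]].to_numpy(),
             metric="euclidean")

df1["shortest"] = dist.min(axis=1)
# argmin gives the column (df2 row position) of the smallest distance.
df1["nearest sc"] = df2["sc_id"].to_numpy()[dist.argmin(axis=1)]
print(df1)
```

Indexing through to_numpy() uses row positions rather than index labels, so it also works when df2 does not have a default RangeIndex.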