Pandas Dataframe: Find the column with the closest coordinate point to another column's coordinate point
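I am working with tracking data for a soccer ball and soccer players. For each row of coordinate points, I am trying to find the player closest to the ball and create a new column that names that player.
Sample data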
| ball_point | home_player1_point | home_player2_point | away_player1_point |
| --- | --- | --- | --- |
| (7.00,3.00) | (-15.37,8.22) | (25.3,-.2) | (12.0,12.9) |
Desired output
| ball_point | home_player1_point | home_player2_point | away_player1_point | closest |
| --- | --- | --- | --- | --- |
| (7.00,3.00) | (-15.37,8.22) | (25.3,-.2) | (7.1,3.2) | away_player1 |
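Here is a link to my working notebook: https://github.com/piercepatrick/Articles_EDA/blob/main/nashSCProject.ipynb
The work related to this question is at the bottom, although it is messy right now. I also used this question to help me: Find nearest point in Pandas DataFrames
Any help is appreciated, I need to finish this by tonight!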
I'm assuming your dataframe has more rows. First you need to define some functions: one for the distance between two points (I'll use Euclidean distance) and one for the distances between the points in two pandas.Series (or dataframe columns):
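    import pandas as pd

    def euc_dist(x, y):
        # Euclidean distance between two (x, y) tuples
        return ((x[0] - y[0])**2 + (x[1] - y[1])**2) ** 0.5

    def dist(s1, s2):
        # Row-wise distances; zip avoids label lookups on a non-default index
        distances = [euc_dist(a, b) for a, b in zip(s1, s2)]
        return pd.Series(distances, index=s1.index)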
The return value of dist must be a pandas.Series because it has to become a new column (I'm assuming your dataframe is called df):
    distances_df = df.iloc[:, 1:].apply(dist, args=(df["ball_point"],))
    df["closest"] = distances_df.idxmin(axis=1).apply(lambda x: str(x)[:-6])
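The function dist is applied starting from the second column, which is why I use df.iloc[:,1:]. Every one of those columns is compared against the "ball_point" column, which is why that column is passed through the args parameter, and args must be a tuple.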
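To make the args mechanics concrete, here is a minimal sketch with made-up data. The columns a and b, the offset Series, and the add helper are all hypothetical and do not come from the question. DataFrame.apply passes each column as the first argument and appends the entries of args after it:

```python
import pandas as pd

# Hypothetical toy frame, only to illustrate how apply forwards args
df = pd.DataFrame({"a": [1, 2], "b": [10, 20]})
offset = pd.Series([100, 200])

def add(col, other):
    # col is one column of df (a Series); other is the extra argument from args
    return col + other

# args must be a tuple, hence the trailing comma for a single extra argument
out = df.apply(add, args=(offset,))
print(out)
```

Because add returns a Series for every column, the result is again a DataFrame with the same column names, which is the shape distances_df relies on.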
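Then you can use DataFrame.idxmin to find the column with the smallest distance. The lambda function only strips the "_point" suffix, so that the result reads "away_player1" as in the example rather than "away_player1_point".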
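As a small illustration of that last step, the sketch below runs idxmin on a distance table. The first row reuses the distances from this answer's example; the second row is invented so that a different player wins. Stripping the suffix with a regular expression instead of slicing is an alternative chosen here, not something the original code does:

```python
import pandas as pd

# First row: distances from the example; second row: made-up values
distances_df = pd.DataFrame({
    "home_player1_point": [22.970966, 3.0],
    "home_player2_point": [18.577675, 9.5],
    "away_player1_point": [11.090987, 4.2],
})

# For each row, the label of the column holding the smallest distance
closest_col = distances_df.idxmin(axis=1)

# Drop the trailing "_point" without relying on its length
closest = closest_col.str.replace("_point$", "", regex=True)
print(closest.tolist())
```

The regular expression only matches "_point" at the end of the label, so a column name that does not carry the suffix is left unchanged, whereas slicing off six characters would truncate it.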
Printing distances_df and df gives:
    # distances_df
       home_player1_point  home_player2_point  away_player1_point
    0           22.970966           18.577675           11.090987

    # df
      ball_point home_player1_point home_player2_point away_player1_point       closest
    0     (7, 3)     (-15.37, 8.22)       (25.3, -0.2)       (12.0, 12.9)  away_player1
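If the frame has many rows, a vectorized variant may be faster than a Python-level loop over points. The following is only a sketch, assuming that every cell holds an (x, y) tuple and that every column other than ball_point is a player column. It replaces the apply call with NumPy broadcasting and is not part of the original answer:

```python
import numpy as np
import pandas as pd

# The single sample row from the question
df = pd.DataFrame({
    "ball_point": [(7.0, 3.0)],
    "home_player1_point": [(-15.37, 8.22)],
    "home_player2_point": [(25.3, -0.2)],
    "away_player1_point": [(12.0, 12.9)],
})

# Assumption: every column except ball_point holds a player position
player_cols = [c for c in df.columns if c != "ball_point"]

# ball has shape (n_rows, 2); players has shape (n_rows, n_players, 2)
ball = np.array(df["ball_point"].tolist(), dtype=float)
players = np.stack(
    [np.array(df[c].tolist(), dtype=float) for c in player_cols], axis=1
)

# Euclidean distance of every player to the ball, shape (n_rows, n_players)
d = np.linalg.norm(players - ball[:, None, :], axis=2)

# Index of the nearest player per row, mapped back to a column name
nearest = d.argmin(axis=1)
suffix = "_point"
df["closest"] = [
    c[: -len(suffix)] if c.endswith(suffix) else c
    for c in (player_cols[i] for i in nearest)
]
print(df["closest"].tolist())
```

The distance array d plays the same role as distances_df above, so it can be wrapped in pd.DataFrame(d, columns=player_cols, index=df.index) if the individual distances are needed as well.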