Pandas: Calculate Distance and Angle between X, Y grouped
I have the following columns:
Id PlayId X Y
0 0 2.3 3.4
1 0 5.4 3.2
2 1 3.2 5.1
3 1 4.2 1.7
If I have two rows grouped under one PlayId, I want to add Distance and Angle columns, one of each per row in the group:
Id PlayId X Y Distance_0 Distance_1 Angle_0 Angle_1
0 0 2.3 3.4 0.0 ? 0.0 ?
1 0 5.4 3.2 ? 0.0 ? 0.0
2 1 3.2 5.1
3 1 4.2 1.7
Each Distance column holds the Euclidean distance between the i-th and j-th elements of a group:
dist(x0, x1, y0, y1) = sqrt((x0 - x1) ** 2 + (y0 - y1) ** 2)
The angle between the i-th and j-th elements is calculated similarly.
So how can I do this efficiently, without processing the elements one by one?
You can compute the pairwise distances with the pdist function from SciPy:
import pandas as pd
from scipy.spatial.distance import pdist, squareform

df = pd.DataFrame({'X': [5, 6, 7], 'Y': [3, 4, 5]})
# df
#    X  Y
# 0  5  3
# 1  6  4
# 2  7  5

# One Distance_i column per row; squareform expands pdist's condensed output
cols = [f'Distance_{i}' for i in range(len(df))]
pd.DataFrame(squareform(pdist(df.values)), columns=cols)
which produces the following DataFrame:
   Distance_0  Distance_1  Distance_2
0    0.000000    1.414214    2.828427
1    1.414214    0.000000    1.414214
2    2.828427    1.414214    0.000000
This works because pdist takes an array of shape m × n, where m is the number of observations (rows) and n is the dimension of each observation (here two: X and Y).
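To make the shapes concrete, here is a minimal sketch using the same three points as above. The variable names `points`, `condensed`, and `square` are only illustrative. It shows that pdist returns a condensed vector with one entry per unordered pair, which squareform then expands into the symmetric m × m matrix.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# m = 3 observations, n = 2 dimensions (X and Y)
points = np.array([[5, 3], [6, 4], [7, 5]], dtype=float)

# Condensed form: m * (m - 1) / 2 entries, ordered (0,1), (0,2), (1,2)
condensed = pdist(points)

# Full symmetric m x m matrix with zeros on the diagonal
square = squareform(condensed)

print(condensed.shape, square.shape)
```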
If needed, you can then concatenate the original DataFrame with the newly created one using pd.concat (with axis=1 to join column-wise).
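The question asks for distances within each PlayId group, which the example above does not cover. One possible way to extend it, sketched here with the data from the question, is to run pdist once per group and then join the results back onto the original frame. The helper `pairwise_distances` is a hypothetical name introduced for illustration, and the approach assumes the original index is unique so the pieces align.

```python
import pandas as pd
from scipy.spatial.distance import pdist, squareform

df = pd.DataFrame({
    "Id": [0, 1, 2, 3],
    "PlayId": [0, 0, 1, 1],
    "X": [2.3, 5.4, 3.2, 4.2],
    "Y": [3.4, 3.2, 5.1, 1.7],
})

def pairwise_distances(group):
    # Hypothetical helper: distance matrix for the rows of one group,
    # keeping the group's original index so it can be joined back later.
    dist = squareform(pdist(group[["X", "Y"]].to_numpy()))
    cols = [f"Distance_{i}" for i in range(len(group))]
    return pd.DataFrame(dist, index=group.index, columns=cols)

# One distance block per PlayId, stacked row-wise
distances = pd.concat([pairwise_distances(g) for _, g in df.groupby("PlayId")])

# Join column-wise with the original data; rows align on the index
result = pd.concat([df, distances], axis=1)
print(result)
```

If groups have different sizes, the smaller groups get NaN in the Distance columns they do not use.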
For the angle, you could use pdist as well, with metric='cosine' to compute the cosine distance (one minus the cosine of the angle between the two vectors). See this post for more information.
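The question does not say which angle is meant, so the following is only a sketch under two assumed interpretations. The first converts the cosine distance back to the angle between the two position vectors as seen from the origin. The second, which does not use pdist, takes the direction of the segment from point i to point j via np.arctan2 and broadcasting. All variable names are illustrative.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

df = pd.DataFrame({"X": [5, 6, 7], "Y": [3, 4, 5]})
xy = df[["X", "Y"]].to_numpy(dtype=float)

# Assumption 1: angle at the origin between the position vectors.
# Cosine distance is 1 - cos(theta), so theta = arccos(1 - distance).
cos_dist = squareform(pdist(xy, metric="cosine"))
angle_at_origin = np.degrees(np.arccos(np.clip(1.0 - cos_dist, -1.0, 1.0)))

# Assumption 2: direction of the segment from point i to point j.
# diff[i, j] = xy[j] - xy[i], computed for all pairs at once.
diff = xy[None, :, :] - xy[:, None, :]
direction = np.degrees(np.arctan2(diff[..., 1], diff[..., 0]))

cols = [f"Angle_{i}" for i in range(len(df))]
print(pd.DataFrame(direction, columns=cols))
```

The direction matrix is antisymmetric up to a 180 degree shift, since the segment from j to i points the opposite way.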