Average closest coordinates in Python
This is a continuation of my previous question. I now have a sorted list of coordinates in Euclidean space. I want to average the closest coordinates so that the result behaves like clustering, i.e. each whole cluster is averaged and returns one single point in Euclidean space. So, for example, the list below
a = [[ 42, 206],[ 45, 40],[ 45, 205],[ 46, 41],[ 46, 205],[ 47, 40],[ 47, 202],[ 48, 40],[ 48, 202],[ 49, 38]]
will return
avg_coordinates = [[47.0, 39.8], [45.6, 204.0]]
This is done by averaging the first 5 closest points (one cluster) and then the other 5 closest points. Right now I am using a gradient approach: I loop through all coordinates, and wherever the gradient is higher than some set threshold, I treat that as the start of another cluster of points (because the list is already sorted). But a problem arises when the denominator is larger than the numerator in the gradient formula
gradient = (y2-y1)/(x2-x1)
which returns a value smaller than the threshold. So logically I am doing it wrong. Any good suggestions for doing this? Please note that I do not want to apply clustering.
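To make the problem described above concrete, here is a small sketch comparing the gradient with the Euclidean distance for two pairs of points taken from the sample list. The helper names `slope` and `distance` are hypothetical, and the threshold of 100 is only an assumed value for illustration.

```python
import math

def slope(p, q):
    """Gradient (y2 - y1) / (x2 - x1); infinite when the x values are equal."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    return math.inf if dx == 0 else dy / dx

def distance(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

# Two pairs that both jump between the y ~ 40 and y ~ 205 groups
pair_a = ([42, 206], [45, 40])   # x changes by 3
pair_b = ([45, 205], [46, 41])   # x changes by 1

threshold = 100  # assumed value, for illustration only

for p, q in (pair_a, pair_b):
    print(p, q, "slope:", slope(p, q), "distance:", distance(p, q))
```

Both pairs make a similar jump in y, but the magnitude of the gradient depends on how far apart the x values are, so a larger denominator can pull it under the threshold. The Euclidean distance does not have this dependence on the ratio.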
Here's an approach -
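import numpy as np

a = np.asarray(a)  # the steps below need a NumPy array, not a nested list
thresh = 100 # Threshold for splitting, heuristically chosen for the given sample
# Lex-sort of coordinates (np.lexsort treats the last key, here the second column, as primary)
b = a[np.lexsort(a.T)]
# Interval indices that partition the clusters: positions where the distance
# between consecutive sorted points exceeds the threshold
diff_idx = np.flatnonzero(np.linalg.norm(b[1:] - b[:-1],axis=1) > thresh)+1
idx = np.hstack((0, diff_idx, b.shape[0]))
# Per-cluster sums and point counts, then the averages
sums = np.add.reduceat(b, idx[:-1])
counts = idx[1:] - idx[:-1]
out = sums/counts.astype(float)[:,None]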
Sample input, output -
In [141]: a
Out[141]:
array([[ 42, 206],
[ 45, 40],
[ 45, 205],
[ 46, 41],
[ 46, 205],
[ 47, 40],
[ 47, 202],
[ 48, 40],
[ 48, 202],
[ 49, 38]])
In [142]: out
Out[142]:
array([[ 47. , 39.8],
[ 45.6, 204. ]])
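The same steps can be wrapped in a reusable function. This is only a sketch under some assumptions: the function name `average_clusters` and its default threshold are hypothetical, the input is assumed to be a non-empty list or array of 2-D points, and the clusters are assumed to be separated along the second coordinate so that the lex-sort keeps each cluster contiguous.

```python
import numpy as np

def average_clusters(points, thresh=100.0):
    """Average groups of points split where consecutive sorted points are far apart."""
    pts = np.asarray(points, dtype=float)
    # Sort by the second column first, then by the first column
    pts = pts[np.lexsort(pts.T)]
    # Euclidean distance between each pair of consecutive sorted points
    gaps = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    # Start index of every cluster: 0, plus each position right after a large gap
    starts = np.concatenate(([0], np.flatnonzero(gaps > thresh) + 1))
    # Sum each cluster and divide by its number of points
    sums = np.add.reduceat(pts, starts, axis=0)
    counts = np.diff(np.concatenate((starts, [len(pts)])))
    return sums / counts[:, None]

a = [[42, 206], [45, 40], [45, 205], [46, 41], [46, 205],
     [47, 40], [47, 202], [48, 40], [48, 202], [49, 38]]
avg_coordinates = average_clusters(a)
print(avg_coordinates)
```

For the sample list this gives the two averaged points shown above. The threshold still has to be chosen for the data at hand, and a single sort order will not separate clusters that overlap along the sorted coordinate.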