
Python: average distance between a bunch of points in the (x,y) plane

The formula for computing the distance between two points in the (x, y) plane is well known and straightforward.

However, what is the best way to approach a problem with n points, for which you want to compute the average distance over all pairs of points?

Example:

import matplotlib.pyplot as plt
x=[89.86, 23.0, 9.29, 55.47, 4.5, 59.0, 1.65, 56.2, 18.53, 40.0]
y=[78.65, 28.0, 63.43, 66.47, 68.0, 69.5, 86.26, 84.2, 88.0, 111.0]
plt.scatter(x, y,color='k')
plt.show()


The distance between two points (x1, y1) and (x2, y2) is simply computed as:

import math
dist=math.sqrt((x2-x1)**2+(y2-y1)**2)
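As a quick sanity check of the formula, here is a minimal sketch using two hypothetical points (not taken from the data above) that form a 3-4-5 right triangle. It also shows `math.hypot`, the standard-library shortcut for the same computation:

```python
import math

# Hypothetical points chosen for illustration (a 3-4-5 right triangle).
x1, y1 = 0.0, 0.0
x2, y2 = 3.0, 4.0

dist = math.sqrt((x2 - x1)**2 + (y2 - y1)**2)
same = math.hypot(x2 - x1, y2 - y1)  # equivalent standard-library call
print(dist, same)  # 5.0 5.0
```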

But this is a problem of combinations without repetition: each pair of points should be counted only once. How should I approach it?

itertools.combinations gives combinations without repeats:

>>> import itertools
>>> for combo in itertools.combinations([(1,1), (2,2), (3,3), (4,4)], 2):
...     print(combo)
...
((1, 1), (2, 2))
((1, 1), (3, 3))
((1, 1), (4, 4))
((2, 2), (3, 3))
((2, 2), (4, 4))
((3, 3), (4, 4))

Code for your problem:

import math
from itertools import combinations

def dist(p1, p2):
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x2 - x1)**2 + (y2 - y1)**2)

x = [89.86, 23.0, 9.29, 55.47, 4.5, 59.0, 1.65, 56.2, 18.53, 40.0]
y = [78.65, 28.0, 63.43, 66.47, 68.0, 69.5, 86.26, 84.2, 88.0, 111.0]

points = list(zip(x,y))
distances = [dist(p1, p2) for p1, p2 in combinations(points, 2)]
avg_distance = sum(distances) / len(distances)
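If you are on Python 3.8 or later, the same idea can probably be written more compactly with `math.dist` and `statistics.mean`. This is only a sketch of an alternative spelling, not a different algorithm; it still visits every pair once:

```python
import math
from itertools import combinations
from statistics import mean

x = [89.86, 23.0, 9.29, 55.47, 4.5, 59.0, 1.65, 56.2, 18.53, 40.0]
y = [78.65, 28.0, 63.43, 66.47, 68.0, 69.5, 86.26, 84.2, 88.0, 111.0]
points = list(zip(x, y))

# math.dist (Python 3.8+) returns the Euclidean distance between two points.
avg_distance = mean(math.dist(p, q) for p, q in combinations(points, 2))
print(avg_distance)
```

The result should agree with the `sum(distances) / len(distances)` version up to floating-point rounding.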

Alternatively, you can loop over the sequence of points yourself, visiting each pair exactly once:

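from math import sqrt

def avg_distance(x,y):
    # x and y are equal-length sequences of coordinates (at least two points)
    n = len(x)
    dist = 0
    for i in range(n):
        xi = x[i]
        yi = y[i]
        # start at i+1 so each pair is visited once and i == j is skipped
        for j in range(i+1,n):
            dx = x[j]-xi
            dy = y[j]-yi
            dist += sqrt(dx*dx+dy*dy)
    # divide the summed distance by the number of pairs, n*(n-1)/2
    return 2.0*dist/(n*(n-1))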

In the last step, we divide the total distance by n×(n-1)/2 which is the result of:

n-1
---
\       n (n-1)
/   i = -------
---        2
i=1

which is thus the total number of distances we have calculated.
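To convince yourself of that count, a small check along these lines compares the sum, the closed form, and the number of pairs `itertools.combinations` actually produces (`math.comb` needs Python 3.8+):

```python
from itertools import combinations
from math import comb

for n in range(2, 8):
    # Number of unordered pairs of n distinct items.
    pairs = len(list(combinations(range(n), 2)))
    # 1 + 2 + ... + (n-1), the closed form, and the binomial coefficient agree.
    assert pairs == sum(range(1, n)) == n * (n - 1) // 2 == comb(n, 2)

print(comb(10, 2))  # 45 pairs for the 10 points in the example
```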

Here we do not measure the distance between a point and itself (which is of course always 0). Note that this affects the average: including those zero distances in the count would lower the result.
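To illustrate the difference, here is a sketch on a small hypothetical point set (not the data from the question). Averaging over all n×n ordered pairs, self-pairs included, gives (n-1)/n times the average over distinct pairs:

```python
from itertools import combinations, product
from math import dist, isclose

# Hypothetical small point set for illustration.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
n = len(points)

# Average over the n*(n-1)/2 unordered pairs of distinct points.
pair_dists = [dist(p, q) for p, q in combinations(points, 2)]
avg_excluding_self = sum(pair_dists) / len(pair_dists)

# Average over all n*n ordered pairs, including each point paired with itself.
all_dists = [dist(p, q) for p, q in product(points, repeat=2)]
avg_including_self = sum(all_dists) / len(all_dists)

print(avg_excluding_self, avg_including_self)
assert isclose(avg_including_self, avg_excluding_self * (n - 1) / n)
```

Which definition you want depends on the application; the code in this answer uses the first one.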

Given there are n points, this algorithm runs in O(n²).
