
Memory Error when calculating Euclidean distance between points

I want to find a suitable eps for DBSCAN. I have a set of points, stored in an array of shape (2267436, 2), and need to calculate the distance from each point to every other point, then find the nearest points and the min points. Here is my data:

xy= [[  177963.16728699  2506663.75713195]
 [  176147.50406716  2502422.34894945]
 [  178480.33178874  2507299.83467826]
 ..., 
 [  231205.88139267  2684014.30324774]
 [  231207.81085397  2684014.52219471]
 [  231214.870296    2684054.8263628 ]]

I have tried the following methods:

dist = scipy.spatial.distance.cdist(xy, xy,'euclidean')

or

np.sqrt((np.square(npxy[:,np.newaxis]-npxy).sum(axis=2)))

or

dist=scipy.spatial.distance.pdist(npxy)
d_matrix = scipy.spatial.distance.squareform(dist)

I get a MemoryError with all of them. Is there any way to solve this?

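With some very easy math you can figure out that you cannot store all O(n²) distances in memory. For n = 2267436 points, the full matrix has about 5.1 × 10¹² entries, which is roughly 41 TB at 8 bytes per float64 value; even the condensed form returned by pdist needs about half of that.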
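
The arithmetic behind that estimate can be checked in a few lines. This is only a sketch: the point count comes from the question, and the 8 bytes per value assumes the distances are stored as float64, which is what cdist and pdist return by default.

```python
# Memory needed to hold all pairwise distances for the array in the question.
n = 2267436          # number of points, from the shape (2267436, 2)
bytes_per_value = 8  # assumption: float64 distances

# Full square matrix, as produced by cdist(xy, xy) or squareform(pdist(xy)).
full_matrix_bytes = n * n * bytes_per_value

# Condensed upper triangle, as produced by pdist(xy) alone.
condensed_bytes = n * (n - 1) // 2 * bytes_per_value

print(f"full matrix: {full_matrix_bytes / 1e12:.1f} TB")
print(f"condensed:   {condensed_bytes / 1e12:.1f} TB")
```

Either figure is far beyond the RAM of an ordinary machine, which is why all three attempts fail in the same way.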

If you compute only the distances from one point at a time, and discard them once you have extracted what you need, you will be fine.
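
A minimal sketch of that idea, assuming the quantity of interest is each point's distance to its k-th nearest neighbor (a common input for choosing eps). The function name kth_neighbor_distances and the parameters k and chunk_size are illustrative, not from the original answer. It handles a few rows at a time rather than strictly one, and never keeps more than one block of distances in memory.

```python
import numpy as np
from scipy.spatial.distance import cdist


def kth_neighbor_distances(points, k, chunk_size=16):
    """Distance from every point to its k-th nearest other point.

    Hypothetical helper: only a (chunk_size, n) block of distances
    exists at any time, so memory stays bounded. Requires k < n.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    result = np.empty(n)
    for start in range(0, n, chunk_size):
        stop = start + chunk_size
        # Distances from this block of points to all points.
        block = cdist(points[start:stop], points)
        # Position 0 in sorted order is the point itself (distance 0),
        # so position k is the k-th nearest other point.
        result[start:stop] = np.partition(block, k, axis=1)[:, k]
    return result
```

With 2267436 points, each row of a block holds about 18 MB of float64 values, so chunk_size has to stay small. The runtime is still O(n²); only the memory problem is solved.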

Also, try to use a spatial index to reduce the runtime from O(n²) to a manageable scale.
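
The answer does not say which index to use. As one possibility for 2-D points, the sketch below uses SciPy's cKDTree to get each point's k-th nearest neighbor distance without ever forming the distance matrix. The function name sorted_k_distances and the parameter k are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree


def sorted_k_distances(points, k):
    """Sorted distances from each point to its k-th nearest other point.

    Hypothetical helper built on a k-d tree; memory use is O(n * k)
    instead of O(n^2).
    """
    points = np.asarray(points, dtype=float)
    tree = cKDTree(points)
    # Ask for k + 1 neighbors because each point's nearest neighbor
    # in its own tree is itself, at distance 0.
    distances, _ = tree.query(points, k=k + 1)
    return np.sort(distances[:, k])
```

One common heuristic is to plot these sorted values and pick eps near the bend of the curve, with k tied to the intended min points setting. That is a rule of thumb rather than something the answer prescribes.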

Or use a more modern algorithm such as OPTICS, which does not require you to fix a single eps in advance.
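
As a rough illustration, scikit-learn ships an OPTICS implementation. The answer does not name a library, so this choice, the wrapper name run_optics, and the parameter values are assumptions. Setting max_eps to a finite value limits the neighborhood search and can reduce runtime on large data.

```python
import numpy as np
from sklearn.cluster import OPTICS


def run_optics(points, min_samples=5, max_eps=np.inf):
    """Fit OPTICS and return the fitted model (hypothetical wrapper)."""
    model = OPTICS(min_samples=min_samples, max_eps=max_eps)
    model.fit(np.asarray(points, dtype=float))
    return model


# Small synthetic example: two well-separated groups of 30 points each.
rng = np.random.default_rng(0)
demo_points = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(30, 2)),
    rng.normal(loc=100.0, scale=1.0, size=(30, 2)),
])
demo_model = run_optics(demo_points, min_samples=5)

# reachability_ and ordering_ together give the reachability plot;
# labels_ holds the extracted cluster assignment (-1 marks noise).
print(demo_model.labels_)
```

Whether this runs in acceptable time on more than two million points depends on the data and on max_eps, so it is worth trying on a sample first.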
