Smallest Bounding Sphere containing x% of points

Original post: 2016-09-26 14:24:57 | Tags: algorithm, 3d, language-agnostic, geometry, point-clouds

Given a 3D point cloud, how can I find the smallest bounding sphere that contains a given percentage of points?

I.e., if I have a point cloud with some noise and I want to ignore 5% of outliers, how can I get the smallest sphere that contains the remaining 95% of points, if I do not know which points are the outliers?

Example: I want to find the green sphere, not the red sphere.

I am looking for a reasonably fast and simple algorithm. It does not have to find the optimal solution; a reasonable approximation is fine as well.

I know how to calculate the approximate bounding sphere for 100% of points, e.g. with Ritter's algorithm.

How can I generalize this to an algorithm that finds the smallest sphere containing x% of points?

5 answers

Just an idea: binary search.

First, use one of the bounding sphere algorithms to find the 100% bounding sphere.

Fix the centerpoint of the 95% sphere to be the same as the centerpoint of the 100% sphere. (There is no guarantee that it is, but you say you're OK with an approximate answer.) Then binary-search on the radius of the sphere until 95% ± epsilon of the points are inside.

Assuming the points are sorted by their distance (or squared distance, to be slightly faster) from the centerpoint, for a fixed radius r it takes O(log n) operations to find the number of points inside the sphere of radius r, e.g. by using another binary search. The binary search for the right r itself requires a logarithmic number of such evaluations. Therefore, the whole search should take just O(log² n) steps after you have found the 100% sphere.
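A minimal Python sketch of the fixed-center idea, under a few assumptions: the names `ritter_sphere` and `shrink_to_fraction` are illustrative, points are tuples of floats, and Ritter's algorithm stands in for "one of the bounding sphere algorithms". Once the distances are sorted, the radius covering the requested fraction can be read off directly at index k - 1, so this variant needs no separate binary search over r.

```python
import math

def ritter_sphere(points):
    """Approximate bounding sphere (Ritter's algorithm) -> (center, radius)."""
    p0 = points[0]
    a = max(points, key=lambda p: math.dist(p, p0))
    b = max(points, key=lambda p: math.dist(p, a))
    center = tuple((u + v) / 2 for u, v in zip(a, b))
    radius = math.dist(a, b) / 2
    for p in points:
        d = math.dist(p, center)
        if d > radius:
            # Grow the sphere just enough to include p.
            new_radius = (radius + d) / 2
            shift = (d - new_radius) / d
            center = tuple(c + (q - c) * shift for c, q in zip(center, p))
            radius = new_radius
    return center, radius

def shrink_to_fraction(points, fraction=0.95):
    """Keep the 100% sphere's center; shrink the radius to cover `fraction` of the points."""
    center, _ = ritter_sphere(points)
    dists = sorted(math.dist(p, center) for p in points)
    # Small epsilon guards against float noise in fraction * n.
    k = max(1, math.ceil(fraction * len(points) - 1e-9))
    return center, dists[k - 1]
```

Because the center is inherited from the 100% sphere, a single far outlier can still pull it away from the dense part of the cloud, so the resulting radius may be far from the smallest possible.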

Edit: if you think the center of the reduced sphere could be too far from that of the full sphere, you can recalculate the bounding sphere, or just the center of mass of the point set, each time after throwing away some points. Each recalculation should take no more than O(n). After recalculation, re-sort the points by their distance from the new centerpoint. Since you expect them to be nearly sorted already, you can rely on bubble sort, which for nearly sorted data works in O(n + epsilon). Remember that only a logarithmic number of these tests will be needed, so you should be able to get away with close to O(n log² n) for the whole thing.

It depends on exactly what performance you're looking for and what you're willing to sacrifice for it. (I would be happy to learn that I'm wrong and there's a good exact algorithm for this.)

The distance from the average point location would probably give a reasonable indication of whether a point is an outlier or not.

The algorithm might look something like:

1) Find the bounding sphere of the points
2) Find the average point location
3) Choose the point on the bounding sphere that is farthest from the average location, and remove it as an outlier
4) Repeat steps 1-3 until you've removed 5% of the points
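The steps above can be sketched in Python roughly as follows. This is a simplified variant rather than a literal transcription: instead of computing a bounding sphere each round and picking among the points on its surface, it removes whichever point is farthest from the current mean. The names `centroid` and `peel_outliers` are made up for illustration.

```python
import math

def centroid(points):
    """Average point location."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def peel_outliers(points, fraction=0.95):
    """Drop the point farthest from the current mean until `fraction` of the points remain."""
    kept = list(points)
    target = max(1, math.ceil(fraction * len(points) - 1e-9))
    while len(kept) > target:
        c = centroid(kept)
        farthest = max(range(len(kept)), key=lambda i: math.dist(kept[i], c))
        kept.pop(farthest)
    # Report a sphere centered on the mean of the survivors; a proper
    # bounding-sphere routine (e.g. Ritter's) could be run on `kept` instead.
    c = centroid(kept)
    return c, max(math.dist(p, c) for p in kept), kept
```

Each round costs O(n), so removing 5% of the points this way is O(n²) overall, which may matter for large clouds.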

The algorithm of ryann is not that bad. I suggested robustifying it with a geometric median, then came to this sketch:

compute the NxN inter-distances in O(N^2)
sum each row of this matrix (= the distance of one point to the others) in O(N^2)
sort the obtained "crowd" distances in O(N*log N)
(the point with the smallest distance is an approximation of the geometric median)
remove the 5% largest in O(1)
here we just consider the largest crowd distances as outliers,
instead of taking the largest distance from the median.
compute the radius of the obtained sphere in O(N)

Of course, it is also sub-optimal, but it should perform a bit better in the case of far outliers. Overall cost is O(N^2).
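A short Python sketch of this "crowd distance" idea. The answer does not say where the final sphere is centered; the code below assumes it is the point with the smallest crowd distance (the approximate geometric median), and the function name `crowd_distance_sphere` is hypothetical.

```python
import math

def crowd_distance_sphere(points, fraction=0.95):
    n = len(points)
    # Row sums of the N x N distance matrix: O(N^2).
    crowd = [sum(math.dist(p, q) for q in points) for p in points]
    # Indices ordered by increasing crowd distance: O(N log N).
    order = sorted(range(n), key=crowd.__getitem__)
    keep = max(1, math.ceil(fraction * n - 1e-9))
    # Drop the points with the largest crowd distance.
    kept = [points[i] for i in order[:keep]]
    # Assumption: center the sphere on the approximate geometric median.
    center = points[order[0]]
    radius = max(math.dist(p, center) for p in kept)
    return center, radius, kept
```

The quadratic distance computation dominates, so this is best suited to moderately sized clouds or to a random subsample.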

I would iterate the following two steps until convergence:

1) Given a group of points, find the smallest sphere enclosing 100% of the points and work out its centre.

2) Given a centre, find the group of points, containing 95% of the original number, that is closest to the centre.

Each step reduces (or at least does not increase) the radius of the sphere involved, so you can declare convergence when the radius stops decreasing.

In fact, I would iterate from multiple random starts, each start produced by finding the smallest sphere that contains all of a small subset of points. I note that if you have 10 outliers and you divide your set of points into 11 parts, at least one of those parts will not contain any outliers.

(This is very loosely based on https://en.wikipedia.org/wiki/Random_sample_consensus )
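One possible Python sketch of this alternating scheme with random restarts. It makes several assumptions: Ritter's approximate sphere is used in place of an exact smallest enclosing sphere (so the "never increases" property is only approximate, and the loop simply stops when the radius fails to shrink), and the names `trimmed_sphere`, `starts`, and `seed_size` are illustrative.

```python
import math
import random

def ritter_sphere(points):
    """Approximate bounding sphere (Ritter's algorithm) -> (center, radius)."""
    p0 = points[0]
    a = max(points, key=lambda p: math.dist(p, p0))
    b = max(points, key=lambda p: math.dist(p, a))
    center = tuple((u + v) / 2 for u, v in zip(a, b))
    radius = math.dist(a, b) / 2
    for p in points:
        d = math.dist(p, center)
        if d > radius:
            new_radius = (radius + d) / 2
            shift = (d - new_radius) / d
            center = tuple(c + (q - c) * shift for c, q in zip(center, p))
            radius = new_radius
    return center, radius

def trimmed_sphere(points, fraction=0.95, starts=10, seed_size=4, max_iter=50, rng=None):
    rng = rng or random.Random(0)
    keep = max(1, math.ceil(fraction * len(points) - 1e-9))
    best = None
    for _ in range(starts):
        # Random start: sphere around a small random subset.
        subset = rng.sample(points, min(seed_size, len(points)))
        center, _ = ritter_sphere(subset)
        radius = math.inf
        for _ in range(max_iter):
            # Step 2: the `keep` points closest to the current centre.
            group = sorted(points, key=lambda p: math.dist(p, center))[:keep]
            # Step 1: bounding sphere of that group.
            new_center, new_radius = ritter_sphere(group)
            if new_radius >= radius:
                break  # radius stopped decreasing: converged
            center, radius = new_center, new_radius
        if best is None or radius < best[1]:
            best = (center, radius)
    return best
```

Calling `trimmed_sphere(points, 0.95)` returns the smallest `(center, radius)` found over all starts; more starts improve the odds that at least one seed subset is free of outliers.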

Find the Euclidean minimum spanning tree, and check the edges in descending order of length. For each edge, consider the sets of points in the two connected trees you get by deleting the edge.

If the smaller set of points is less than 5% of the total, and the bounding sphere around the larger set of points doesn't overlap it, then delete the smaller set of points. (This condition is necessary in case you have an 'oasis' of empty space in the center of your point cloud.)

Repeat this until you hit your threshold or the lengths are getting 'small enough' that you don't care to delete them.
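A rough Python sketch of the spanning-tree pruning, with assumptions stated up front: the tree is built with a simple O(n²) Prim's algorithm, the "bounding sphere" of the larger set is a crude centroid-plus-maximum-distance sphere, and the 5% limit is treated as a cumulative budget over all deletions. All function names are hypothetical.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph -> list of (length, i, j)."""
    n = len(points)
    in_tree, best, parent = [False] * n, [math.inf] * n, [-1] * n
    if n:
        best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def component(start, adjacency):
    """Set of nodes reachable from `start` through the remaining tree edges."""
    seen, stack = {start}, [start]
    while stack:
        for v in adjacency[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def crude_sphere(points):
    """Centroid plus max distance: a valid (not minimal) bounding sphere."""
    center = tuple(sum(c) / len(points) for c in zip(*points))
    return center, max(math.dist(p, center) for p in points)

def prune_by_mst(points, max_outlier_fraction=0.05):
    """Return sorted indices of the points kept after pruning small, detached subtrees."""
    n = len(points)
    budget = math.floor(max_outlier_fraction * n + 1e-9)
    edges = sorted(mst_edges(points), reverse=True)  # longest edge first
    adjacency = {i: set() for i in range(n)}
    for _, i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    alive = set(range(n))
    for _, i, j in edges:
        if budget == 0:
            break
        if i not in alive or j not in alive:
            continue
        # Tentatively delete the edge and look at the two resulting subtrees.
        adjacency[i].discard(j)
        adjacency[j].discard(i)
        small, large = sorted((component(i, adjacency), component(j, adjacency)), key=len)
        center, radius = crude_sphere([points[k] for k in large])
        separated = all(math.dist(points[k], center) > radius for k in small)
        if len(small) <= budget and separated:
            alive -= small
            budget -= len(small)
        else:
            # Not an outlier group: restore the edge.
            adjacency[i].add(j)
            adjacency[j].add(i)
    return sorted(alive)
```

The surviving indices can then be passed to any 100% bounding-sphere routine. A length cutoff for "small enough" edges is left out here and could be added as an extra stopping condition.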