Stochastically sample given points in a 3D space with minimum nearest-neighbor distance and maximum density

Question

I have n points in a 3D space. I want to stochastically sample a subset of points such that all nearest-neighbor distances are larger than r. The size of the subset, m, is unknown, but I want the sampled points to be as dense as possible, i.e. to maximize m.

There are similar questions, but they are all about generating points rather than sampling from given points:

Generate random points in 3D space with minimum nearest-neighbor distance

Generate 3-d random points with minimum distance between each of them?

Say I have 300 random 3D points:

import numpy as np
n = 300
points = np.random.uniform(0, 10, size=(n, 3))

What is the fastest way to get a subset of m points with minimum nearest-neighbor distance r = 1 while maximizing m?

Answer 1

There's probably an efficient bicriteria approximation scheme, but why bother when integer programming is so quick on average?

import numpy as np
from ortools.linear_solver import pywraplp

n = 300
points = np.random.uniform(0, 10, size=(n, 3))

solver = pywraplp.Solver.CreateSolver("SCIP")
# One 0/1 variable per point: 1 if the point is kept in the subset
variables = [solver.BoolVar("x[{}]".format(i)) for i in range(n)]
# Keep as many points as possible
solver.Maximize(sum(variables))
for j, q in enumerate(points):
    for i, p in enumerate(points[:j]):
        # Two points within distance 1 of each other cannot both be kept
        if np.linalg.norm(p - q) <= 1:
            solver.Add(variables[i] + variables[j] <= 1)
solver.EnableOutput()
solver.Solve()
# Number of points in the selected subset
print(len([i for (i, variable) in enumerate(variables) if variable.SolutionValue()]))
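
The model above is a maximum independent set problem: each point is a vertex, each pair of points at distance at most r is an edge, and the constraint x[i] + x[j] <= 1 forbids keeping both endpoints of an edge. As a minimal sketch of what the solver maximizes, the brute-force search below finds the optimum for a handful of points. The function name `largest_spaced_subset` and the sample coordinates are made up for illustration, and the exhaustive search is only feasible for very small inputs.

```python
from itertools import combinations
from math import dist


def largest_spaced_subset(points, r):
    """Return indices of a largest subset whose pairwise distances all exceed r.

    Hypothetical helper for illustration: exhaustive search, exponential time.
    """
    n = len(points)
    # Conflict edges: pairs of points that may not both be selected.
    conflicts = {(i, j) for i, j in combinations(range(n), 2)
                 if dist(points[i], points[j]) <= r}
    # Try subset sizes from largest to smallest; the first feasible one is optimal.
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if not any(pair in conflicts for pair in combinations(subset, 2)):
                return list(subset)
    return []


# Made-up sample: three points on a line 0.6 apart, plus one far away.
sample = [(0.0, 0.0, 0.0), (0.6, 0.0, 0.0), (1.2, 0.0, 0.0), (5.0, 5.0, 5.0)]
best = largest_spaced_subset(sample, r=1.0)
print(best)  # [0, 2, 3]
```

The middle point conflicts with both of its neighbours, so dropping it keeps three of the four points, which is what the integer program would also report for this input.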

Answer 2

This doesn't give the largest possible subset, but it should be close and doesn't take long. It uses KDTree to speed up the distance calculations:

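from scipy.spatial import KDTree
import numpy as np

def space_sample(n=300, low=0, high=10, dist=1):
    points = np.random.uniform(low, high, size=(n, 3))
    k = KDTree(points)
    # All index pairs (i, j) whose points are within `dist` of each other;
    # the reshape keeps the array 2-D even when no pair is found
    pairs = np.array(list(k.query_pairs(dist)), dtype=int).reshape(-1, 2)

    def reduce_pairs(pairs, remove=()):  # iteratively remove the most connected node
        # Keep only the pairs that involve no already-removed node
        p = pairs[~np.isin(pairs, remove).any(1)]
        if p.size > 0:
            count = np.bincount(p.flatten(), minlength=n)
            r = list(remove) + [count.argmax()]
            return reduce_pairs(p, r)
        else:
            return list(remove)

    # Compute the removal list once, rather than once per point
    removed = set(reduce_pairs(pairs))
    return np.array([p for i, p in enumerate(points) if i not in removed])

subset = space_sample()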

Iteratively removing the most connected node is not optimal (it keeps about 200 of the 300 points), but it is likely the fastest algorithm that's close to optimal (the only faster option being to remove nodes at random). You could possibly @njit reduce_pairs to make that part faster (I will try if I have time later).
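
A closely related random variant can be sketched as a greedy pass: visit the points in a shuffled order and keep each one that is farther than r from everything kept so far. This is a rough sketch rather than a tuned implementation. The function name `random_spaced_sample` and the `seed` parameter are assumptions for illustration, and it uses a brute-force distance check instead of a KDTree to stay short.

```python
import numpy as np


def random_spaced_sample(points, r, seed=None):
    """Randomly pick a maximal subset with all pairwise distances > r.

    Hypothetical helper: greedy over a random order, so the result is
    maximal (no further point can be added) but not necessarily the largest.
    """
    rng = np.random.default_rng(seed)
    kept = []  # indices of accepted points
    for i in rng.permutation(len(points)):
        # Accept point i only if it is farther than r from every kept point.
        if not kept or np.linalg.norm(points[kept] - points[i], axis=1).min() > r:
            kept.append(int(i))
    return np.array(kept, dtype=int)


points = np.random.default_rng(0).uniform(0, 10, size=(300, 3))
idx = random_spaced_sample(points, r=1.0, seed=1)
subset = points[idx]
print(len(subset))
```

Different seeds give different subsets, which matches the stochastic sampling asked for in the question, at the cost of a smaller m than the exact or most-connected-first approaches are likely to reach.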

Answer 3

Testing @David Eisenstat's answer with 30 given points:

from ortools.linear_solver import pywraplp
import numpy as np

def subset_David_Eisenstat(points, r):
    solver = pywraplp.Solver.CreateSolver("SCIP")
    variables = [solver.BoolVar("x[{}]".format(i)) for i in range(len(points))]
    solver.Maximize(sum(variables))
    for j, q in enumerate(points):
        for i, p in enumerate(points[:j]):
            if np.linalg.norm(p - q) <= r:
                solver.Add(variables[i] + variables[j] <= 1)
    solver.EnableOutput()
    solver.Solve()
    indices = [i for (i, variable) in enumerate(variables) if variable.SolutionValue()]
    return points[indices]

points = np.array(
[[ 7.32837882, 12.12765786, 15.01412241],
 [ 8.25164031, 11.14830379, 15.01412241],
 [ 8.21790113, 13.05647987, 13.05647987],
 [ 7.30031002, 13.08276009, 14.05452502],
 [ 9.18056467, 12.0800735 , 13.05183844],
 [ 9.17929647, 11.11270337, 14.03027534],
 [ 7.64737905, 11.48906945, 15.34274827],
 [ 7.01315886, 12.77870699, 14.70301668],
 [ 8.88132414, 10.81243313, 14.68685022],
 [ 7.60617372, 13.39792166, 13.39792166],
 [ 8.85967682, 12.72946394, 12.72946394],
 [ 9.50060705, 11.43361294, 13.37866092],
 [ 8.21790113, 12.08471494, 14.02824481],
 [ 7.32837882, 12.12765786, 16.98587759],
 [ 8.25164031, 11.14830379, 16.98587759],
 [ 7.30031002, 13.08276009, 17.94547498],
 [ 8.21790113, 13.05647987, 18.94352013],
 [ 9.17929647, 11.11270337, 17.96972466],
 [ 9.18056467, 12.0800735 , 18.94816156],
 [ 7.64737905, 11.48906945, 16.65725173],
 [ 7.01315886, 12.77870699, 17.29698332],
 [ 8.88132414, 10.81243313, 17.31314978],
 [ 7.60617372, 13.39792166, 18.60207834],
 [ 8.85967682, 12.72946394, 19.27053606],
 [ 9.50060705, 11.43361294, 18.62133908],
 [ 8.21790113, 12.08471494, 17.97175519],
 [ 7.32837882, 15.01412241, 12.12765786],
 [ 8.25164031, 15.01412241, 11.14830379],
 [ 7.30031002, 14.05452502, 13.08276009],
 [ 9.18056467, 13.05183844, 12.0800735 ],])

When the expected minimum distance is 1:

subset1 = subset_David_Eisenstat(points, r=1.)
print(len(subset1))
# Output: 18

Check the minimum distance:

from scipy.spatial.distance import cdist
dist = cdist(subset1, subset1, metric='euclidean')
# Delete diagonal
res = dist[~np.eye(dist.shape[0],dtype=bool)].reshape(dist.shape[0],-1)
print(np.min(res))
# Output: 1.3285513450926985

Change the expected minimum distance to 2:

subset2 = subset_David_Eisenstat(points, r=2.)
print(len(subset2))
# Output: 10

Check the minimum distance:

from scipy.spatial.distance import cdist
dist = cdist(subset2, subset2, metric='euclidean')
# Delete diagonal
res = dist[~np.eye(dist.shape[0],dtype=bool)].reshape(dist.shape[0],-1)
print(np.min(res))
# Output: 2.0612041004376223
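
Since the same check is run twice, it could be wrapped in a small helper. The sketch below is one possible version. The name `min_pairwise_distance` is made up for illustration, and it uses plain NumPy indexing over the upper triangle rather than `cdist`, so each pair is considered once and the diagonal never appears.

```python
import numpy as np


def min_pairwise_distance(points):
    """Smallest Euclidean distance between two distinct rows of `points`."""
    points = np.asarray(points, dtype=float)
    if len(points) < 2:
        raise ValueError("need at least two points")
    # Indices (i, j) with i < j: every unordered pair exactly once.
    i, j = np.triu_indices(len(points), k=1)
    return np.linalg.norm(points[i] - points[j], axis=1).min()


# Made-up sample: the closest pair is the first two points, 3 apart.
demo = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 2.0], [10.0, 0.0, 0.0]])
print(min_pairwise_distance(demo))  # 3.0
```

Calling it on `subset1` or `subset2` should reproduce the values printed above, assuming those arrays are available.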

Answer 4

This might not be too fast, but you can iterate the 3D distance formula over the points, append the results to a dictionary, sort it, and then get the ids.

The 3D distance formula for the points (x, y, z) and (x1, y1, z1) is:

m = ((x-x1)^2 + (y-y1)^2)^0.5
d = (m^2 + (z-z1)^2)^0.5

(d is the total distance and ^0.5 is sqrt.)
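
Here is a minimal sketch of this approach, under one reading of it: compute the distance for every pair of points with the two-step formula, store the results in a dictionary keyed by the pair of ids, and sort by distance. The names `distance_3d` and `sorted_pair_distances` are hypothetical. The answer does not say how the sorted list is then turned into a subset, so that step is left out. Note that Python writes powers as `**`, since `^` is bitwise XOR.

```python
from itertools import combinations


def distance_3d(p, q):
    """Two-step 3D distance: planar distance m first, then add the z term."""
    (x, y, z), (x1, y1, z1) = p, q
    m = ((x - x1) ** 2 + (y - y1) ** 2) ** 0.5
    return (m ** 2 + (z - z1) ** 2) ** 0.5


def sorted_pair_distances(points):
    """Return [((i, j), distance), ...] sorted from closest to farthest pair."""
    distances = {(i, j): distance_3d(points[i], points[j])
                 for i, j in combinations(range(len(points)), 2)}
    return sorted(distances.items(), key=lambda item: item[1])


# Made-up sample points.
sample = [(0, 0, 0), (3, 4, 0), (3, 4, 12), (1, 0, 0)]
closest_ids, closest_distance = sorted_pair_distances(sample)[0]
print(closest_ids, closest_distance)  # (0, 3) 1.0
```

This computes all n(n-1)/2 distances, which is consistent with the warning that the approach might not be fast.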

4 answers

Answer 1: score 2, accepted, 2021-01-14 01:00:31

Answer 2: score 1, 2021-01-18 09:16:03

Answer 3: score 0, 2021-01-19 00:07:18

Answer 4: score -1, 2021-01-10 21:36:00
