How to efficiently find the bounding box of a collection of points?
I have several points stored in an array. I need to find the bounds of those points, i.e. the rectangle that bounds all the points. I know how to solve this in plain Python.
I would like to know whether there is a better way to solve the problem than the naive max and min over the array, or whether there is a built-in method.
points = [[1, 3], [2, 4], [4, 1], [3, 3], [1, 6]]
b = bounds(points) # the function I am looking for
# now b = [[1, 1], [4, 6]]
My approach to getting performance is to push things down to C level whenever possible:
def bounding_box(points):
    x_coordinates, y_coordinates = zip(*points)
    return [(min(x_coordinates), min(y_coordinates)), (max(x_coordinates), max(y_coordinates))]
By my (crude) measure, this runs about 1.5 times faster than @ReblochonMasque's bounding_box_naive(). And it is clearly more elegant. ;-)
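One caveat of the zip(*points) version is that unpacking fails with a ValueError when the collection is empty. A minimal sketch of a guarded variant follows; the name bounding_box_safe and the choice to return None for empty input are assumptions made for illustration, not part of the answer above.

```python
def bounding_box_safe(points):
    """Hypothetical variant of bounding_box() that tolerates empty input."""
    points = list(points)  # accept any iterable, including generators
    if not points:
        return None  # assumption: no points means there is no box
    # zip(*points) transposes [(x0, y0), (x1, y1), ...]
    # into (x0, x1, ...) and (y0, y1, ...)
    x_coordinates, y_coordinates = zip(*points)
    return [(min(x_coordinates), min(y_coordinates)),
            (max(x_coordinates), max(y_coordinates))]


points = [[1, 3], [2, 4], [4, 1], [3, 3], [1, 6]]
print(bounding_box_safe(points))  # [(1, 1), (4, 6)]
print(bounding_box_safe([]))      # None
```

The min and max calls still run in C over plain tuples, so the guard should cost little beyond the extra list() copy.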
You cannot do better than O(n), because you must traverse all the points to determine the max and min for x and y.
But you can try to reduce the constant factor by traversing the list only once; however, it is unclear whether that would give a better execution time, and if it does, it would only be for large collections of points.
[EDIT]: in fact it does not; the "naive" approach is the most efficient.
def bounding_box_naive(points):
    """returns a list containing the bottom left and the top right
    points in the sequence
    Here, we use min and max four times over the collection of points
    """
    bot_left_x = min(point[0] for point in points)
    bot_left_y = min(point[1] for point in points)
    top_right_x = max(point[0] for point in points)
    top_right_y = max(point[1] for point in points)
    return [(bot_left_x, bot_left_y), (top_right_x, top_right_y)]
def bounding_box(points):
    """returns a list containing the bottom left and the top right
    points in the sequence
    Here, we traverse the collection of points only once,
    to find the min and max for x and y
    """
    bot_left_x, bot_left_y = float('inf'), float('inf')
    top_right_x, top_right_y = float('-inf'), float('-inf')
    for x, y in points:
        bot_left_x = min(bot_left_x, x)
        bot_left_y = min(bot_left_y, y)
        top_right_x = max(top_right_x, x)
        top_right_y = max(top_right_y, y)
    return [(bot_left_x, bot_left_y), (top_right_x, top_right_y)]
import random
points = [(random.randrange(-1000, 1000), random.randrange(-1000, 1000)) for _ in range(1000000)]
%timeit bounding_box_naive(points)
%timeit bounding_box(points)
1000 loops, best of 3: 573 µs per loop
1000 loops, best of 3: 1.46 ms per loop
100 loops, best of 3: 5.7 ms per loop
100 loops, best of 3: 14.7 ms per loop
10 loops, best of 3: 66.8 ms per loop
10 loops, best of 3: 141 ms per loop
1 loop, best of 3: 664 ms per loop
1 loop, best of 3: 1.47 s per loop
Clearly, the first approach, the "not so naive" bounding_box_naive, is faster, by a factor of roughly 2 to 2.6 in the timings above.
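The %timeit lines only work inside IPython. Outside it, a similar comparison can be sketched with the standard timeit module. In the sketch below, the helper compare, the name bounding_box_single_pass, and the input sizes are assumptions chosen for illustration; absolute timings will differ from machine to machine.

```python
import random
import timeit


def bounding_box_naive(points):
    """Four separate min/max passes over the points."""
    return [(min(p[0] for p in points), min(p[1] for p in points)),
            (max(p[0] for p in points), max(p[1] for p in points))]


def bounding_box_single_pass(points):
    """One explicit loop; renamed here to avoid clashing with bounding_box."""
    min_x = min_y = float('inf')
    max_x = max_y = float('-inf')
    for x, y in points:
        min_x = min(min_x, x)
        min_y = min(min_y, y)
        max_x = max(max_x, x)
        max_y = max(max_y, y)
    return [(min_x, min_y), (max_x, max_y)]


def compare(n, repeat=3, number=1):
    """Return the best time in seconds for each function on n random points."""
    points = [(random.randrange(-1000, 1000), random.randrange(-1000, 1000))
              for _ in range(n)]
    # Both implementations must agree before timing them.
    assert bounding_box_naive(points) == bounding_box_single_pass(points)
    results = {}
    for func in (bounding_box_naive, bounding_box_single_pass):
        timer = timeit.Timer(lambda: func(points))
        results[func.__name__] = min(timer.repeat(repeat=repeat, number=number))
    return results


for n in (1000, 10000):  # assumed sizes; increase them for a longer benchmark
    print(n, compare(n))
```

Taking the minimum over several repeats mirrors the "best of 3" reporting used by %timeit.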