Finding the index of an element in a sorted list efficiently
I have a sorted list l (of around 20,000 elements), and would like to find the first element in l that exceeds a given value t_min. Currently, my code is as follows.
def find_index(l, t_min):
    # Linear scan for the first element greater than t_min
    first = next((t for t in l if t > t_min), None)
    if first is None:
        return None
    else:
        return l.index(first)
To benchmark the code, I used cProfile to run a testing loop, and stripped out the time required to randomly generate lists by comparing the time to a control loop:
import numpy
import cProfile

def test_loop(n):
    for _ in range(n):
        test_l = sorted(numpy.random.random_sample(20000))
        find_index(test_l, 0.5)

def control_loop(n):
    for _ in range(n):
        test_l = sorted(numpy.random.random_sample(20000))

# cProfile.run('test_loop(1000)') takes 10.810 seconds
# cProfile.run('control_loop(1000)') takes 9.650 seconds
Each function call for find_index takes about 1.16 ms. Is there a way to improve the code to make it more efficient, given that we know the list is sorted?
The standard library bisect module is useful for this, and the docs contain an example of exactly this use case.
from bisect import bisect_right

def find_gt(a, x):
    'Find leftmost value greater than x'
    i = bisect_right(a, x)
    if i != len(a):
        return a[i]
    raise ValueError
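Since the question asks for the index rather than the value, the same idea can be adapted as in the minimal sketch below. The name find_index_bisect is hypothetical (it is not from the docs), and the list is assumed to be sorted in ascending order. It returns None when no element exceeds t_min, to match the behaviour of the original find_index.

```python
from bisect import bisect_right

def find_index_bisect(l, t_min):
    """Return the index of the first element of the sorted list l
    that is greater than t_min, or None if there is no such element."""
    # bisect_right returns the insertion point to the right of any
    # entries equal to t_min, i.e. the index of the first larger element.
    i = bisect_right(l, t_min)
    return i if i != len(l) else None
```

Binary search needs O(log n) comparisons, roughly 15 for a list of 20,000 elements, whereas the generator expression followed by l.index scans the list linearly.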