简体   繁体   中英

Finding the index of an element in a sorted list efficiently

I have a sorted list l (of around 20,000 elements), and would like to find the first element in l that exceeds a given value t_min. Currently, my code is as follows.

def find_index(l):   
    first=next((t for t in l if t>t_min), None) 
    if first==None:
        return None
    else:
        return l.index(first)

To benchmark the code, I used cProfile to run a testing loop, and stripped out the time required to randomly generate lists by comparing the time to a control loop:

import numpy
import cProfile

def test_loop(n):
    for _ in range(n):
        test_l=sorted(numpy.random.random_sample(20000))
        find_index(test_l, 0.5)

def control_loop(n):
    for _ in range(n):
        test_l=sorted(numpy.random.random_sample(20000))

# cProfile.run('test_loop(1000)') takes 10.810 seconds
# cProfile.run('control_loop(1000)') takes 9.650 seconds

Each function call for find_index takes about 1.16 ms. Is there a way to improve the code to make it more efficient, given that we know the list is sorted?

The standard library bisect module is useful for this, and the docs contain an example of exactly this use case.

def find_gt(a, x):
    'Find leftmost value greater than x'
    i = bisect_right(a, x)
    if i != len(a):
        return a[i]
    raise ValueError

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM