简体   繁体   中英

How efficient is list.index(value, start, end)?

Today I realized that python's list.index can also take an optional start (and even end ) parameter.

I was wondering whether or not this is efficiently implemented and which of these two is better:

pattern = "qwertyuytresdftyuioknn"
words_list = ['queen', 'quoin']
for word in words_list:
    i = 1
    for character in word:
        try:
            i += pattern[i:].index(character)
        except ValueError:
            break
        else:
            yield word

or

pattern = "qwertyuytresdftyuioknn"
words_list = ['queen', 'quoin']
for word in words_list:
    i = 1
    for character in word:
        try:
            i = pattern.index(character, i)
        except ValueError:
            break
        else:
            yield word

So basically i += pattern[i:].index(character) vs i = pattern.index(character, i) .

Searching for this on generic_search_machine returns nothing helpful, except a lot of beginner tutorials trying to teach me what a list is.

Background: This code tries to find all words from words_list which match pattern . pattern is a list of characters a user entered by swiping over the keyboard, like on most modern mobile device's keyboards.

In the actual implementation there is the additional requirement that the returned word should be longer than 5 characters and the first and last character have to exactly match. These lines are omitted here for brevity, since they are trivial to implement.

This calls a built-in function implemented in C:

i = pattern.index(character, i)

Even without looking at the source code , you can always assume that the underlying implementation is smart enough to implement that efficiently, ie that it does not look at the first i values in the list.

As a rule of thumb, using a built-in functionality is always faster than (or at least as fast as) the best thing you can implement yourself.

The attempt to make it better:

i += pattern[i:].index(character)

This is deffinitely worse. It makes a copy of pattern[i:] and then looks for character in it.

So, in the worst case, if you have a pattern of 1 GB and i=1 , this copies 1 GB of data in memory in attempt to skip the first element (which whould have been skipped anyway).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM