
Python list.index() versus dictionary

I have a list of about 50 strings. I will repeatedly (potentially tens of thousands of times) need to know the position of items in the list. Is it better to use list.index() each time, or create a dictionary mapping each item to its position? (My instinct says to create the dictionary, but I don't know what underlies the list indexing, and it may be superfluous.)

list.index() traverses the list until it finds the item it's looking for, which is a linear-time operation. Looking a string up in a dictionary, by contrast, is a constant-time operation on average, so the dictionary approach will likely perform better.
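
As a minimal sketch of the two approaches side by side (the `names` list and the `position` dictionary are made-up illustrations, not from the question), both give the same answer for an item that is present and differ only in how they report a missing one:

```python
# Hypothetical sample data; any list of unique strings behaves the same way.
names = ["alpha", "beta", "gamma", "delta"]

# Build the reverse mapping once: item -> position.
position = {name: i for i, name in enumerate(names)}

# Both approaches return the same position for an item that is present.
assert names.index("gamma") == position["gamma"] == 2

# A missing item raises ValueError from the list...
try:
    names.index("omega")
except ValueError:
    list_miss = "ValueError"

# ...and KeyError from the dictionary.
try:
    position["omega"]
except KeyError:
    dict_miss = "KeyError"

# dict.get avoids the exception and returns a default instead.
fallback = position.get("omega", -1)
```

If the list changes after the dictionary is built, the dictionary has to be rebuilt or updated, which is the main cost of this approach.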

Since your keys are strings and you have relatively few of them, another data structure you might want to explore is the trie.
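
For readers unfamiliar with tries, here is a rough sketch built from nested dictionaries, one level per character. The names `build_trie`, `trie_lookup` and `_END` are hypothetical, chosen only for this illustration. For 50 short strings in CPython, a plain dictionary is probably both simpler and faster, so treat this as an illustration of the idea rather than a recommendation.

```python
_END = object()  # sentinel key marking "a word ends here" (hypothetical name)

def build_trie(words):
    """Return a nested-dict trie mapping each word to its list position."""
    root = {}
    for pos, word in enumerate(words):
        node = root
        for ch in word:
            # Descend one level per character, creating nodes as needed.
            node = node.setdefault(ch, {})
        # Keep the first position if a word appears more than once.
        node.setdefault(_END, pos)
    return root

def trie_lookup(root, word):
    """Return the stored position of word, or None if it is absent."""
    node = root
    for ch in word:
        node = node.get(ch)
        if node is None:
            return None
    return node.get(_END)

words = ["car", "cart", "dog"]
trie = build_trie(words)
```

Here `trie_lookup(trie, "cart")` walks four levels and finds the position stored at the end, while a prefix such as `"ca"` reaches a node with no end marker and yields `None`.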

Use a dictionary mapping rather than searching the list. A dictionary lookup hashes the key and goes straight to the matching slot, so it takes constant time on average, whereas list.index() compares the target against the items one by one, which scales linearly with the length of the list.

You can profile your lookups like this:

import timeit
# timeit runs this setup string once before timing; it imports the two functions defined below
setup = 'from __main__ import foo_dict, foo_list'

To match the question, restrict the comparison to a list of only 50 items:

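# 50 strings, '0' through '49', and a dict mapping each string to its position
l = [str(i) for i in range(50)]
d = dict((str(i), i) for i in range(50))

def foo_dict(k):
    # one hash lookup
    return d[k]

def foo_list(k):
    # linear scan from the start of the list
    return l.index(k)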

timeit.repeat('[foo_dict(str(i)) for i in range(50)]', setup)

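returns for me (seconds per 1,000,000 executions of the statement, which is timeit's default count; each execution does 50 lookups):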

[20.89474606513977, 23.206938982009888, 22.23725199699402]

and

timeit.repeat('[foo_list(str(i)) for i in range(50)]', setup)

returns:

[47.33547496795654, 47.995683908462524, 46.79590392112732]

The dict lookup is faster (roughly twice as fast in this run) because it uses a hash table, whereas list.index() has to compare the target against each string in turn until it finds a match. Both timings also include the cost of str(i) and a function call per lookup, which narrows the measured gap.
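
A variant of the same measurement, as a sketch assuming Python 3.5 or later (where timeit accepts a `globals` argument). The names `items`, `lookup`, `keys` and `ns` are illustrative, and the keys are precomputed so that string creation is not part of what gets timed. No results are shown because they depend on the machine and interpreter.

```python
import timeit

# 50 strings and the matching item -> position dictionary.
items = [str(i) for i in range(50)]
lookup = {s: i for i, s in enumerate(items)}
keys = list(items)  # precomputed so key creation is not timed

# Namespace handed to timeit so the timed statements can see the names above.
ns = {"items": items, "lookup": lookup, "keys": keys}

# Each statement performs 50 lookups; it is executed 2000 times per repeat.
dict_times = timeit.repeat("[lookup[k] for k in keys]",
                           globals=ns, repeat=3, number=2000)
list_times = timeit.repeat("[items.index(k) for k in keys]",
                           globals=ns, repeat=3, number=2000)

# The minimum is usually the least noisy figure to compare.
print("dict:", min(dict_times))
print("list:", min(list_times))
```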

The dictionary will be much faster, and it's very fast to create, too:

indexer = dict((v, i) for i, v in enumerate(thelist))

enumerate yields (i, thelist[i]) for each i in range(len(thelist)), so the generator expression swaps each tuple to (v, i), because you need to map content to index, not vice versa.

Note that this will work only if every list item is hashable, but since you say the items are strings, you should be fine.

dict, among other things, rapidly turns an iterable of (key, value) tuples into the corresponding dictionary.
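
One caveat worth checking before swapping list.index() for a dictionary: if the list contains duplicates, a dictionary built from (value, index) pairs keeps the last index for each value, while list.index() returns the first. The sketch below uses made-up data with a repeated item, and `last_wins` and `first_wins` are illustrative names.

```python
thelist = ["a", "b", "a", "c"]  # hypothetical data with a repeated item

# Later pairs overwrite earlier ones, so "a" maps to its last position.
last_wins = dict((v, i) for i, v in enumerate(thelist))

# setdefault only stores a value the first time a key is seen,
# which matches what list.index() would return.
first_wins = {}
for i, v in enumerate(thelist):
    first_wins.setdefault(v, i)
```

With unique items, as in a list of 50 distinct strings, the two dictionaries are identical and the distinction does not matter.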


 