
Python get current index of list comprehension without using enumerate

I have the following data:

names = ['foo','bar','baz', 'spam', 'ham', 'jam']

indices = [0, 2, 3, 4]

size = 3

and want to create a list of the names whose index is in indices. The list must have the length given by the variable size.

I could not achieve it by doing this (wrong length):

selected_names = []
selected_names = [names[i] for i in indices if len(selected_names) <= size]
# Out[5]: ['foo', 'baz', 'spam', 'ham']

and I don't like this solution because declaring the empty list at the beginning is not elegant.

I can do this:

selected_names = [names[i] for x, i in enumerate(indices) if x <= size]

but that's a bit unreadable and the list length is still wrong.

Is there a correct and more elegant way to create that list? Maybe something like this?

#pseudo code
selected_names = [names[i] for i in indices if list_current_index < size]

enumerate wouldn't even solve this, since it would cause you to stop when you'd pulled size elements, not when you'd kept size elements. The only reason it seems to work is that you test for <= size (which actually keeps size + 1 elements), and your indices happens to be one element longer than size. If indices were longer, or size smaller, your test wouldn't work as intended.
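To make that concrete, here is a small sketch showing the `x <= size` test keeping `size + 1` names once `indices` is longer than in the question. The longer `indices` list is made up for illustration and is not from the original post.

```python
names = ['foo', 'bar', 'baz', 'spam', 'ham', 'jam']
size = 3

# Hypothetical longer index list (an assumption for this demo).
indices = [0, 2, 3, 4, 5, 1]

# x takes the values 0, 1, 2, 3, ... so `x <= size` lets four positions through.
too_many = [names[i] for x, i in enumerate(indices) if x <= size]

print(too_many)       # ['foo', 'baz', 'spam', 'ham']
print(len(too_many))  # 4, i.e. size + 1
```

Note that the comprehension also keeps iterating over the rest of `indices` after the test starts failing; the `if` clause only filters, it never stops the loop.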

If the goal is to keep size elements, without processing more elements than needed, then the simplest approach (assuming you don't mind slicing to create a small intermediate list, which is usually okay) is just:

selected_names = [names[i] for i in indices[:size]]
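As a quick check, this is the slicing approach run against the data from the question. The second call, with an oversized limit, is an extra illustration and not part of the original post.

```python
names = ['foo', 'bar', 'baz', 'spam', 'ham', 'jam']
indices = [0, 2, 3, 4]
size = 3

# Only the first `size` indices are ever looked up.
selected_names = [names[i] for i in indices[:size]]
print(selected_names)  # ['foo', 'baz', 'spam']

# Slicing past the end is safe: a short `indices` simply yields fewer names.
all_available = [names[i] for i in indices[:10]]
print(all_available)   # ['foo', 'baz', 'spam', 'ham']
```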

If indices and size are huge, you can use itertools.islice inside the list comprehension to avoid the intermediate slice, using less memory but somewhat more CPU:

import itertools

selected_names = [names[i] for i in itertools.islice(indices, size)]
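One further situation where `islice` helps is when the indices come from an iterator that cannot be sliced at all. The sketch below assumes a hypothetical generator, `index_stream`, standing in for such a source; it is not from the original post.

```python
import itertools

names = ['foo', 'bar', 'baz', 'spam', 'ham', 'jam']
size = 3

def index_stream():
    """Hypothetical lazy source of indices (illustration only)."""
    for i in (0, 2, 3, 4):
        yield i

# index_stream()[:size] would raise TypeError, because generators are not
# subscriptable; islice works on any iterable and stops after `size` items.
selected_names = [names[i] for i in itertools.islice(index_stream(), size)]
print(selected_names)  # ['foo', 'baz', 'spam']
```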

The fastest option I can find, avoiding any explicit looping at all, uses the operator module, though it involves temporaries for argument passing, which is probably a bad idea if size is ever going to be huge (tens of thousands and up):

import operator

selected_names = operator.itemgetter(*indices[:size])(names)

This creates an itemgetter callable that looks up the first size elements of indices, then immediately calls it on names, returning a tuple of the values (wrap the itemgetter call in list if you need a mutable list result instead of a tuple). It also avoids all Python-level loops in CPython; a loop still runs at the C layer, but a loop at the C layer is a lot faster than any loop at the Python layer.

In simple ipython %timeit tests, the operator.itemgetter approach won, taking ~24% less time than slice + list comprehension (which in turn was about 9% faster than islice + list comprehension). For larger inputs (I just multiplied indices and size by 100), operator.itemgetter wins by a factor of 3x (slice still beats islice, but by a meaningless margin; the overhead in islice is mostly in setup, and doesn't increase meaningfully as the number of items sliced goes up).
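The itemgetter approach has two edge cases the paragraph above does not mention. With exactly one index it returns the bare item, not a 1-tuple, and with zero indices it raises TypeError. A minimal sketch using the question's data:

```python
import operator

names = ['foo', 'bar', 'baz', 'spam', 'ham', 'jam']
indices = [0, 2, 3, 4]
size = 3

getter = operator.itemgetter(*indices[:size])
as_tuple = getter(names)       # ('foo', 'baz', 'spam')
as_list = list(getter(names))  # ['foo', 'baz', 'spam']

# Edge case 1: a single index gives back the item itself, not a 1-tuple.
single = operator.itemgetter(*indices[:1])(names)  # 'foo'

# Edge case 2: itemgetter needs at least one argument, so size == 0 fails.
try:
    operator.itemgetter(*indices[:0])
    zero_indices_ok = True
except TypeError:
    zero_indices_ok = False

print(as_tuple, as_list, single, zero_indices_ok)
```

If size can be 0 or 1, the list-comprehension versions are the safer choice, since they return a list of the right length in every case.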

All of these produce the same elements (the itemgetter version as a tuple rather than a list) as:

selected_names = [names[i] for i in indices][:size]

except that they don't build the complete list first and then cut it down to size; they fetch just enough entries and stop immediately.
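The equivalence can be checked directly on the question's data. In this sketch the itemgetter result is wrapped in `list` so the comparison is list against list; that wrapping is only valid here because size is at least 2.

```python
import itertools
import operator

names = ['foo', 'bar', 'baz', 'spam', 'ham', 'jam']
indices = [0, 2, 3, 4]
size = 3

# Reference: build everything, then trim.
reference = [names[i] for i in indices][:size]

via_slice = [names[i] for i in indices[:size]]
via_islice = [names[i] for i in itertools.islice(indices, size)]
# list() is fine here because size >= 2, so itemgetter returns a tuple.
via_itemgetter = list(operator.itemgetter(*indices[:size])(names))

print(reference)  # ['foo', 'baz', 'spam']
print(via_slice == via_islice == via_itemgetter == reference)  # True
```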


 