
How does Python perform during a list comprehension?

def thing(mode, data):
    return [
        item for item in data
        if {
            'large': lambda item: item > 100,
            'small': lambda item: item < 100,
        }[mode](item)
    ]

This list comprehension generates a dictionary of lambdas, retrieves one lambda via the mode argument and applies it to the list item currently being processed. My question is this: What are the performance characteristics of this?

Is the entire dictionary created from scratch during each iteration of the listcomp? Or is it created once and used on each item?

Is the entire dictionary created from scratch during each iteration of the listcomp?

Yes.

Or is it created once and used on each item?

No.


Fortunately, in this case (and all others like it that I can think of), it's easy to just build the dict ahead of time:

def thing(mode, data):
    # Build the dictionary once, before the comprehension runs.
    d = {
        'large': lambda item: item > 100,
        'small': lambda item: item < 100,
    }
    return [item for item in data if d[mode](item)]

Or even,

def thing(mode, data):
    # Look up the predicate once; the comprehension only calls it.
    func = {
        'large': lambda item: item > 100,
        'small': lambda item: item < 100,
    }[mode]
    return [item for item in data if func(item)]

I'm pretty sure that this will cause the entire dictionary to be created from scratch for each element of the list. The basic grammar for Python list comprehensions is as follows:

[ E1 for ID in E2 if E3 ]

where E1, E2, and E3 are expressions. E2 is evaluated once, when the interpreter starts to evaluate the list comprehension. E3 is evaluated for each member of the collection that E2 evaluates to, and E1 is evaluated for each member that passes E3. So, yes: in your question the dictionary literal is part of E3, so it is constructed from scratch for every item. You can easily fix that by building the dictionary before the list comprehension.
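One way to check this evaluation order yourself is to count calls. The following is a minimal sketch; the helper names e1, e2, e3 and the calls counter are made up for illustration and simply stand in for the three expressions of the grammar above.

```python
# Hypothetical helpers standing in for E1, E2 and E3; each one records
# how many times it has been evaluated.
calls = {"E1": 0, "E2": 0, "E3": 0}

def e1(x):
    calls["E1"] += 1
    return x

def e2():
    calls["E2"] += 1
    return [50, 150, 250]

def e3(x):
    calls["E3"] += 1
    return x > 100

result = [e1(x) for x in e2() if e3(x)]

print(result)  # [150, 250]
print(calls)   # {'E1': 2, 'E2': 1, 'E3': 3}
```

The iterable expression runs once, the condition runs once per item, and the output expression runs only for the items that pass the condition. A dictionary literal written inside the condition is therefore rebuilt as often as e3 is called here.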

Your dictionary is created once per iteration, which makes your list comprehension roughly twice as slow as a version that uses a cached dictionary:

>>> %timeit thing1('small', [1, 2, 3, 4, 5, 6])
100000 loops, best of 3: 2.4 us per loop
>>> %timeit thing2('small', [1, 2, 3, 4, 5, 6])
1000000 loops, best of 3: 1.06 us per loop

thing1 was your original function. thing2 is:

d = {
    'large': lambda item: item > 100,
    'small': lambda item: item < 100,
}

def thing2(mode, data):
    return list(filter(d[mode], data))
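The timings above come from IPython's %timeit. Outside IPython, a comparable measurement can be sketched with the standard timeit module. The iteration count below is an arbitrary choice, and the absolute numbers will differ by machine and Python version, so no expected timings are shown.

```python
import timeit

def thing1(mode, data):
    # Original version: the dict literal is part of the condition,
    # so it is rebuilt for every item.
    return [
        item for item in data
        if {
            'large': lambda item: item > 100,
            'small': lambda item: item < 100,
        }[mode](item)
    ]

# Cached version: the dict is built once at import time.
d = {
    'large': lambda item: item > 100,
    'small': lambda item: item < 100,
}

def thing2(mode, data):
    return list(filter(d[mode], data))

data = [1, 2, 3, 4, 5, 6]
number = 20_000  # arbitrary; raise it for steadier results

for fn in (thing1, thing2):
    # Take the best of three runs, as %timeit does.
    best = min(timeit.repeat(lambda: fn('small', data), number=number, repeat=3))
    print(f"{fn.__name__}: {best / number * 1e6:.2f} us per call")
```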

filter(f, data) is equivalent to the generator expression (item for item in data if f(item)). In Python 3 it returns a lazy iterator that tests items only as you iterate over it, which is why thing2 wraps it in list().
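The laziness of filter in Python 3 can be observed with a predicate that records what it has been asked about. This is a small sketch; is_small and seen are illustrative names, not part of the code above.

```python
seen = []  # every item the predicate has been called with, in order

def is_small(item):
    seen.append(item)
    return item < 100

data = [5, 500, 50]

lazy = filter(is_small, data)
seen_before = list(seen)        # []  : creating the filter tests nothing

first = next(lazy)              # 5
seen_after_first = list(seen)   # [5] : only one item has been tested

rest = list(lazy)               # [50] : 500 is tested and rejected

# The comprehension form yields the same items, but eagerly.
equivalent = [item for item in data if is_small(item)]  # [5, 50]
```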
