I have two functions returning generators:

def f1():
    return (i for i in range(1000))

def f2():
    return ((yield i) for i in range(1000))
Apparently, the generator returned from f2() is about twice as slow as the one from f1():
Python 3.6.5 (default, Apr 1 2018, 05:46:30)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit, dis
>>> timeit.timeit("list(f1())", globals=globals(), number=1000)
0.057948426001530606
>>> timeit.timeit("list(f2())", globals=globals(), number=1000)
0.09769760200288147
I tried using dis to see what's going on, but to no avail:
>>> dis.dis(f1)
2 0 LOAD_CONST 1 (<code object <genexpr> at 0x7ffff7ec6d20, file "<stdin>", line 2>)
2 LOAD_CONST 2 ('f1.<locals>.<genexpr>')
4 MAKE_FUNCTION 0
6 LOAD_GLOBAL 0 (range)
8 LOAD_CONST 3 (1000)
10 CALL_FUNCTION 1
12 GET_ITER
14 CALL_FUNCTION 1
16 RETURN_VALUE
>>> dis.dis(f2)
2 0 LOAD_CONST 1 (<code object <genexpr> at 0x7ffff67a25d0, file "<stdin>", line 2>)
2 LOAD_CONST 2 ('f2.<locals>.<genexpr>')
4 MAKE_FUNCTION 0
6 LOAD_GLOBAL 0 (range)
8 LOAD_CONST 3 (1000)
10 CALL_FUNCTION 1
12 GET_ITER
14 CALL_FUNCTION 1
16 RETURN_VALUE
Apparently, the dis results are identical. So why is the generator returned from f1() faster than the one from f2()? And what is the proper way to debug this? Apparently, dis fails in this case.
EDIT 1:
Using next() instead of list() in timeit reverses the results (or they are the same in some cases):
>>> timeit.timeit("next(f1())", globals=globals(), number=10**6)
1.0030477920008707
>>> timeit.timeit("next(f2())", globals=globals(), number=10**6)
0.9416838550023385
EDIT 2:
Apparently it's a bug in Python, fixed in 3.8. See "yield in list comprehensions and generator expressions".
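For reference, on Python 3.8 and later the construct is rejected outright at compile time; a minimal check, assuming a 3.8+ interpreter:

```python
# On Python 3.8+, "yield" inside a generator expression is a SyntaxError
# rather than silently producing extra values.
src = "g = ((yield i) for i in range(10))"

def genexpr_yield_rejected():
    try:
        compile(src, "<test>", "exec")
    except SyntaxError:
        return True
    return False

print(genexpr_yield_rejected())  # True on Python 3.8 and later
```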
A generator expression with yield inside actually yields two values per iteration.
Yield in generator expressions is actually a bug, as discussed in this related question.
If you want to see what's actually going on with dis, you need to introspect the code object stored in the outer function's co_consts[1], so:
>>> dis.dis(f1.__code__.co_consts[1])
2 0 LOAD_FAST 0 (.0)
>> 3 FOR_ITER 11 (to 17)
6 STORE_FAST 1 (i)
9 LOAD_FAST 1 (i)
12 YIELD_VALUE
13 POP_TOP
14 JUMP_ABSOLUTE 3
>> 17 LOAD_CONST 0 (None)
20 RETURN_VALUE
>>> dis.dis(f2.__code__.co_consts[1])
2 0 LOAD_FAST 0 (.0)
>> 3 FOR_ITER 12 (to 18)
6 STORE_FAST 1 (i)
9 LOAD_FAST 1 (i)
12 YIELD_VALUE
13 YIELD_VALUE
14 POP_TOP
15 JUMP_ABSOLUTE 3
>> 18 LOAD_CONST 0 (None)
21 RETURN_VALUE
So, it yields twice.
Maybe it's because the generator returned by f2 yields twice the number of elements. Just look at what happens:
>>> def f2():
...     return ((yield i) for i in range(10))
...
>>> g = f2()
>>> print([i for i in g])
[0, None, 1, None, 2, None, 3, None, 4, None, 5, None, 6, None, 7, None, 8, None, 9, None]
Using yield inside a generator expression makes it yield an extra None (the value of the yield expression, since nothing is sent in) after each actual item.
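To see where the extra None comes from without relying on the genexpr form (which is a SyntaxError on Python 3.8+), the expression can be written out as an explicit, equivalent generator function; a sketch, where f2_equiv is a hypothetical name:

```python
def f2_equiv(n=3):
    """Explicit equivalent of ((yield i) for i in range(n))."""
    for i in range(n):
        sent = yield i   # the inner "yield i"; evaluates to whatever is sent in
        yield sent       # the genexpr then yields the value of that expression

# Plain iteration sends nothing, so every second item is None:
print(list(f2_equiv(3)))  # [0, None, 1, None, 2, None]

# With send(), those None slots become the sent values:
g = f2_equiv(3)
print(next(g))      # 0
print(g.send("x"))  # x
```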
Your formulation of f2() is the first thing that struck me as unusual. Writing f2() the way most people would yields (pun intended) very different results:
import timeit

def f1():
    return (i for i in range(1000))

def f2():
    for i in range(1000):
        yield i
res1 = timeit.timeit("list(f1())", globals=globals(), number=1000)
print(res1) # 0.05318646361085916
res2 = timeit.timeit("list(f2())", globals=globals(), number=1000)
print(res2) # 0.05284952304875785
So the two seem to be equally fast.
As others say in their answers, this is probably because your f2() yields twice as many elements.
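A quick way to confirm the element-count explanation, using an explicit rewrite of the original f2 (the yield-in-genexpr form no longer compiles on Python 3.8+, and f2_original is a hypothetical name):

```python
def f1():
    return (i for i in range(1000))

def f2_original(n=1000):
    # Explicit equivalent of the original ((yield i) for i in range(n)):
    # each loop iteration yields i and then the (None) value of that yield.
    for i in range(n):
        yield (yield i)

print(len(list(f1())))           # 1000
print(len(list(f2_original())))  # 2000
```

So list() has twice as much work to do in the original f2, which accounts for most of the timing gap.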