Return or yield from a function that calls a generator?

I have a generator function, generator, and a convenience method that wraps it, generate_all.

def generator(some_list):
  for i in some_list:
    yield do_something(i)

def generate_all():
  some_list = get_the_list()
  return generator(some_list) # <-- Is this supposed to be return or yield?

Should generate_all return or yield? I want users to be able to use both functions the same way, i.e.

for x in generate_all():
  ...

should be equivalent to

some_list = get_the_list()
for x in generator(some_list):
  ...

You're probably looking for generator delegation (PEP 380).

For simple iterators, yield from iterable is essentially just a shortened form of for item in iterable: yield item

def generator(iterable):
  for i in iterable:
    yield do_something(i)

def generate_all():
  yield from generator(get_the_list())

It's pretty concise and also has a number of other advantages, such as being able to chain arbitrary/different iterables!
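As a rough sketch of the chaining mentioned above, a single generator function can delegate to several unrelated iterables in turn. The function name chained and its inputs are made up for illustration and are not part of the question's code.

```python
def chained():
    # Delegate to three different kinds of iterable, one after another.
    yield from [1, 2]                      # a list
    yield from "ab"                        # a string (yields its characters)
    yield from (n * n for n in range(3))   # a generator expression

result = list(chained())
print(result)  # [1, 2, 'a', 'b', 0, 1, 4]
```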

return generator(list) does what you want. But note that

yield from generator(list)

would be equivalent, but with the opportunity to yield more values after generator is exhausted. For example:

def generator_all_and_then_some():
    some_list = get_the_list()  # avoid shadowing the built-in list
    yield from generator(some_list)
    yield "one last thing"

Generators are evaluated lazily, so return and yield from behave differently when you are debugging your code or when an exception is thrown.

With return, the traceback of an exception raised in generator will not mention generate_all, because by the time generator actually runs you have already left the generate_all function. With yield from, generate_all appears in the traceback.

def generator(some_list):
    for i in some_list:
        raise Exception('exception happened :-)')
        yield i

def generate_all():
    some_list = [1,2,3]
    return generator(some_list)

for item in generate_all():
    ...

Exception                                 Traceback (most recent call last)
<ipython-input-3-b19085eab3e1> in <module>
      8     return generator(some_list)
      9 
---> 10 for item in generate_all():
     11     ...

<ipython-input-3-b19085eab3e1> in generator(some_list)
      1 def generator(some_list):
      2     for i in some_list:
----> 3         raise Exception('exception happened :-)')
      4         yield i
      5 

Exception: exception happened :-)

And with yield from:

def generate_all():
    some_list = [1,2,3]
    yield from generator(some_list)

for item in generate_all():
    ...

Exception                                 Traceback (most recent call last)
<ipython-input-4-be322887df35> in <module>
      8     yield from generator(some_list)
      9 
---> 10 for item in generate_all():
     11     ...

<ipython-input-4-be322887df35> in generate_all()
      6 def generate_all():
      7     some_list = [1,2,3]
----> 8     yield from generator(some_list)
      9 
     10 for item in generate_all():

<ipython-input-4-be322887df35> in generator(some_list)
      1 def generator(some_list):
      2     for i in some_list:
----> 3         raise Exception('exception happened :-)')
      4         yield i
      5 

Exception: exception happened :-)

However, this comes at a performance cost. The additional generator layer adds some overhead, so return will generally be a bit faster than yield from ... (or for item in ...: yield item). In most cases this won't matter much, because the work done inside the generator typically dominates the run time and the extra layer won't be noticeable.
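If you want to check the overhead on your own machine, a micro-benchmark along these lines is one way to do it. This is only a sketch: the names via_return and via_yield_from, the input size, and the repeat count are arbitrary choices of mine, and the absolute timings will vary between machines and Python versions.

```python
import timeit

def generator(some_list):
    for i in some_list:
        yield i

def via_return(some_list):
    # Hands back the inner generator directly.
    return generator(some_list)

def via_yield_from(some_list):
    # Wraps the inner generator in a second generator layer.
    yield from generator(some_list)

data = list(range(1000))

# Both wrappers produce the same items.
same = list(via_return(data)) == list(via_yield_from(data))

# Time full consumption of each wrapper.
t_return = timeit.timeit(lambda: list(via_return(data)), number=200)
t_yield_from = timeit.timeit(lambda: list(via_yield_from(data)), number=200)

print(f"return:     {t_return:.4f} s")
print(f"yield from: {t_yield_from:.4f} s")
```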

However, yield has some additional advantages: you aren't restricted to a single iterable, and you can easily yield additional items:

def generator(some_list):
    for i in some_list:
        yield i

def generate_all():
    some_list = [1,2,3]
    yield 'start'
    yield from generator(some_list)
    yield 'end'

for item in generate_all():
    print(item)

start
1
2
3
end

In your case the operations are quite simple, and I'm not sure it's even necessary to create multiple functions for this. One could just use the built-in map or a generator expression instead:

map(do_something, get_the_list())          # map
(do_something(i) for i in get_the_list())  # generator expression

Both can be used in the same way (apart from some differences when exceptions happen). And if they need a more descriptive name, you could still wrap them in a function.
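One difference you can observe when an exception happens is the extra <genexpr> frame in the traceback of the generator expression. The sketch below assumes a made-up do_something that fails on one input, and a helper frames_on_failure that I wrote only to collect the function names from the traceback.

```python
import traceback

def do_something(i):
    # Hypothetical stand-in that fails on one input.
    if i == 2:
        raise ValueError("bad item")
    return i * 10

def get_the_list():
    return [1, 2, 3]

def frames_on_failure(iterable):
    """Consume iterable and return the function names in the traceback."""
    try:
        list(iterable)
    except ValueError as exc:
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []

map_frames = frames_on_failure(map(do_something, get_the_list()))
genexpr_frames = frames_on_failure(do_something(i) for i in get_the_list())

print(map_frames)      # ['frames_on_failure', 'do_something']
print(genexpr_frames)  # ['frames_on_failure', '<genexpr>', 'do_something']
```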

Several built-in helpers wrap very common operations on iterables, and more can be found in the standard-library itertools module. In such simple cases I would simply use these, and write my own generators only for non-trivial cases.

I assume your real code is more complicated, so this may not apply, but the answer wouldn't be complete without mentioning the alternatives.
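For instance, the 'start' / 'end' example from earlier could probably be written with itertools.chain instead of a hand-written wrapper. This is a sketch using the same toy generator, not a claim about what fits your real code.

```python
import itertools

def generator(some_list):
    for i in some_list:
        yield i

# itertools.chain lazily concatenates its arguments,
# so no wrapper generator function is needed here.
items = itertools.chain(["start"], generator([1, 2, 3]), ["end"])

result = list(items)
print(result)  # ['start', 1, 2, 3, 'end']
```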

The following two statements will appear to be functionally equivalent in this particular case:

return generator(list)

and

yield from generator(list)

The latter is approximately the same as

for i in generator(list):
    yield i

The return statement returns the generator you are looking for. A yield from or yield statement turns your whole function into something that returns a generator, which passes through the one you are looking for.
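One way to see this distinction is to check which function is itself a generator function, and when its body actually runs. In the sketch below, get_the_list is a hypothetical stand-in that records each call, and the two generate_all_* names are mine.

```python
import inspect

calls = []

def get_the_list():
    # Hypothetical stand-in that records when it is called.
    calls.append("get_the_list")
    return [1, 2, 3]

def generator(some_list):
    for i in some_list:
        yield i * 10  # stand-in for do_something(i)

def generate_all_return():
    return generator(get_the_list())

def generate_all_yield_from():
    yield from generator(get_the_list())

# Only the yield-from version is itself a generator function.
print(inspect.isgeneratorfunction(generate_all_return))      # False
print(inspect.isgeneratorfunction(generate_all_yield_from))  # True

a = generate_all_return()            # body runs now: get_the_list() is called
calls_after_return = len(calls)      # 1
b = generate_all_yield_from()        # body has not started yet
calls_after_yield_from = len(calls)  # still 1

print(list(a) == list(b))  # True: the items are the same either way
```

So the return version calls get_the_list() as soon as generate_all_return() is called, while the yield from version defers it until the first item is requested.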

For a caller who simply iterates over the result, there is no difference. Internally, however, return is arguably more efficient, since it does not wrap generator(list) in a superfluous pass-through generator. If you plan to do any processing on the elements of the wrapped generator, use some form of yield, of course.

You would return it.

yielding* would cause generate_all() to evaluate to a generator itself, and calling next on that outer generator would return the inner generator returned by the first function, which isn't what you want.

* Not including yield from
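A small sketch of what goes wrong with a plain yield, using a toy generator and a function name (generate_all_plain_yield) that I made up for the demonstration:

```python
import types

def generator(some_list):
    for i in some_list:
        yield i

def generate_all_plain_yield():
    # Plain yield hands out the inner generator object itself as a single item.
    yield generator([1, 2, 3])

outer = generate_all_plain_yield()
first = next(outer)

first_is_generator = isinstance(first, types.GeneratorType)
inner_items = list(first)
rest = list(outer)

print(first_is_generator)  # True
print(inner_items)         # [1, 2, 3]
print(rest)                # []: the outer generator yields nothing else
```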
