Python generator conflicting with list comprehension

I've been messing around in Python with generator functions. I want to write a function that takes a generator whose values are tuples and returns a list of generators, where each generator's values correspond to one index in the original tuples.

Currently, I have a function that accomplishes this for a hardcoded number of elements in the tuple. Here is my code:

import itertools

def tee_pieces(generator):
    copies = itertools.tee(generator)
    dropped_copies = [(x[0] for x in copies[0]), (x[1] for x in copies[1])]
    # dropped_copies = [(x[i] for x in copies[i]) for i in range(2)]
    return dropped_copies

def gen_words():
    for i in "Hello, my name is Fred!".split():
        yield i

def split_words(words):
    for word in words:
        yield (word[:len(word)//2], word[len(word)//2:])

def print_words(words):
    for word in words:
        print(word)

init_words = gen_words()
right_left_words = split_words(init_words)
left_words, right_words = tee_pieces(right_left_words)
print("Left halves:")
print_words(left_words)
print("Right halves:")
print_words(right_words)

This correctly splits the generator: left_words yields the left halves and right_words yields the right halves.

The problem comes when I try to parameterize the number of generators to be created, using the commented-out line above. As far as I know it should be equivalent, but when I use that line instead, both left_words and right_words end up containing the right half of each word, giving output like this:

Left halves:
lo,
y
me
s
ed!
Right halves:
lo,
y
me
s
ed!

Why is this happening? How can I accomplish the desired result, namely parameterizing the number of pieces to split the generator into?

This has to do with Python's lexical scoping rules: a name inside a lambda or generator expression is looked up when the code runs, not when it is defined. The classic "surprising" example that demonstrates it:

funcs = [lambda: i for i in range(3)]
print(funcs[0]())  # prints 2, not 0
print(funcs[1]())  # prints 2, not 1
print(funcs[2]())  # prints 2

Your example is another result of the same rules.

To fix it, you can "break" the scoping with an additional function:

def make_gen(i):
    return (x[i] for x in copies[i])
dropped_copies = [make_gen(i) for i in range(2)]

This binds i to the specific value passed to a specific call to make_gen, which achieves the desired behavior. Without it, i refers to "the current value of the variable named i", which ends up being the same value for all the generators you create (as there is only one variable named i).
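As a minimal sketch, the helper-function fix could be folded into a parameterized tee_pieces as shown here. The parameter n (the tuple length) and the choice to nest make_gen inside tee_pieces are assumptions made for illustration; the answer above only shows the two-element case.

```python
import itertools

def tee_pieces(generator, n=2):
    # n is an assumed parameter: the number of elements in each tuple.
    copies = itertools.tee(generator, n)

    def make_gen(i):
        # i is a parameter of make_gen, so each call has its own binding.
        return (x[i] for x in copies[i])

    return [make_gen(i) for i in range(n)]

words = "Hello, my name is Fred!".split()
pairs = ((w[:len(w) // 2], w[len(w) // 2:]) for w in words)
left_words, right_words = tee_pieces(pairs)
left = list(left_words)
right = list(right_words)
print(left)
print(right)
```

With this version the two generators should yield different halves, instead of both yielding the right half.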

To add to shx2's answer, you could also replace the additional function with a lambda:

dropped_copies = [(lambda j: (x[j] for x in copies[j]))(i) for i in range(2)]

This too creates a new scope when the lambda gets called, as the different variable name makes clear. It would, however, also work with the same name, since inside the generator expression the lambda's parameter shadows the list comprehension's loop variable:

dropped_copies = [(lambda i: (x[i] for x in copies[i]))(i) for i in range(2)]
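To compare the two behaviors side by side, here is a small sketch that wraps the immediately-called lambda and the original comprehension in two functions. The function names, the parameter n, and the sample data are hypothetical and chosen only for this illustration.

```python
import itertools

def tee_pieces_lambda(generator, n=2):
    copies = itertools.tee(generator, n)
    # The lambda is called immediately, so its parameter i is bound to the
    # current loop value before the generator expression is created.
    return [(lambda i: (x[i] for x in copies[i]))(i) for i in range(n)]

def tee_pieces_broken(generator, n=2):
    copies = itertools.tee(generator, n)
    # No extra scope: every generator reads the same i when it is consumed.
    return [(x[i] for x in copies[i]) for i in range(n)]

data = [("a", 1), ("b", 2), ("c", 3)]
fixed = [list(g) for g in tee_pieces_lambda(iter(data))]
broken = [list(g) for g in tee_pieces_broken(iter(data))]
print(fixed)
print(broken)
```

The first result separates the two columns, while the second repeats the last column for every generator.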

This sort of scoping seems very confusing but becomes more intuitive if you rewrite the list comprehension as a for loop:

dropped_copies = []
for i in range(2):
    dropped_copies.append((x[i] for x in copies[i]))

Note that this is broken in the same way as the original list comprehension version.

This is because dropped_copies is a pair of lazy iterators, and by the time the iterators are evaluated, i has already been incremented to 1.
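One detail worth spelling out: in a generator expression, the iterable in the first for clause (copies[i] here) is evaluated as soon as the generator is created, while the element expression (x[i]) is evaluated only on iteration. That is why each generator still reads from its own copy but picks the wrong index. A small sketch, using a hypothetical build_with_loop function and plain lists standing in for the tee copies:

```python
def build_with_loop(copies):
    gens = []
    for i in range(len(copies)):
        # copies[i] is evaluated now; x[i] is evaluated later, on iteration.
        gens.append((x[i] for x in copies[i]))
    return gens  # by this point i == len(copies) - 1

copies = [[("a", "b")], [("c", "d")]]
results = [list(g) for g in build_with_loop(copies)]
print(results)
```

Each generator draws from a different source list, yet both take index 1 of every tuple.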

Try using a list comprehension for the inner expression and you can see the difference:

dropped_copies = [[x[i] for x in copies[i]] for i in range(2)]
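The trade-off of the eager version is that it returns lists rather than generators, so the whole source is consumed up front. The sketch below makes that visible; tee_lists, pairs, and the log argument are hypothetical names introduced only for this illustration.

```python
import itertools

def tee_lists(generator, n=2):
    copies = itertools.tee(generator, n)
    # Each inner list is built completely while i still holds the value
    # for this iteration, so every column comes out correctly.
    return [[x[i] for x in copies[i]] for i in range(n)]

def pairs(log):
    for k in range(3):
        log.append(k)  # record when the source is actually advanced
        yield (k, -k)

consumed = []
columns = tee_lists(pairs(consumed))
print(columns)
print(consumed)
```

The columns are correct, but consumed is already full when tee_lists returns, which matters if the source is large or unbounded.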
