What is the most pythonic way to have a generator expression executed?

More and more features of Python are becoming lazily evaluated, such as generator expressions and other kinds of iterators. Sometimes, however, I find myself wanting to write a one-liner "for" loop just to perform some action.

What would be the most pythonic way to get the loop actually executed?

For example:

a = open("numbers.txt", "w")
(a.write ("%d " % i) for i in xrange(100))
a.close()

Not actual code, but you see what I mean. If I use a list comprehension instead, I have the side effect of creating an N-length list filled with "None"s.
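To make the problem concrete, here is a minimal sketch of both behaviours. It uses Python 3's range rather than xrange, and a hypothetical list named log stands in for the file so the effect is easy to inspect:

```python
log = []

# A bare generator expression only builds a generator object;
# nothing inside it runs until something iterates over it.
gen = (log.append(i) for i in range(3))
after_creation = list(log)  # snapshot: still empty at this point

# A list comprehension does run the loop, but it also builds a
# throwaway list holding every return value (None for list.append).
leftovers = [log.append(i) for i in range(3)]
```

After this runs, log holds the three values added by the comprehension, while gen has still not executed at all.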

Currently what I do is use the expression as the argument in a call to "any" or to "all". But I would like to find a way that does not depend on the result of the expression performed in the loop - both "any" and "all" can stop early depending on the value of the expression evaluated.
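The early-stopping concern can be illustrated with a small sketch. It assumes Python 3, where write() returns the number of characters written (a truthy value), and uses io.StringIO as a stand-in for the file:

```python
import io

buf = io.StringIO()
# any() stops at the first truthy result, so only one write happens.
any(buf.write("%d " % i) for i in range(5))
written_by_any = buf.getvalue()

buf = io.StringIO()
# all() keeps going while results are truthy, so here every write happens.
all(buf.write("%d " % i) for i in range(5))
written_by_all = buf.getvalue()
```

Which of the two runs to completion depends entirely on what the expression returns, which is exactly the dependency described above.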

To be clear, these are the ways to do it that I already know about, and each one has its drawbacks:

[a.write("%d " % i) for i in xrange(100)]

any((a.write ("%d " % i) for i in xrange(100)))

for item in (a.write ("%d " % i) for i in xrange(100)): pass

There is one obvious way to do it, and that is the way you should do it. There is no excuse for doing it a clever way.

a = open("numbers.txt", "w")
for i in xrange(100):
    a.write("%d " % i)
a.close()

Lazy execution gives you a serious benefit: it allows you to pass a sequence to another piece of code without having to hold the entire thing in memory. It exists for creating efficient sequences as data types.
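As a rough illustration of that benefit, the sketch below passes a generator to sum(), which pulls one value at a time. The function name squares is made up for this example:

```python
def squares(n):
    """Yield i*i for i in range(n), one value at a time."""
    for i in range(n):
        yield i * i

# sum() consumes the values as they are produced; no list of
# squares is ever held in memory.
total = sum(squares(10))
```

Here the laziness is the point: the consumer wants the values, not a side effect.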

In this case, you do not want lazy execution. You want execution. You can just ... execute. With a for loop.

If I wanted to do this specific example, I'd write

for i in xrange(100): a.write('%d ' % i)

If I often needed to consume an iterator for its effect, I'd define

def for_effect(iterable):
    for _ in iterable:
        pass

There are many accumulators which have the effect of consuming the whole iterable they're given, such as min or max -- but even they don't entirely ignore the results yielded in the process (min and max, for example, will raise an exception if some of the results are complex numbers). I don't think there's a built-in accumulator that does exactly what you want -- you'll have to write (and add to your personal stash of tiny utility functions) a tiny utility function such as

def consume(iterable):
    for item in iterable: pass

The main reason, I guess, is that Python has a for statement and you're supposed to use it when it fits like a glove (i.e., for the cases you'd want consume for ;-).

BTW, a.write returns None, which is falsy, so any will actually consume it (and a.writelines will do even better!). But I realize you were just giving that as an example ;-).
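For the file-writing example specifically, the writelines route might look like the sketch below. It uses io.StringIO in place of a real file and Python 3's range; both are substitutions made for illustration:

```python
import io

out = io.StringIO()
# writelines() accepts any iterable of strings, so the generator
# expression is consumed by the call itself, with no explicit loop.
out.writelines("%d " % i for i in range(5))
result = out.getvalue()
```

Note that writelines does not add separators or newlines; the strings are written exactly as produced.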

It is 2019 - and this is a question from 2010 that keeps showing up. A recent thread on one of Python's mailing lists ran to over 70 e-mails on this subject, and they again refused to add a consume call to the language.

On that thread, the most efficient way to do it actually showed up, and it is far from obvious, so I am posting it as the answer here:

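from collections import deque

# A zero-length deque discards every item it is fed,
# so its extend method simply drains the iterable.
consume = deque(maxlen=0).extend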

And then use the consume callable to process generator expressions.

It turns out the deque native code in CPython is actually optimized for the maxlen=0 case, and will just consume the iterable.
The any and all calls I mentioned in the question should be just as efficient, but one has to worry about the truthiness of the expression in order for the iterable to be fully consumed.
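A short usage sketch, with a hypothetical list named seen standing in for whatever side effect is wanted. The yielded values mix falsy and truthy results to show that neither stops the consumption:

```python
from collections import deque

sink = deque(maxlen=0)
consume = sink.extend

seen = []
# seen.append(i) returns None (falsy); "or i" turns most results truthy.
# Unlike any() or all(), consume ignores the values and runs to the end.
returned = consume(seen.append(i) or i for i in range(5))
```

The deque itself stays empty, and the call returns None, so nothing is accumulated along the way.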


I see this may still be controversial; after all, an explicit two-line for loop can handle this. I remembered this question because I just made a commit where I create some threads, start them, and join them back - without a consume callable, that is 4 lines of mostly boilerplate, and without the benefit of cycling through the iterable in native code: https://github.com/jsbueno/extracontext/blob/a5d24be882f9aa18eb19effe3c2cf20c42135ed8/tests/test_thread.py#L27
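The thread case could be sketched roughly as follows. This is not the code from the linked commit; the worker target and the results list are invented for illustration:

```python
import threading
from collections import deque

consume = deque(maxlen=0).extend

results = []
# Each thread appends its own index to the shared list.
threads = [
    threading.Thread(target=results.append, args=(i,))
    for i in range(4)
]

# Start every thread, then wait for every thread to finish.
consume(t.start() for t in threads)
consume(t.join() for t in threads)
```

The equivalent with plain for loops is four lines instead of two, which is the trade-off being weighed here.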
