
List comprehension without [ ] in Python

Joining a list:

>>> ''.join([ str(_) for _ in xrange(10) ])
'0123456789'

join must take an iterable.

Apparently, join's argument is [ str(_) for _ in xrange(10) ], and it's a list comprehension.

Look at this:

>>> ''.join( str(_) for _ in xrange(10) )
'0123456789'

Now, join's argument is just str(_) for _ in xrange(10), with no [], but the result is the same.

Why? Does str(_) for _ in xrange(10) also produce a list or an iterable?

The other respondents were correct in answering that you had discovered a generator expression (which has a notation similar to list comprehensions but without the surrounding square brackets).
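
To make the question's "list or iterable?" concrete, here is a small sketch. It assumes Python 3, where range() takes the place of xrange(); the variable names are only for illustration.

```python
from collections.abc import Iterable, Iterator

# Python 3 spelling: range() plays the role of Python 2's xrange().
genexp = (str(n) for n in range(10))
listcomp = [str(n) for n in range(10)]

genexp_is_iterable = isinstance(genexp, Iterable)  # generators are iterable
genexp_is_iterator = isinstance(genexp, Iterator)  # and they are their own iterator
genexp_is_list = isinstance(genexp, list)          # but they are not lists

joined_from_genexp = ''.join(genexp)
joined_from_list = ''.join(listcomp)

print(type(genexp).__name__, genexp_is_iterable, genexp_is_list)
print(joined_from_genexp == joined_from_list)
```

So the bracket-free form produces a generator object, which is an iterable but not a list, and join accepts either.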

In general, genexps (as they are affectionately known) are more memory efficient and faster than list comprehensions.

HOWEVER, in the case of ''.join(), a list comprehension is both faster and more memory efficient. The reason is that join needs to make two passes over the data, so it actually needs a real list. If you give it one, it can start its work immediately. If you give it a genexp instead, it cannot start work until it builds up a new list in memory by running the genexp to exhaustion:

~ $ python -m timeit '"".join(str(n) for n in xrange(1000))'
1000 loops, best of 3: 335 usec per loop
~ $ python -m timeit '"".join([str(n) for n in xrange(1000)])'
1000 loops, best of 3: 288 usec per loop
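
If you want to repeat this comparison from inside a script, something like the following should work. It is only a sketch: it assumes Python 3 (range() instead of xrange()), the helper name best_time is made up for this example, and the numbers will differ by machine and interpreter version.

```python
import timeit

# Python 3 spelling of the two statements timed above.
GENEXP_STMT = '"".join(str(n) for n in range(1000))'
LISTCOMP_STMT = '"".join([str(n) for n in range(1000)])'


def best_time(stmt, repeat=3, number=200):
    """Best per-loop time in seconds, in the spirit of `python -m timeit`.

    Hypothetical helper for this example, not part of the timeit module.
    """
    return min(timeit.repeat(stmt, repeat=repeat, number=number)) / number


genexp_time = best_time(GENEXP_STMT)
listcomp_time = best_time(LISTCOMP_STMT)

print("genexp:   %.1f usec per loop" % (genexp_time * 1e6))
print("listcomp: %.1f usec per loop" % (listcomp_time * 1e6))
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes; it does not guarantee which form wins on your setup.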

The same result holds when comparing itertools.imap versus map:

~ $ python -m timeit -s'from itertools import imap' '"".join(imap(str, xrange(1000)))'
1000 loops, best of 3: 220 usec per loop
~ $ python -m timeit '"".join(map(str, xrange(1000)))'
1000 loops, best of 3: 212 usec per loop

>>> ''.join( str(_) for _ in xrange(10) )

This is called a generator expression, and is explained in PEP 289.

The main difference between generator expressions and list comprehensions is that the former don't create the list in memory.
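
A rough way to see that difference is to compare the size of the two objects. This sketch assumes Python 3 (range() instead of xrange()); note that sys.getsizeof reports only the container itself, not the strings inside it.

```python
import sys

n = 100000
as_list = [str(i) for i in range(n)]    # every item is built up front
as_genexp = (str(i) for i in range(n))  # nothing is built yet

list_size = sys.getsizeof(as_list)      # grows with n
genexp_size = sys.getsizeof(as_genexp)  # small and independent of n

print(list_size, genexp_size)
```

The exact byte counts depend on the interpreter, but the generator object stays tiny no matter how large n is.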

Note that there's a third way to write the expression:

''.join(map(str, xrange(10)))

Your second example uses a generator expression rather than a list comprehension. The difference is that with the list comprehension, a list is completely built and passed to .join(). With the generator expression, items are generated one by one and consumed by .join(). The latter uses less memory and is generally faster.

As it happens, the list constructor will happily consume any iterable, including a generator expression. So:

[str(n) for n in xrange(10)]

is just "syntactic sugar" for:

list(str(n) for n in xrange(10))

In other words, a list comprehension is just a generator expression that is turned into a list.
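
A quick check that the two spellings give the same list, assuming Python 3 (range() instead of xrange()):

```python
from_comprehension = [str(n) for n in range(10)]
from_constructor = list(str(n) for n in range(10))

print(from_comprehension == from_constructor)
```

This shows equivalence of the result only. CPython compiles a list comprehension to its own bytecode rather than literally calling list() on a generator, so "syntactic sugar" is best read as a description of what you get back, not of how it is executed.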

As mentioned, it's a generator expression.

From the documentation:

The parentheses can be omitted on calls with only one argument. See section Calls for the detail.
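
Here is a small sketch of that rule using sum(), assuming Python 3 (range() instead of xrange()). The compile() call is only there so the invalid form can be shown without breaking the script.

```python
# Sole argument: the call's own parentheses are enough.
total_single = sum(n for n in range(5))

# With a second argument, the generator expression needs its own parentheses.
total_with_start = sum((n for n in range(5)), 10)

# Leaving them out when there is another argument is a syntax error.
try:
    compile("sum(n for n in range(5), 10)", "<example>", "eval")
    unparenthesized_ok = True
except SyntaxError:
    unparenthesized_ok = False

print(total_single, total_with_start, unparenthesized_ok)
```

That is why ''.join( str(_) for _ in xrange(10) ) works as written: the generator expression is the only argument.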

If it's in parens, but not brackets, it's technically a generator expression. Generator expressions were first introduced in Python 2.4.

http://wiki.python.org/moin/Generators

The part after the join, ( str(_) for _ in xrange(10) ), is, by itself, a generator expression. You could do something like:

mylist = (str(_) for _ in xrange(10))
''.join(mylist)

and it means exactly the same thing that you wrote in the second case above.

Generators have some very interesting properties, not the least of which is that they don't end up allocating an entire list when you don't need one. Instead, a function like join "pumps" the items out of the generator expression one at a time, doing its work on the tiny intermediate parts.
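
One consequence of items being pulled out one at a time is that a generator can only be consumed once. A minimal sketch, assuming Python 3 (range() instead of xrange()); the variable names are illustrative:

```python
pieces = (str(n) for n in range(10))

first = next(pieces)           # pulls exactly one item
rest = ''.join(pieces)         # join consumes whatever is left
second_pass = ''.join(pieces)  # the generator is now exhausted

print(first, rest, repr(second_pass))
```

So a name like mylist above is a little misleading: unlike a real list, the object cannot be joined a second time.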

In your particular examples, list and generator probably don't perform terribly differently, but in general, I prefer using generator expressions (and even generator functions) whenever I can, mostly because it's extremely rare for a generator to be slower than a full list materialization.

That's a generator, rather than a list comprehension. Generators are also iterables, but rather than creating the entire list first and then passing it to join, it passes each value in the xrange one by one, which can be much more efficient.

The argument to your second join call is a generator expression. It does produce an iterable.
