
Generator expressions in Python

I have a list of dictionaries like the following:

lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]

I wrote a generator expression like:

next((itm for itm in lst if itm['a']==5))

Now the strange part is that although this works for the key-value pair of 'a', it throws an error for every other expression I try next. Expression:

next((itm for itm in lst if itm['b']==6))

Error:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <genexpr>
KeyError: 'b'

That's not weird. For every itm in lst, it will first evaluate the filter clause. If the filter clause is itm['b'] == 6, it will try to fetch the 'b' key from that dictionary. But since the first dictionary has no such key, it will raise an error.

For the first filter example, that is not a problem, since the first dictionary has an 'a' key. The next(..) call is only interested in the first element emitted by the generator, so it never asks to filter more elements.
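To make the laziness visible, here is a minimal sketch that logs which dictionaries the filter actually inspects. The inspected list and the has_a_equal_5 helper are hypothetical names introduced only for this illustration.

```python
lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]

inspected = []  # hypothetical log of the dictionaries the filter has looked at


def has_a_equal_5(itm):
    """Filter predicate that also records which items it was asked about."""
    inspected.append(itm)
    return itm['a'] == 5


# next() pulls a single element, so the filter runs only until the first match.
first = next(itm for itm in lst if has_a_equal_5(itm))

print(first)      # {'a': 5}
print(inspected)  # [{'a': 5}]
```

Because the first dictionary already matches, the dictionaries without an 'a' key are never reached, and so no KeyError is raised.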

You can use .get(..) here to make the lookup more failsafe:

next((itm for itm in lst if itm.get('b',None)==6))

In case the dictionary has no such key, the .get(..) call will return None. Since None is not equal to 6, the filter will skip the first dictionary and look further for a match. Note that if you do not specify a default value, None is the default, so an equivalent statement is:

next((itm for itm in lst if itm.get('b')==6))

We can also omit the parentheses of the generator expression. The additional parentheses are only needed when the call has multiple arguments:

next(itm for itm in lst if itm.get('b')==6)
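As a sketch of the multiple-argument case, the built-in next() accepts an optional second argument that is returned when the iterator is exhausted. The key 'z' below is an assumption chosen only because no dictionary in the list has it.

```python
lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]

# One argument: the generator expression can borrow the call's parentheses.
match = next(itm for itm in lst if itm.get('b') == 6)

# Two arguments: the generator expression needs its own parentheses.
# The second argument is returned instead of raising StopIteration
# when nothing matches.
missing = next((itm for itm in lst if itm.get('z') == 6), None)

print(match)    # {'b': 6}
print(missing)  # None
```

Passing a default this way is optional, but without it a search that finds nothing ends in StopIteration.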

Take a look at your generator expression separately:

(itm for itm in lst if itm['a']==5)

This will yield all items in the list where itm['a'] == 5. So far so good.

When you call next() on it, you tell Python to generate the first item from that generator expression. But only the first.

So when you have the condition itm['a'] == 5, the generator will take the first element of the list, {'a': 5}, and perform the check on it. The condition is true, so that item is generated by the generator expression and returned by next().

Now, when you change the condition to itm['b'] == 6, the generator will again take the first element of the list, {'a': 5}, and attempt to get the element with the key 'b'. This will fail:

>>> itm = {'a': 5}
>>> itm['b']
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    itm['b']
KeyError: 'b'

It does not even get the chance to look at the second element because it already fails while trying to look at the first element.

To solve this, you have to avoid using an expression that can raise a KeyError here. You could use dict.get() to attempt to retrieve the value without raising an exception:

>>> lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]
>>> next((itm for itm in lst if itm.get('b') == 6))
{'b': 6}

Obviously itm['b'] will raise a KeyError if there is no 'b' key in a dictionary. One way would be to do

next((itm for itm in lst if 'b' in itm and itm['b']==6))

If you don't expect None in any of the dictionaries then you can simplify it to

next((itm for itm in lst if itm.get('b')==6))

(this works the same because you compare to 6, but it would give the wrong result if you compared to None)

or safely with a placeholder

PLACEHOLDER = object()
next((itm for itm in lst if itm.get('b', PLACEHOLDER)==6))
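The difference only shows up when the value being searched for is None itself. The following sketch uses a hypothetical two-item list (not the list from the question) in which one dictionary really stores None under 'b'.

```python
# Hypothetical data: the second dictionary genuinely maps 'b' to None.
lst = [{'a': 5}, {'b': None}]

# With the implicit default, a missing key also yields None, so the first
# dictionary matches even though it has no 'b' key at all.
wrong = next(itm for itm in lst if itm.get('b') is None)

# A unique sentinel object is never None, so a missing key no longer
# looks like a match.
PLACEHOLDER = object()
right = next(itm for itm in lst if itm.get('b', PLACEHOLDER) is None)

print(wrong)  # {'a': 5}
print(right)  # {'b': None}
```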

Indeed, your structure is a list of dictionaries.

>>> lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]

To get a better idea of what is happening with your first condition, try this:

>>> gen = (itm for itm in lst if itm['a'] == 5)
>>> next(gen)
{'a': 5}
>>> next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
KeyError: 'a'

Each time you call next, the generator resumes where it left off and processes elements until one passes the filter, which it then returns. Also...

next((itm for itm in lst if itm['a'] == 5))

This creates a generator that is not assigned to any variable, processes the first element in lst, sees that key 'a' does indeed exist, and returns the item. The generator is then garbage collected. No error is thrown because the first item in lst does indeed contain this key.

So, if you change the key to something that the first item does not contain, you get the error you saw:

>>> gen = (itm for itm in lst if itm['b'] == 6)
>>> next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
KeyError: 'b'
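The interactive sessions above can also be replayed as a script by catching the exception. This is only a sketch; failed_key and after_error are names made up for the illustration.

```python
lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]

gen = (itm for itm in lst if itm['a'] == 5)
first = next(gen)  # the first dictionary passes the filter

try:
    next(gen)  # the filter now runs on {'b': 6}, which has no 'a' key
    failed_key = None
except KeyError as exc:
    failed_key = exc.args[0]

# A generator that raised an exception is finished, so a further call
# only reports exhaustion (here via the default argument of next()).
after_error = next(gen, 'exhausted')

print(first)        # {'a': 5}
print(failed_key)   # a
print(after_error)  # exhausted
```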

The Solution

Well, one solution, as already discussed, is to use the dict.get function. Here's another alternative using defaultdict:

from collections import defaultdict
from functools import partial

f = partial(defaultdict, lambda: None)

lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]
lst = [f(itm) for itm in lst] # create a list of default dicts

for i in (itm for itm in lst if itm['b'] == 6):
    print(i)

This prints out:

defaultdict(<function <lambda> at 0x10231ebf8>, {'b': 6})

The defaultdict will return None if the key is not present.
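One thing to keep in mind with this approach: indexing a defaultdict with a missing key also stores the default under that key, whereas .get() does not. A small sketch, using a hypothetical dictionary d:

```python
from collections import defaultdict

d = defaultdict(lambda: None, {'a': 5})  # hypothetical example dictionary

before = dict(d)
value = d['b']        # missing key: the factory runs and 'b' is inserted
after = dict(d)

via_get = d.get('c')  # .get() bypasses the factory and inserts nothing

print(before)         # {'a': 5}
print(value)          # None
print(after)          # {'a': 5, 'b': None}
print('c' in d)       # False
```

So after the filter has run, every dictionary it inspected may contain an extra 'b' key mapped to None.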

Maybe you can try this:

next(itm for itm in lst for val in itm.values() if val == 6)

This may be a little tricky: it nests two for clauses in a single generator expression, so one next is enough to get the result. Note that it matches a value of 6 under any key, not only 'b'.

