List Comprehension and Generators to avoid computing the same value twice when using conditional expressions
Suppose you have an expensive, CPU-bound function, such as one that parses an XML string. For this example, a simple stand-in function will do:
def parse(foo):
    return int(foo)
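As input you have a list of strings, and you want to parse them and find the subset of parsed values that meets some condition. Ideally, each string should be parsed only once.
Without a list comprehension you could write:
olds = ["1", "2", "3", "4", "5"]
news = []
for old in olds:
    new = parse(old)  # First and only Parse
    if new > 3:
        news.append(new)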
To write this as a list comprehension, it seems you have to parse twice, once to get the new value and once for the condition check:
olds = ["1", "2", "3", "4", "5"]
news = [
    parse(new)  # First Parse
    for new in olds
    if parse(new) > 3  # Second Parse
]
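To make the double parse visible, here is a minimal sketch that counts calls. The `parse.calls` attribute is a counter added purely for illustration; it is not part of the original function.

```python
def parse(foo):
    """Stand-in for an expensive parser; counts how often it runs."""
    parse.calls += 1  # illustrative counter, not in the original code
    return int(foo)

parse.calls = 0

olds = ["1", "2", "3", "4", "5"]
news = [
    parse(new)         # runs again for every item that passed the check
    for new in olds
    if parse(new) > 3  # runs once for every item
]

print(news)         # [4, 5]
print(parse.calls)  # 7: five condition checks plus two re-parses
```

So the cost is one parse per input plus one extra parse per item that survives the filter.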
For example, this syntax does not work:
olds = ["1", "2", "3", "4", "5"]
# Raises SyntaxError: can't assign to function call
news = [i for parse(i) in olds if i > 5]
Using a generator seems to work:
def parse(strings):
    for string in strings:
        yield int(string)
olds = ["1", "2", "3", "4", "5"]
news = [i for i in parse(olds) if i > 3]
However, you could also move the condition into the generator:
def parse(strings):
    for string in strings:
        val = int(string)
        if val > 3:
            yield val
olds = ["1", "2", "3", "4", "5"]
news = [i for i in parse(olds)]
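What I'd like to know is which is better in terms of optimization (as opposed to reusability and so on): parsing in the generator but doing the condition check in the list comprehension, or doing both the parsing and the condition check in the generator? Is there a better option than either of these?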
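One way to compare the two variants empirically is a small `timeit` run. This is only a sketch: the function names, the input size, and the repeat count are my own choices, and the numbers will vary by machine and Python version.

```python
import timeit

def parse_only(strings):
    # Generator that parses; the filter lives in the comprehension.
    for string in strings:
        yield int(string)

def parse_and_filter(strings):
    # Generator that both parses and filters.
    for string in strings:
        val = int(string)
        if val > 3:
            yield val

def filter_in_comprehension(strings):
    return [i for i in parse_only(strings) if i > 3]

def filter_in_generator(strings):
    return [i for i in parse_and_filter(strings)]

# Hypothetical larger input so the timings are measurable.
olds = ["1", "2", "3", "4", "5"] * 1000

# Both variants must agree before timing them means anything.
assert filter_in_comprehension(olds) == filter_in_generator(olds)

for func in (filter_in_comprehension, filter_in_generator):
    seconds = timeit.timeit(lambda: func(olds), number=100)
    print(f"{func.__name__}: {seconds:.4f} s")
```

With a genuinely expensive `parse`, the parsing cost is likely to dominate either way, so any difference between the two placements of the condition should be small by comparison.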
Here is some output from dis.dis in Python 3.6.5. Note that in my version of Python, to disassemble the list comprehension we have to use f.__code__.co_consts[1]. See this answer for an explanation.
import dis
def parse(strings):
    for string in strings:
        yield int(string)
def main(strings):
    return [i for i in parse(strings) if i > 3]
assert main(["1", "2", "3", "4", "5"]) == [4, 5]
dis.dis(main.__code__.co_consts[1])
"""
2 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (3)
12 COMPARE_OP 4 (>)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
"""
dis.dis(parse)
"""
2 0 SETUP_LOOP 22 (to 24)
2 LOAD_FAST 0 (strings)
4 GET_ITER
>> 6 FOR_ITER 14 (to 22)
8 STORE_FAST 1 (string)
3 10 LOAD_GLOBAL 0 (int)
12 LOAD_FAST 1 (string)
14 CALL_FUNCTION 1
16 YIELD_VALUE
18 POP_TOP
20 JUMP_ABSOLUTE 6
>> 22 POP_BLOCK
>> 24 LOAD_CONST 0 (None)
26 RETURN_VALUE
"""
def parse(strings):
    for string in strings:
        val = int(string)
        if val > 3:
            yield val
def main(strings):
    return [i for i in parse(strings)]
assert main(["1", "2", "3", "4", "5"]) == [4, 5]
dis.dis(main.__code__.co_consts[1])
"""
2 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 8 (to 14)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LIST_APPEND 2
12 JUMP_ABSOLUTE 4
>> 14 RETURN_VALUE
"""
dis.dis(parse)
"""
2 0 SETUP_LOOP 34 (to 36)
2 LOAD_FAST 0 (strings)
4 GET_ITER
>> 6 FOR_ITER 26 (to 34)
8 STORE_FAST 1 (string)
3 10 LOAD_GLOBAL 0 (int)
12 LOAD_FAST 1 (string)
14 CALL_FUNCTION 1
16 STORE_FAST 2 (val)
4 18 LOAD_FAST 2 (val)
20 LOAD_CONST 1 (3)
22 COMPARE_OP 4 (>)
24 POP_JUMP_IF_FALSE 6
5 26 LOAD_FAST 2 (val)
28 YIELD_VALUE
30 POP_TOP
32 JUMP_ABSOLUTE 6
>> 34 POP_BLOCK
>> 36 LOAD_CONST 0 (None)
38 RETURN_VALUE
"""
def parse(string):
    return int(string)
def main(strings):
    values = []
    for string in strings:
        value = parse(string)
        if value > 3:
            values.append(value)
    return values
assert main(["1", "2", "3", "4", "5"]) == [4, 5]
dis.dis(main)
"""
2 0 BUILD_LIST 0
2 STORE_FAST 1 (values)
3 4 SETUP_LOOP 38 (to 44)
6 LOAD_FAST 0 (strings)
8 GET_ITER
>> 10 FOR_ITER 30 (to 42)
12 STORE_FAST 2 (string)
4 14 LOAD_GLOBAL 0 (parse)
16 LOAD_FAST 2 (string)
18 CALL_FUNCTION 1
20 STORE_FAST 3 (value)
5 22 LOAD_FAST 3 (value)
24 LOAD_CONST 1 (3)
26 COMPARE_OP 4 (>)
28 POP_JUMP_IF_FALSE 10
6 30 LOAD_FAST 1 (values)
32 LOAD_ATTR 1 (append)
34 LOAD_FAST 3 (value)
36 CALL_FUNCTION 1
38 POP_TOP
40 JUMP_ABSOLUTE 10
>> 42 POP_BLOCK
7 >> 44 LOAD_FAST 1 (values)
46 RETURN_VALUE
"""
dis.dis(parse)
"""
2 0 LOAD_GLOBAL 0 (int)
2 LOAD_FAST 0 (string)
4 CALL_FUNCTION 1
6 RETURN_VALUE
"""
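The hard-coded index in `main.__code__.co_consts[1]` depends on the interpreter version. A somewhat more robust sketch looks for code objects among the function's constants instead. The helper name `nested_code_objects` is hypothetical. As far as I know, from Python 3.12 onward comprehensions are compiled inline, so the list may be empty there and the loop then shows up in `dis.dis(main)` itself.

```python
import dis
import types

def parse(strings):
    for string in strings:
        yield int(string)

def main(strings):
    return [i for i in parse(strings) if i > 3]

def nested_code_objects(func):
    """Return the code objects stored among func's constants (hypothetical helper)."""
    return [c for c in func.__code__.co_consts if isinstance(c, types.CodeType)]

nested = nested_code_objects(main)
for code in nested:
    print(code.co_name)
    dis.dis(code)

if not nested:
    # No separate comprehension code object: disassemble the function itself.
    dis.dis(main)
```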
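Note that the disassembly of the first two versions, which use a list comprehension with a generator, shows two for loops: one in main (the list comprehension) and one in parse (the generator). That isn't as bad as it sounds, right? For example, the whole operation is O(n) and not O(n^2)?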
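A small sketch of why the two loops stay linear: the generator is lazy, so each item is parsed and then checked before the next one is touched. The `events` list and the `keep` helper are additions for illustration only.

```python
events = []  # illustrative trace of what runs, in order

def parse(strings):
    for string in strings:
        events.append(f"parse {string}")
        yield int(string)

def keep(value):
    # Same condition as `i > 3`, wrapped so the check can be logged.
    events.append(f"check {value}")
    return value > 3

def main(strings):
    return [i for i in parse(strings) if keep(i)]

result = main(["1", "2", "3", "4", "5"])

print(result)       # [4, 5]
print(events[:4])   # ['parse 1', 'check 1', 'parse 2', 'check 2']
print(len(events))  # 10: one parse and one check per input string
```

The two loops advance in lockstep rather than being nested, so n inputs cost n parses and n checks.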
def parse(string):
    return int(string)
def main(strings):
    return [val for val in (parse(string) for string in strings) if val > 3]
assert main(["1", "2", "3", "4", "5"]) == [4, 5]
dis.dis(main.__code__.co_consts[1])
"""
2 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (val)
8 LOAD_FAST 1 (val)
10 LOAD_CONST 0 (3)
12 COMPARE_OP 4 (>)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (val)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
"""
dis.dis(parse)
"""
2 0 LOAD_GLOBAL 0 (int)
2 LOAD_FAST 0 (string)
4 CALL_FUNCTION 1
6 RETURN_VALUE
"""
I think you can do this more simply than you think:
olds = ["1", "2", "3", "4", "5"]
news = [new for new in (parse(old) for old in olds) if new > 3]
Or just:
news = [new for new in map(parse, olds) if new > 3]
Either way, parse is called only once per item.
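A quick way to check the one-call-per-item claim is to count calls, as in the sketch below. The `parse.calls` counter and the wrapper function names are mine. The third variant uses an assignment expression, which is not in the answer above and assumes Python 3.8 or later.

```python
def parse(foo):
    parse.calls += 1  # illustrative counter, not in the original code
    return int(foo)

parse.calls = 0

def with_nested_generator(strings):
    return [new for new in (parse(old) for old in strings) if new > 3]

def with_map(strings):
    return [new for new in map(parse, strings) if new > 3]

def with_walrus(strings):
    # Assumption: requires Python 3.8+ for the := operator.
    return [new for old in strings if (new := parse(old)) > 3]

olds = ["1", "2", "3", "4", "5"]

for build in (with_nested_generator, with_map, with_walrus):
    parse.calls = 0
    result = build(olds)
    print(build.__name__, result, parse.calls)  # each: [4, 5] with 5 calls
```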