Is an intermediate list necessary in a multi-level list comprehension?
Here is a specific example:
my_dict={k:int(encoded_value)
for (k,encoded_value) in
[encoded_key_value.split('=') for encoded_key_value in
many_encoded_key_values.split(',')]}
The question is about the inner list []: can it be avoided? For example:
# This does not work
my_dict={k:int(encoded_value)
for (k,encoded_value) in
encoded_key_value.split('=') for encoded_key_value in
many_encoded_key_values.split(',')}
..., which is rejected by Python (strictly speaking it fails at runtime rather than at parse time):
NameError: name 'encoded_key_value' is not defined
Sample data: aa=1,bb=2,cc=3,dd=4,ee=-5
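For what it's worth, the un-bracketed attempt does parse: Python reads it as two nested for clauses evaluated left to right, so the first iterable refers to encoded_key_value before the second clause has bound it. A minimal sketch that reproduces this with the sample data (the try/except and the name error are only there for illustration):

```python
many_encoded_key_values = 'aa=1,bb=2,cc=3,dd=4,ee=-5'

try:
    # Equivalent nested loops:
    #   for (k, encoded_value) in encoded_key_value.split('='):      # outer, evaluated first
    #       for encoded_key_value in many_encoded_key_values.split(','):  # inner
    my_dict = {k: int(encoded_value)
               for (k, encoded_value) in encoded_key_value.split('=')
               for encoded_key_value in many_encoded_key_values.split(',')}
    error = None
except NameError as exc:
    # The outer iterable uses a name that only the inner clause would bind.
    error = exc

print(type(error).__name__, error)
```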
As was mentioned, a generator expression improves your approach by avoiding the creation of the inner list. But there is a shorter way to obtain the needed result, using the re.findall() function:
import re
result = {k: int(v) for k, v in re.findall(r'(\w+)=([^,]+)', many_encoded_key_values)}
print(result)
The output:
{'dd': 4, 'aa': 1, 'bb': 2, 'ee': -5, 'cc': 3}
An alternative approach would be to use the re.finditer() function, which returns a 'callable_iterator' instance:
result = {m.group(1):int(m.group(2)) for m in re.finditer(r'(\w+)=([^,]+)', many_encoded_key_values)}
You could avoid creating an intermediate list by using an intermediate generator expression:
my_dict={k:int(encoded_value)
for (k,encoded_value) in
(encoded_key_value.split('=') for encoded_key_value in
many_encoded_key_values.split(','))}
Syntax-wise this is almost the same; instead of generating an intermediate list first and then using its elements, the elements are consumed on the fly.
To make this more explicit (if overly verbose), you could use a 'data pipeline' that consists of generators:
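To illustrate what "consumed on the fly" means, here is a small sketch using the sample data from the question. The names as_list, as_gen, first and rest are made up for this example:

```python
many_encoded_key_values = 'aa=1,bb=2,cc=3,dd=4,ee=-5'

# List comprehension: every [key, value] pair exists before anything iterates over it.
as_list = [item.split('=') for item in many_encoded_key_values.split(',')]

# Generator expression: no item is split on '=' until a pair is requested.
as_gen = (item.split('=') for item in many_encoded_key_values.split(','))

first = next(as_gen)                    # only the first item has been processed
rest = {k: int(v) for k, v in as_gen}   # the remaining pairs are consumed here

print(len(as_list), first, rest)
```

Note that a generator can only be iterated once, which is why rest no longer contains the first pair.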
eq_statements = (item.strip() for item in many_encoded_key_values.split(','))
var_i = (var_i.split('=') for var_i in eq_statements)
my_dict = {var: int(i) for var, i in var_i}
print(my_dict)
(Unfortunately .split does not return a generator, so as far as saving space is concerned this is not of much use here. For handling large files, though, things like this may come in handy.)
I found this answer, which implements split as an iterator, just in case...
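The referenced answer is not reproduced here. As a rough sketch of the general idea only, one way to split lazily is to walk the string with re.finditer. The helper name iter_split is hypothetical, and the sketch assumes a single-character separator and that empty fields can be skipped:

```python
import re

def iter_split(text, sep=','):
    """Yield the non-empty fields of text one at a time (hypothetical helper)."""
    # Match maximal runs of characters that are not the separator.
    pattern = '[^' + re.escape(sep) + ']+'
    for match in re.finditer(pattern, text):
        yield match.group(0)

many_encoded_key_values = 'aa=1,bb=2,cc=3,dd=4,ee=-5'
my_dict = {k: int(v)
           for k, v in (field.split('=')
                        for field in iter_split(many_encoded_key_values))}
print(my_dict)
```

With this, neither the list of fields nor the list of pairs is ever materialized; only the final dict is.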
FWIW, here's a functional approach:
data = 'aa=1,bb=2,cc=3,dd=4,ee=-5'

def convert(s):
    # Split one 'key=value' item and convert the value to int.
    k, v = s.split('=')
    return k, int(v)

d = dict(map(convert, data.split(',')))
print(d)
Output:
{'aa': 1, 'bb': 2, 'cc': 3, 'dd': 4, 'ee': -5}
A simple and compact variant that is very close to your original attempt:
d = {v.strip(): int(i) for s in data.split(',') for v, i in (s.split('='),)}
The only additional 'trick' is to wrap s.split('=') in a one-element tuple, (s.split('='),), so that both elements of the split are unpacked in the same for iteration. The rest is straightforward.
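Spelled out as plain loops, the one-element tuple makes the inner for run exactly once per item, so it acts as an unpacking assignment. A small sketch with the sample data (d_loop and d_comp are names chosen for this illustration):

```python
data = 'aa=1,bb=2,cc=3,dd=4,ee=-5'

d_loop = {}
for s in data.split(','):
    # Iterating over a one-element tuple runs the body once,
    # binding v and i to the two halves of the split.
    for v, i in (s.split('='),):
        d_loop[v.strip()] = int(i)

# The same logic as a single dict comprehension.
d_comp = {v.strip(): int(i) for s in data.split(',') for v, i in (s.split('='),)}

print(d_loop == d_comp)
```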