Is an intermediate list necessary in a multi-level list comprehension?

Here is a specific example:

my_dict={k:int(encoded_value) 
         for (k,encoded_value) in 
             [encoded_key_value.split('=') for encoded_key_value in 
              many_encoded_key_values.split(',')]}

The question is about the inner list []: can it be avoided? For example:

# This does not work: it raises NameError when evaluated
my_dict={k:int(encoded_value) 
         for (k,encoded_value) in 
             encoded_key_value.split('=') for encoded_key_value in 
             many_encoded_key_values.split(',')}

..., which fails when evaluated, because the first for clause uses encoded_key_value before the second for clause has bound it:

NameError: name 'encoded_key_value' is not defined

Sample data: aa=1,bb=2,cc=3,dd=4,ee=-5
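A quick check with the sample data suggests that the second form is accepted by the compiler and only fails once it runs. This is a minimal sketch: the expression is kept in a string (named source here purely for illustration) so that compiling and evaluating can be shown as separate steps.

```python
many_encoded_key_values = "aa=1,bb=2,cc=3,dd=4,ee=-5"

# The bracket-less comprehension from the question, kept as text so that
# compiling and evaluating can be observed separately.
source = """{k: int(encoded_value)
 for (k, encoded_value) in encoded_key_value.split('=')
 for encoded_key_value in many_encoded_key_values.split(',')}"""

# Compiling succeeds, so the syntax itself is accepted.
code = compile(source, "<example>", "eval")

# Evaluating fails: the iterable of the first `for` clause is evaluated
# before the second clause has bound encoded_key_value.
try:
    eval(code, {"many_encoded_key_values": many_encoded_key_values})
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__

print(error_name)
```

The for clauses of a comprehension nest left to right, like ordinary nested for loops, so the clause that binds a name has to come before the clause that uses it.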

As was mentioned, a generator expression improves your approach by avoiding the creation of the inner list. But there is a shorter way to obtain the needed result, using the re.findall() function:

import re

result = {k:int(v) for k,v in re.findall(r'(\w+)=([^,]+)', many_encoded_key_values)}
print(result)

The output:

{'dd': 4, 'aa': 1, 'bb': 2, 'ee': -5, 'cc': 3}

An alternative approach is to use the re.finditer() function, which returns an iterator of match objects (a 'callable_iterator' instance):

result = {m.group(1):int(m.group(2)) for m in re.finditer(r'(\w+)=([^,]+)', many_encoded_key_values)}

You could avoid creating an intermediate list by using an intermediate generator expression:

my_dict={k:int(encoded_value)
         for (k,encoded_value) in
             (encoded_key_value.split('=') for encoded_key_value in
              many_encoded_key_values.split(','))}

Syntax-wise this is almost the same; instead of generating an intermediate list first and then using the elements, the elements are consumed on the fly.
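To see what "consumed on the fly" means, here is a small sketch that records the order in which work happens. The split_pair helper and the events list are hypothetical names added only for this illustration, and the dict is filled with an explicit loop so that each step can be logged.

```python
many_encoded_key_values = "aa=1,bb=2,cc=3,dd=4,ee=-5"
events = []  # records the order in which work happens

def split_pair(item):
    # Hypothetical helper: same as item.split("="), but logs the call.
    events.append("split " + item)
    return item.split("=")

# Creating the generator expression does not split anything yet.
pairs = (split_pair(item) for item in many_encoded_key_values.split(","))
created_lazily = (events == [])

# Each pair is produced only when the loop asks for it.
my_dict = {}
for k, encoded_value in pairs:
    events.append("store " + k)
    my_dict[k] = int(encoded_value)

print(events[:4])
print(my_dict)
```

The log alternates between "split" and "store" entries, so at no point does a full list of split pairs exist.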


To make this more explicit (if overly verbose), you could use a 'data pipeline' that consists of generators:

eq_statements = (item.strip() for item in many_encoded_key_values.split(','))
var_i = (var_i.split('=') for var_i in eq_statements)
my_dict = {var: int(i) for var, i in var_i}
print(my_dict)

(Unfortunately, .split does not return a generator, so as far as saving space is concerned this is not of much use here. For handling large files, things like this may come in handy.)

I found this answer, which implements split as an iterator. Just in case...
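The referenced answer is not reproduced here. As a rough sketch of the idea, a lazy split could be written as a generator around str.find. The name iter_split is hypothetical, and the function is only meant to mirror str.split(sep) for a non-empty separator.

```python
def iter_split(text, sep=","):
    """Yield the fields of text one at a time (hypothetical helper).

    Intended to mirror text.split(sep) for a non-empty sep, without
    building the whole list of fields first.
    """
    start = 0
    while True:
        end = text.find(sep, start)
        if end == -1:
            # No further separator: the rest of the string is the last field.
            yield text[start:]
            return
        yield text[start:end]
        start = end + len(sep)

data = "aa=1,bb=2,cc=3,dd=4,ee=-5"

# Same pipeline as above, with the lazy split at its head.
pairs = (item.split("=") for item in iter_split(data))
my_dict = {k: int(v) for k, v in pairs}
print(my_dict)
```

For a string as short as the sample data this saves nothing noticeable; it only matters when the input is large.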

FWIW, here's a functional approach:

def convert(s):
    k, v = s.split('=')
    return k, int(v)

data = 'aa=1,bb=2,cc=3,dd=4,ee=-5'  # the sample data from the question
d = dict(map(convert, data.split(',')))
print(d)

Output:

{'aa': 1, 'bb': 2, 'cc': 3, 'dd': 4, 'ee': -5}

A simple and compact variant that is very close to your original attempt:

d = {v.strip(): int(i) for s in data.split(',') for v, i in (s.split('='),)}

The only additional 'trick' was to wrap s.split('=') in a one-element tuple, (s.split('='),), so that both elements of the split are unpacked in the same for iteration. The rest is straightforward.
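Written out as plain loops, the comprehension above behaves roughly like the following sketch (data is assumed to hold the sample string from the question). The inner loop runs exactly once per item, which is all the one-element tuple is there for.

```python
data = "aa=1,bb=2,cc=3,dd=4,ee=-5"

# Explicit-loop equivalent of the comprehension.
d_loops = {}
for s in data.split(","):
    # The one-element tuple makes this inner loop run exactly once,
    # unpacking the two halves of the split into v and i.
    for v, i in (s.split("="),):
        d_loops[v.strip()] = int(i)

# The comprehension form, for comparison.
d_comp = {v.strip(): int(i) for s in data.split(",") for v, i in (s.split("="),)}

print(d_loops == d_comp)
```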
