Only using items from one list once in nested list comprehension

I'm trying to use a list comprehension to generate a new list in which each item consists of a letter taken from list1, followed by a colon and then the words from list2 that start with that letter. I managed to code this using nested for loops as follows:

list1=["A","B"]
list2=["Apple","Banana","Balloon","Boxer","Crayons","Elephant"]

newlist=[]
for i in list1:
    newlist.append(i+":")
    for j in list2:
        if j[0]==i:
            newlist[-1]+=j+","

resulting in the intended result: ['A:Apple,', 'B:Banana,Balloon,Boxer,']

Trying the same using a list comprehension, I came up with the following:

list1=["A","B"]
list2=["Apple","Banana","Balloon","Boxer","Crayons","Elephant"]

newlist=[i+":"+j+"," for i in list1 for j in list2 if i==j[0]]

resulting in: ['A:Apple,', 'B:Banana,', 'B:Balloon,', 'B:Boxer,']

Here, each time a word with that starting letter is found, a new item is created in newlist, while my intention is to have one item per letter.

Is there a way to edit the list comprehension code in order to obtain the same result as using the nested for loops?

All you need to do is remove the second for loop and replace it with a ','.join(matching_words) call at the point where you currently use j in the string concatenation:

newlist = ['{}:{}'.format(l, ','.join([w for w in list2 if w[0] == l])) for l in list1]

This isn't very efficient; you loop over all the words in list2 for each letter. To do this efficiently, you would be better off preprocessing list2 into a dictionary:

list2_map = {}
for word in list2:
    list2_map.setdefault(word[0], []).append(word)

newlist = ['{}:{}'.format(l, ','.join(list2_map.get(l, []))) for l in list1]

The first loop builds a dictionary mapping each initial letter to a list of words, so that you can use those lists directly instead of using a nested list comprehension.

Demo:

>>> list1 = ['A', 'B']
>>> list2 = ['Apple', 'Banana', 'Balloon', 'Boxer', 'Crayons', 'Elephant']
>>> list2_map = {}
>>> for word in list2:
...     list2_map.setdefault(word[0], []).append(word)
...
>>> ['{}:{}'.format(l, ','.join(list2_map.get(l, []))) for l in list1]
['A:Apple', 'B:Banana,Balloon,Boxer']
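
As a variation on the same idea, the grouping step could also be written with collections.defaultdict instead of dict.setdefault. This is only a sketch of an alternative spelling, not something from the answer itself; the name words_by_letter is made up for illustration.

```python
from collections import defaultdict

list1 = ["A", "B"]
list2 = ["Apple", "Banana", "Balloon", "Boxer", "Crayons", "Elephant"]

# Group the words by their first letter in a single pass over list2.
words_by_letter = defaultdict(list)  # hypothetical name, not from the answer
for word in list2:
    words_by_letter[word[0]].append(word)

# .get() is used for the lookup so that letters without any words
# do not create new empty entries in the defaultdict.
newlist = ["{}:{}".format(l, ",".join(words_by_letter.get(l, []))) for l in list1]
print(newlist)  # ['A:Apple', 'B:Banana,Balloon,Boxer']
```

The behaviour matches the setdefault version; which of the two reads better is largely a matter of taste.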

The above algorithm loops through list2 at most twice (once to build the dictionary, and once more across the join calls, which touch each matching word once) and once through list1. With N words and M letters that makes it an O(N + M) linear algorithm: adding a single word to list2 or a single letter to list1 increases the running time by a constant amount. Your version loops over list2 once for every letter in list1, making it an O(NM) algorithm: the running time grows with the product of the two list lengths, so every added letter costs another full pass over list2.

To put that into numbers, if you expanded list1 to cover all 26 ASCII uppercase letters and expanded list2 to contain 1000 words, your approach (scanning all of list2 for words with a given letter) would take 26000 steps. My version, including pre-building the map, takes only 2026 steps. With list2 containing 1 million words, your version has to take 26 million steps, mine 2,000,026.
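
The step counts quoted above can be reproduced with a rough counting sketch. The word list here is made up purely for illustration (1000 generated strings, each starting with an uppercase ASCII letter), and a "step" is taken to mean one visit to a word or one dictionary lookup, which is an assumption about how the answer counts.

```python
import string

letters = list(string.ascii_uppercase)  # 26 letters
# Hypothetical data: 1000 generated "words", each starting with an uppercase letter.
words = [letters[i % 26] + "word" + str(i) for i in range(1000)]

# Nested scan: every letter is checked against every word.
scan_steps = sum(1 for l in letters for w in words)

# Dictionary approach: one pass to build the map ...
word_map = {}
build_steps = 0
for w in words:
    word_map.setdefault(w[0], []).append(w)
    build_steps += 1

# ... then one lookup per letter, plus one visit per word while joining.
lookup_steps = len(letters)
join_steps = sum(len(word_map.get(l, [])) for l in letters)
map_steps = build_steps + lookup_steps + join_steps

print(scan_steps, map_steps)  # 26000 2026
```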

list1=["A","B"]
list2=["Apple","Banana","Balloon","Boxer","Crayons","Elephant"]

res = [l1 + ':' + ','.join(l2 for l2 in list2 if l2.startswith(l1)) for l1 in list1]
print(res)

# ['A:Apple', 'B:Banana,Balloon,Boxer']

But it seems to be complicated to read, so I would advise using nested loops. You can create a generator for more readability (if you think this version is more readable):

def f(list1, list2):
    for l1 in list1:
        val = ','.join(l2 for l2 in list2 if l2.startswith(l1))
        yield l1 + ':' + val

print(list(f(list1, list2)))

# ['A:Apple', 'B:Banana,Balloon,Boxer']
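
Note that the outputs shown in the answers have no trailing comma, whereas the nested-loop version in the question ends every item with one. If an exact match with the loop output is actually wanted, one possible sketch is to append the comma to each matching word and concatenate, rather than joining on ',':

```python
list1 = ["A", "B"]
list2 = ["Apple", "Banana", "Balloon", "Boxer", "Crayons", "Elephant"]

# Append "," to every matching word instead of joining with ",",
# which reproduces the trailing comma of the nested-loop version.
newlist = [l + ":" + "".join(w + "," for w in list2 if w.startswith(l)) for l in list1]
print(newlist)  # ['A:Apple,', 'B:Banana,Balloon,Boxer,']
```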
