
Optimizing a nested for loop with two lists

I have a program that searches through two separate lists; let's call them list1 and list2.

I only want to print the instances where list1 and list2 have matching items. The thing is, not all items in both lists match each other, but the first, third and fourth items should.

If they match, I want the complete lists (including the mismatching items) to be appended to two corresponding lists.

I have written the following code:

for item in list1:
    for item2 in list2:
        if (item[0] and item[2:4])==(item[0] and item2[2:4]):
            newlist1.append(item)
            newlist2.append(item2)
            break

This works, but it's quite inefficient. For some of the larger files I'm looking through, it can take more than 10 seconds to complete the matching, and it should ideally take at most half of that.

What I'm thinking is that it shouldn't have to start over from the beginning of list2 each time the inner loop runs; it should be enough to continue from the last point where there was a match. But I don't know how to write that in code.

Your condition (item[0] and item[2:4])==(item[0] and item2[2:4]) is wrong.

Besides the fact that the second item[0] should probably be item2[0], what (item[0] and item[2:4]) does is the following (analogously for (item2[0] and item2[2:4])):

  • if item[0] is 0 (or any other falsy value), it returns item[0] itself, i.e. 0
  • if item[0] is not 0, it returns whatever item[2:4] is

And this is then compared to the result of the second term. Thus, [0,1,1,1] would "equal" [0,2,2,2], and [1,1,1,1] would "equal" [2,1,1,1].
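
The behavior of and described above can be checked directly. This is a minimal sketch; the helper name buggy_key is made up for illustration and does not appear in the original code:

```python
def buggy_key(item):
    # `a and b` evaluates to a if a is falsy, otherwise to b
    return item[0] and item[2:4]

# First item is 0, so only the 0 is compared and the rest is ignored
print(buggy_key([0, 1, 1, 1]) == buggy_key([0, 2, 2, 2]))  # True

# First item is non-zero, so only the slices [2:4] are compared
print(buggy_key([1, 1, 1, 1]) == buggy_key([2, 1, 1, 1]))  # True
```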

Try using tuples instead:

if (item[0], item[2:4]) == (item2[0], item2[2:4]):

Or use operator.itemgetter, as suggested in the other answer.
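
As a quick check, both variants can be run against a few sample pairs, including the two false matches mentioned earlier. This is only a sketch; tuple_key and the sample pairs are hypothetical names and data chosen for illustration:

```python
from operator import itemgetter

def tuple_key(item):
    # Compare the first item and the slice [2:4] together as one tuple
    return (item[0], item[2:4])

getter = itemgetter(0, 2, 3)  # returns the tuple (item[0], item[2], item[3])

pairs = [
    ([0, 1, 1, 1], [0, 2, 2, 2]),  # same first item, different third/fourth
    ([1, 1, 1, 1], [2, 1, 1, 1]),  # different first item, same third/fourth
    ([1, 5, 2, 3], [1, 9, 2, 3]),  # only the second item differs
]

for a, b in pairs:
    # Prints: False False, then False False, then True True
    print(tuple_key(a) == tuple_key(b), getter(a) == getter(b))
```

One practical difference: the itemgetter result contains only the selected elements, so it can serve as a dictionary key as long as those elements are hashable, whereas a tuple that contains a list slice cannot.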


To speed up the pairwise matching of items from both lists, put the items from the first list into a dictionary, using those tuples as keys, and then iterate over the other list and look up the matching items in the dictionary. Complexity will be O(n+m) instead of O(n*m) (n and m being the lengths of the lists).

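import operator

# Build the lookup key from the first, third and fourth items
key = operator.itemgetter(0, 2, 3)

# Group the items of list1 by their key
list1_dict = {}
for item in list1:
    list1_dict.setdefault(key(item), []).append(item)

# Look up each item of list2 and record every matching pair
for item2 in list2:
    for item in list1_dict.get(key(item2), []):
        newlist1.append(item)
        newlist2.append(item2)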
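
For a self-contained run, the same idea can be wrapped in a function. This is a sketch under assumptions: the function name match_by_key and the sample data are invented for illustration, and each item is assumed to be a list with at least four elements whose first, third and fourth values are hashable.

```python
from operator import itemgetter

def match_by_key(list1, list2, key=itemgetter(0, 2, 3)):
    """Return (newlist1, newlist2) holding every pair whose keys are equal."""
    # Group the items of list1 by key: one pass over list1
    index = {}
    for item in list1:
        index.setdefault(key(item), []).append(item)

    newlist1, newlist2 = [], []
    # One pass over list2, with a dictionary lookup per item
    for item2 in list2:
        for item in index.get(key(item2), []):
            newlist1.append(item)
            newlist2.append(item2)
    return newlist1, newlist2

# Hypothetical sample data
list1 = [["a", 1, "x", 10], ["b", 2, "y", 20], ["c", 3, "z", 30]]
list2 = [["c", 9, "z", 30], ["a", 8, "x", 10], ["b", 7, "y", 99]]

newlist1, newlist2 = match_by_key(list1, list2)
print(newlist1)  # [['c', 3, 'z', 30], ['a', 1, 'x', 10]]
print(newlist2)  # [['c', 9, 'z', 30], ['a', 8, 'x', 10]]
```

Unlike the original nested loop, this records every matching pair rather than stopping at the first match for each item of list1, and the results follow the order of list2.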
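
from operator import itemgetter

getter = itemgetter(0, 2, 3)
# zip pairs the lists by position, so only items at the same index are compared
for item, item2 in zip(list1, list2):
    if getter(item) == getter(item2):
        newlist1.append(item)
        newlist2.append(item2)
        break  # stops the whole loop after the first matching pair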

This may reduce the time complexity a bit, though note that zip only compares items at the same position in the two lists.
