
Fastest way to match two list fields in Python

I have a performance problem in my latest Python script. In essence, I have two lists, e.g. List1: ([a,1],[b,2]) and List2: ([a,3],[b,4]).

In the example above I have given two entries per list. In reality there are about 150,000.

In my current script I take the first field of an entry in the first list, [a], and loop through the whole of List2 until there is a match. The two list entries are then combined.

The final result would be: ([a,1,3],[b,2,4])

However, given the size of the lists, this is taking forever.

Is there a way I can use the field from list1, [a], and retrieve in constant time all entries in list2 that have [a]?

I have seen some answers online suggesting sets, but I am unsure how to implement one and use it to solve the problem above.

Any help would be appreciated.

Further example:

l1=(['abc123','hi'], ['efg456','bye']) - l1 has around 2,000 entries

l2=(['abc123','letter'],['abc123','john'],['abc123','leaf']) - l2 has 100,000+ entries

Output: l3=(['abc123','hi','letter'],['abc123','hi','john'],['abc123','hi','leaf'])

If your keys (the a and b values) are unique within each list, you can convert the "lists" (what you have is actually a tuple of lists, not a list of lists) into dictionaries and then merge them. For example:

l1 = (['a', 1], ['b', 2], ['c', 5])
l2 = (['a', 3], ['b', 4])

d1 = { k : [v] for [k, v] in l1 }
d2 = { k : [v] for [k, v] in l2 }

for k in d1.keys():
    d1[k] += d2.get(k, [])
    
print(d1)

Output:

{'a': [1, 3], 'b': [2, 4], 'c': [5]}

You can convert that dictionary back to a tuple of lists using a comprehension:

print(tuple([k, *v] for k, v in d1.items()))

Output:

(['a', 1, 3], ['b', 2, 4], ['c', 5])
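Note that the dictionary comprehensions above keep only the last value for a repeated key, which matters for the asker's further example, where 'abc123' appears several times in l2. A minimal sketch of one way around that, assuming every value for a key should end up in the same row and that keys missing from l1 can be skipped (both are assumptions, not something the question states), is to collect values in a `collections.defaultdict`:

```python
from collections import defaultdict

l1 = (['abc123', 'hi'], ['efg456', 'bye'])
l2 = (['abc123', 'letter'], ['abc123', 'john'], ['abc123', 'leaf'])

# Collect every value seen for a key, in order: l1 first, then l2.
merged = defaultdict(list)
for key, value in l1:
    merged[key].append(value)
for key, value in l2:
    if key in merged:  # assumption: skip keys that never appear in l1
        merged[key].append(value)

result = tuple([key, *values] for key, values in merged.items())
print(result)
# (['abc123', 'hi', 'letter', 'john', 'leaf'], ['efg456', 'bye'])
```

Each lookup and append takes constant time on average, so the whole merge is linear in the combined length of the two inputs rather than quadratic.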

It's not so hard: just use a dict for list1 and a for loop over list2.

list1 = (['a', 1], ['b', 2], ['c', 5])  # sample data
list2 = (['a', 3], ['b', 4])

# Convert list1 to a dict; each value is wrapped in a list so more can be appended.
dict1 = {key1: [value1] for key1, value1 in list1}
for key2, value2 in list2:
    try:
        dict1[key2].append(value2)
    except KeyError:
        continue  # the question doesn't say what to do with list2 keys missing from list1, so they are ignored
list3 = tuple([key3, *value3] for key3, value3 in dict1.items())
print(list3)
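The loop above appends every match to a single row per key, while the further example in the question shows one output row per matching pair. A rough sketch of that variant follows; it assumes unmatched entries should simply be dropped, and `join_on_key` is a hypothetical helper name, not something from the question. It indexes the second list once and then looks each key up in constant average time:

```python
from collections import defaultdict

def join_on_key(left, right):
    """Return one [key, left_value, right_value] row per matching pair."""
    # Build the index once: key -> all values from `right` with that key.
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    # .get() avoids creating empty entries for keys with no match.
    return tuple(
        [key, left_value, right_value]
        for key, left_value in left
        for right_value in index.get(key, ())
    )

l1 = (['abc123', 'hi'], ['efg456', 'bye'])
l2 = (['abc123', 'letter'], ['abc123', 'john'], ['abc123', 'leaf'])

l3 = join_on_key(l1, l2)
print(l3)
# (['abc123', 'hi', 'letter'], ['abc123', 'hi', 'john'], ['abc123', 'hi', 'leaf'])
```

Building the index costs one pass over l2, so the total work is roughly proportional to the size of both lists plus the number of output rows, instead of one full scan of l2 for every entry in l1.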
