Fastest way to match fields of two lists in Python

Question

I have a performance problem in my latest Python script. In essence, I have two lists, e.g. List1: ([a,1],[b,2]) and List2: ([a,3],[b,4]).

In the example above I have provided two entries in each list. In reality, however, there are about 150,000.

In my current script I retrieve the first field from the first list [a] and loop through the entire List2 until there is a match. The two list entries are then combined.

The final result would be: ([a,1,3],[b,2,4])

However, given the size of the lists, this is taking forever.

Is there a way I can use the field from list1 [a] and, in constant time, retrieve all entries in list2 that have [a]?

I have seen some answers online suggesting sets, but I am unsure how to implement one and use it to solve the problem above.

Any help would be appreciated.

Further example:

l1=(['abc123','hi'], ['efg456','bye']) - l1 has around 2000 tuples

l2=(['abc123','letter'],['abc123','john'],['abc123','leaf']) - l2 has around 100,000+ tuples

Output: l3=(['abc123','hi','letter'],['abc123','hi','john'],['abc123','hi','leaf'])

Answer 1

If your keys (the a and b values) are unique, you can convert the "lists" (what you have is actually a tuple of lists, not a list of lists) into dictionaries and then merge them. For example:

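l1 = (['a', 1], ['b', 2], ['c', 5])
l2 = (['a', 3], ['b', 4])

# Map each key to a one-element list holding its value.
d1 = {k: [v] for k, v in l1}
d2 = {k: [v] for k, v in l2}

# Extend each d1 entry with the matching d2 values; keys missing from d2 add nothing.
for k in d1:
    d1[k] += d2.get(k, [])

print(d1)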

Output:

{'a': [1, 3], 'b': [2, 4], 'c': [5]}

You can convert that dictionary back to a tuple of lists using a comprehension:

print(tuple([k, *v] for k, v in d1.items()))

Output:

(['a', 1, 3], ['b', 2, 4], ['c', 5])
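The dictionary comprehension above keeps only the last value for a repeated key, which is why it relies on the keys being unique. If keys in the second collection can repeat, as in the question's further example, one possible variant is to group the values with `collections.defaultdict`. This is only a sketch and not part of the original answer; the helper name `group_by_key` is made up for illustration.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Map each key to the list of all values seen for it, in input order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


l1 = (['abc123', 'hi'], ['efg456', 'bye'])
l2 = (['abc123', 'letter'], ['abc123', 'john'], ['abc123', 'leaf'])

g2 = group_by_key(l2)  # a single pass over l2

# One dictionary lookup per l1 entry; keys missing from l2 contribute nothing.
merged = tuple([k, v, *g2.get(k, [])] for k, v in l1)
print(merged)
```

Each lookup in `g2` takes constant time on average, so the total work grows with the combined length of the two collections rather than with their product.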

Answer 2

This is not so hard: just use a dict for list1 and a for loop over list2.

# Sample data taken from the question; substitute your own list1 and list2.
list1 = (['a', 1], ['b', 2])
list2 = (['a', 3], ['b', 4])

# Convert list1 to a dict whose values are lists, so matches can be appended.
dict1 = {key1: [value1] for key1, value1 in list1}

for key2, value2 in list2:
    try:
        dict1[key2].append(value2)
    except KeyError:
        # The question does not say what to do with keys in list2 that are
        # missing from list1, so they are simply ignored here.
        continue

list3 = tuple([key3, *value3] for key3, value3 in dict1.items())
print(list3)
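The question's further example expects one output row per matching pair rather than one merged row per key. A minimal sketch of that join-style output, assuming the smaller collection is indexed and the larger one is streamed through once, could look like the following. The function name `join_on_first_field` is hypothetical and not from the original answers.

```python
from collections import defaultdict


def join_on_first_field(left, right):
    """Return one [key, left_value, right_value] row for every matching pair."""
    # Index the left collection by key; a key may map to several values.
    index = defaultdict(list)
    for key, value in left:
        index[key].append(value)

    rows = []
    for key, right_value in right:
        # Average constant-time lookup; keys absent from the index are skipped.
        for left_value in index.get(key, ()):
            rows.append([key, left_value, right_value])
    return tuple(rows)


l1 = (['abc123', 'hi'], ['efg456', 'bye'])
l2 = (['abc123', 'letter'], ['abc123', 'john'], ['abc123', 'leaf'])

l3 = join_on_first_field(l1, l2)
print(l3)
```

Indexing `l1` (around 2000 entries in the question) keeps the dictionary small, while `l2` is only iterated once.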

Answer 1
2 2020-06-29 01:15:33

Answer 2
2 accepted 2020-06-29 01:28:00
