Finding the intersection of a list of dictionaries based on an arbitrary field

Say I have two lists of dictionaries, l1 and l2. Each contains dictionaries of the same format. I'd like to find the intersection of l1 and l2 based on some field of the dictionaries.

For example, let

l1 = [{"key": 1, "key2": 2}, {"key": 1, "key2": 0}]
l2 = [{"key": 0, "key2": 2}]

I'd like to intersect them based on "key2". So, conceptually, l1.intersect(l2) = {2}, since 2 is the only "key2" value that appears in both lists.

I can do this as follows, which has a complexity of O(len(l1) + len(l2)), if I'm not mistaken.

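from collections import defaultdict

# Mark every "key2" value that occurs in l2.
d = defaultdict(bool)
for e in l2:
    d[e['key2']] = True

# Collect the "key2" values of l1 that were marked.
intersection = set()
for e in l1:
    if d[e['key2']]:
        intersection.add(e['key2'])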

What I wonder is whether a better solution exists or whether my solution is already optimal.

You can make this compact by using set comprehensions. For example,

l1 = [{"key":1, "key2":2}, {"key":3, "key2":4}, {"key":5, "key2":6}, {"key":7, "key2":8}]
l2 = [{"key":0, "key2":2}, {"key":1, "key2":3}, {"key":2, "key2":4}]

key = "key2"
values = {d[key] for d in l1} & {d[key] for d in l2}
print(values)

output

{2, 4}
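
If you need the matching dictionaries themselves rather than just the shared values, the resulting set can be used as a filter. This is only a sketch that reuses the sample data above; the name `matches` is just illustrative and not part of the original code.

```python
l1 = [{"key": 1, "key2": 2}, {"key": 3, "key2": 4}, {"key": 5, "key2": 6}, {"key": 7, "key2": 8}]
l2 = [{"key": 0, "key2": 2}, {"key": 1, "key2": 3}, {"key": 2, "key2": 4}]

key = "key2"
values = {d[key] for d in l1} & {d[key] for d in l2}

# Keep the dictionaries from l1 whose value for `key` also occurs in l2.
matches = [d for d in l1 if d[key] in values]
print(matches)
```

Membership tests against a set take constant time on average, so the extra pass over l1 keeps the overall cost linear in the combined length of the lists.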

You can make the code a little more readable by doing the set comprehension in a function, although the function calls will make the code microscopically slower.

def key_set(seq, key):
    return {d[key] for d in seq}

values = key_set(l1, key) & key_set(l2, key)
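
The function above raises a KeyError if some dictionary lacks the key. If that can happen in your data, one possible variant (an assumption about what you want, since the question says all dictionaries share the same format) is to skip such dictionaries:

```python
def key_set(seq, key):
    # Skip dictionaries that do not contain `key` instead of raising KeyError.
    return {d[key] for d in seq if key in d}

# Small illustrative data; the second dictionary in l1 has no "key2".
l1 = [{"key": 1, "key2": 2}, {"key": 3}]
l2 = [{"key2": 2}, {"key2": 5}]

values = key_set(l1, "key2") & key_set(l2, "key2")
print(values)
```

Either way, the values stored under the key must be hashable, because they end up in a set.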

This technique can be generalised to handle any number of lists.

all_lists = (l1, l2)
key = "key2"
values = set.intersection(*({d[key] for d in seq} for seq in all_lists))
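
If you want to reuse this, it could be wrapped in a small function. The sketch below is one way to do it; the name `intersect_by_key` and the sample list l3 are made up for illustration. Note that calling set.intersection with no arguments at all raises a TypeError, so the empty case is handled separately here (returning an empty set is a choice, not the only sensible one).

```python
def intersect_by_key(key, *lists):
    """Return the set of values for `key` that occur in every list."""
    if not lists:
        # set.intersection() needs at least one set to work with.
        return set()
    return set.intersection(*({d[key] for d in seq} for seq in lists))

l1 = [{"key": 1, "key2": 2}, {"key": 3, "key2": 4}, {"key": 5, "key2": 6}, {"key": 7, "key2": 8}]
l2 = [{"key": 0, "key2": 2}, {"key": 1, "key2": 3}, {"key": 2, "key2": 4}]
l3 = [{"key": 9, "key2": 4}, {"key": 9, "key2": 9}]  # hypothetical third list

common = intersect_by_key("key2", l1, l2, l3)
print(common)
```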
