
Python: comparison of two dict lists

Here is what I want to achieve:

I have got two lists of dictionaries. All the dictionaries have the following structure:

dictionary = {'name': 'MyName', 'state': 'MyState'}

I would like to go through all the elements of both lists and compare the states of the entries with the same name. Here is the best way that I can imagine:

for d1 in list1:
    name = d1['name']
    for d2 in list2:
        if d2['name'] == name:
            if d1['state'] != d2['state']:
                pass  # Do something

While I think that this approach would work, I wonder whether there is a more efficient and/or elegant way to perform this operation. Thank you for your ideas!

Have a look at product from itertools, which yields every pair of elements from two iterables:

import itertools

xs = range(1,10)
ys = range(11,20)

zs = itertools.product(xs,ys)

list(zs)

[(1, 11), (1, 12), (1, 13), (1, 14), (1, 15), (1, 16), (1, 17), (1, 18), (1, 19), (2, 11), (2, 12), (2, 13), (2, 14), (2, 15), (2, 16), (2, 17), (2, 18), (2, 19), (3, 11), (3, 12), (3, 13), (3, 14), (3, 15), (3, 16), (3, 17), (3, 18), (3, 19), (4, 11), (4, 12), (4, 13), (4, 14), (4, 15), (4, 16), (4, 17), (4, 18), (4, 19), (5, 11), (5, 12), (5, 13), (5, 14), (5, 15), (5, 16), (5, 17), (5, 18), (5, 19), (6, 11), (6, 12), (6, 13), (6, 14), (6, 15), (6, 16), (6, 17), (6, 18), (6, 19), (7, 11), (7, 12), (7, 13), (7, 14), (7, 15), (7, 16), (7, 17), (7, 18), (7, 19), (8, 11), (8, 12), (8, 13), (8, 14), (8, 15), (8, 16), (8, 17), (8, 18), (8, 19), (9, 11), (9, 12), (9, 13), (9, 14), (9, 15), (9, 16), (9, 17), (9, 18), (9, 19)]
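Applied to lists of dicts like the ones in the question, product flattens the two nested loops into one. This is only a minimal sketch: the sample data and the `mismatches` list are made up for illustration, and the amount of work is still proportional to len(list1) * len(list2).

```python
import itertools

# Hypothetical sample data in the shape described in the question.
list1 = [{'name': 'fred', 'state': 'A'}, {'name': 'mary', 'state': 'D'}]
list2 = [{'name': 'fred', 'state': 'X'}, {'name': 'mary', 'state': 'D'}]

mismatches = []
# product yields every (d1, d2) pair, just like the two nested for loops.
for d1, d2 in itertools.product(list1, list2):
    if d1['name'] == d2['name'] and d1['state'] != d2['state']:
        mismatches.append((d1['name'], d1['state'], d2['state']))

print(mismatches)  # [('fred', 'A', 'X')]
```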

A couple of other things:

  1. When you are only representing two things, it is common to use a tuple (or even a named tuple), so have a think about why they are dicts to begin with. You might have a great reason :)

[('name','state'),('name','state'),('name','state')...]
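As a rough illustration of the named tuple idea, the dicts could be converted as follows. The type name `Entry` and the sample records are assumptions made for this sketch, not something from the question.

```python
from collections import namedtuple

# 'Entry' is a hypothetical name chosen for this example.
Entry = namedtuple('Entry', ['name', 'state'])

# Hypothetical sample data in the shape described in the question.
records = [{'name': 'fred', 'state': 'A'}, {'name': 'bob', 'state': 'B'}]
entries = [Entry(d['name'], d['state']) for d in records]

# Fields are readable by name, and an Entry still compares equal to a plain tuple.
print(entries[0].name, entries[0].state)
print(entries[0] == ('fred', 'A'))
```

Because named tuples are hashable (as long as their fields are), they can also be put into a set, which plain dicts cannot.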

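Another approach would be to compare the elements directly. For example, you could take the intersection of setA (the entries of list 1) and setB (the entries of list 2), which gives you the entries that are identical in both lists. This needs hashable entries such as tuples, because dicts cannot be stored in a set.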

>>> listA = [('fred','A'), ('bob','B'), ('mary', 'D'), ('eve', 'E')]
>>> listB = [('fred','X'), ('clive', 'C'), ('mary', 'D'), ('ben','B')]
# your listA and listB could be sets to begin with
>>> set.intersection(set(listA), set(listB))
{('mary', 'D')}

This approach, however, does not allow for duplicates.
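The intersection finds the identical entries, while the question asks for the ones whose state differs. One possible way to get those with sets is sketched below. It assumes that names are unique within listB, and the names `only_in_a`, `names_in_b` and `changed` are made up for this example.

```python
listA = [('fred', 'A'), ('bob', 'B'), ('mary', 'D'), ('eve', 'E')]
listB = [('fred', 'X'), ('clive', 'C'), ('mary', 'D'), ('ben', 'B')]

# Pairs in listA that have no identical (name, state) pair in listB.
only_in_a = set(listA) - set(listB)
# Every name that occurs in listB.
names_in_b = {name for name, _ in listB}

# Keep the pairs whose name also exists in listB: same name, different state.
changed = sorted(pair for pair in only_in_a if pair[0] in names_in_b)
print(changed)  # [('fred', 'A')]
```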

The most elegant way I can think of is a list comprehension.

[[do_something() for d1 in list1 if d1["name"] == d2["name"] and d1["state"] != d2["state"]] for d2 in list2]

But that's kind of the same code.

You can also make your sample code a bit more elegant by reducing it a bit:

for d in list1:
    for d2 in list2:
        if d2['name'] == d['name'] and d['state'] != d2['state']:
            pass  # Do something
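If the goal is to collect the mismatching pairs rather than to call a function for its side effect, a flat comprehension with two for clauses may read better than the nested one. A small sketch with made-up sample data; the name `mismatched` is an assumption for this example.

```python
# Hypothetical sample data in the shape described in the question.
list1 = [{'name': 'a', 'state': '0'}, {'name': 'c', 'state': '3'}]
list2 = [{'name': 'a', 'state': '0'}, {'name': 'c', 'state': '4'}]

# Each element is a (d1, d2) pair with equal names but different states.
mismatched = [
    (d1, d2)
    for d1 in list1
    for d2 in list2
    if d1['name'] == d2['name'] and d1['state'] != d2['state']
]

for d1, d2 in mismatched:
    print(d1['name'], d1['state'], d2['state'])
```

This still visits every pair of entries, so it is no faster than the nested loops.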

The other answers are functional (they deliver the correct answer), but won't perform well for large lists because they use nested iteration -- for lists of length N, the number of steps they use grows like N^2. This isn't a concern if the lists are small; but if the lists are big, the number of iterations would explode.

An alternate approach that keeps time complexity linear with N goes like this (being pretty verbose):

##
## sample data
data = list()
data.append( [
    dict(name='a', state='0'),
    dict(name='b', state='1'),
    dict(name='c', state='3'),
    dict(name='d', state='5'),
    dict(name='e', state='7'),
    dict(name='f', state='10'),
    dict(name='g', state='11'),
    dict(name='h', state='13'),
    dict(name='i', state='14'),
    dict(name='l', state='19'),
    ])
data.append( [
    dict(name='a', state='0'),
    dict(name='b', state='1'),
    dict(name='c', state='4'),
    dict(name='d', state='6'),
    dict(name='e', state='8'),
    dict(name='f', state='10'),
    dict(name='g', state='12'),
    dict(name='j', state='16'),
    dict(name='k', state='17'),
    dict(name='m', state='20'),
    ])

##
## coalesce lists to a single flat dict for searching
dCombined = {}
for d in data:
    # update() merges each list in; plain assignment would keep only the last list
    dCombined.update({ i['name'] : i['state'] for i in d })

##
## to record mismatches
names = []

##
## iterate over lists -- individually / not nested
for d in data:
    for i in d:
        if i['name'] in dCombined and i['state'] != dCombined[i['name']]:
            names.append(i['name'])

##
## see result
print(names)
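For the two-list case in the question, the same idea can be condensed: index one list by name, then walk the other list once. This is a sketch that assumes names are unique within list2; `state_by_name`, `mismatches` and the sample data are made up for illustration.

```python
# Hypothetical sample data in the shape described in the question.
list1 = [{'name': 'a', 'state': '0'}, {'name': 'c', 'state': '3'}, {'name': 'h', 'state': '13'}]
list2 = [{'name': 'a', 'state': '0'}, {'name': 'c', 'state': '4'}, {'name': 'j', 'state': '16'}]

# One pass over list2 builds the lookup table.
state_by_name = {d['name']: d['state'] for d in list2}

# One pass over list1; each dict lookup takes constant time on average.
mismatches = []
for d in list1:
    name = d['name']
    if name in state_by_name and state_by_name[name] != d['state']:
        mismatches.append((name, d['state'], state_by_name[name]))

print(mismatches)  # [('c', '3', '4')]
```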

Caveats:

The OP didn't say if there could be repeated names within a list; that would change this approach a bit.

Depending on the details of "do something", you might record something other than just the names: you could store references to or copies of the individual dict objects, or whatever "do something" requires.

The trade-off for this approach is that it requires more memory than the previous answers: the combined lookup dict holds one entry per distinct name, so it is O(N), while the list of mismatches grows only with the number of actual mismatches.

Notes:

This approach also works when you have more than 2 lists to compare. For example, with 5 lists this alternative is still O(N) in time and memory, while nesting one loop per list, as in the previous answers, would be O(N^5) in time!
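One possible way to cover both the repeated-names caveat and the many-lists case is to collect every state seen for a name in a set and report the names with more than one distinct state. This is only a sketch: the function name `find_conflicts` and the sample data are hypothetical, and a name whose state differs within a single list is flagged as well.

```python
from collections import defaultdict

def find_conflicts(*lists):
    """Map each name that was seen with more than one distinct state to those states."""
    states_by_name = defaultdict(set)
    # A single pass over every entry of every list: linear in the total number of entries.
    for entries in lists:
        for entry in entries:
            states_by_name[entry['name']].add(entry['state'])
    return {name: states for name, states in states_by_name.items() if len(states) > 1}

# Hypothetical sample data; 'a' is repeated within list1 on purpose.
list1 = [{'name': 'a', 'state': '0'}, {'name': 'a', 'state': '0'}, {'name': 'c', 'state': '3'}]
list2 = [{'name': 'a', 'state': '0'}, {'name': 'c', 'state': '4'}, {'name': 'j', 'state': '16'}]

conflicts = find_conflicts(list1, list2)
print(conflicts)
```

Since the function takes any number of lists, comparing 5 lists is just find_conflicts(l1, l2, l3, l4, l5).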
