
How to compare two lists of dicts in Python?

How do I compare two lists of dicts? The result should contain only the entries from list B (ldB) that differ from list A (ldA).

Example:

ldA = [{'user':"nameA", 'a':7.6, 'b':100.0, 'c':45.5, 'd':48.9},
       {'user':"nameB", 'a':46.7, 'b':67.3, 'c':0.0, 'd':5.5}]


ldB =[{'user':"nameA", 'a':7.6, 'b':99.9, 'c':45.5, 'd':43.7},
      {'user':"nameB", 'a':67.7, 'b':67.3, 'c':1.1, 'd':5.5},
      {'user':"nameC", 'a':89.9, 'b':77.3, 'c':2.2, 'd':6.5}]

I want to compare ldA with ldB and print the following output:

ldB -> {user:"nameA",  b:99.9, d:43.7}
ldB -> {user:"nameB",  a:67.7, c:1.1 }
ldB -> {user:"nameC", a:89.9, b:77.3, c:2.2, d:6.5}

I have gone through the question linked below, but the answers there return only the name; I want both the name and the changed values, as shown above.

List of Dicts comparision to match between lists and detect value changes in Python

For a general solution, consider the following. It will properly diff, even if the users are out of order in the lists.

def dict_diff ( merge, lhs, rhs ):
    """Generic dictionary difference."""
    diff = {}
    for key in lhs.keys():
          # auto-merge for missing key on right-hand-side.
        if key not in rhs:
            diff[key] = lhs[key]
          # on collision, invoke custom merge function.
        elif lhs[key] != rhs[key]:
            diff[key] = merge(lhs[key], rhs[key])
    for key in rhs.keys():
          # auto-merge for missing key on left-hand-side.
        if key not in lhs:
            diff[key] = rhs[key]
    return diff

def user_diff ( lhs, rhs ):
    """Merge dictionaries using value from right-hand-side on conflict."""
    merge = lambda l,r: r
    return dict_diff(merge, lhs, rhs)

import copy

def push ( x, k, v ):
    """Returns copy of dict `x` with key `k` set to `v`."""
    x = copy.copy(x)
    x[k] = v
    return x

def pop ( x, k ):
    """Returns copy of dict `x` without key `k`."""
    x = copy.copy(x)
    del x[k]
    return x

def special_diff ( lhs, rhs, k ):
      # transform list of dicts into 2 levels of dicts, 1st level index by k.
    lhs = dict([(D[k],pop(D,k)) for D in lhs])
    rhs = dict([(D[k],pop(D,k)) for D in rhs])
      # diff at the 1st level.
    c = dict_diff(user_diff, lhs, rhs)
      # transform back to initial format.
    return [push(D,k,K) for (K,D) in c.items()]

Then, you can check the solution:

ldA = [{'user':"nameA", 'a':7.6, 'b':100.0, 'c':45.5, 'd':48.9},
       {'user':"nameB", 'a':46.7, 'b':67.3, 'c':0.0, 'd':5.5}]
ldB =[{'user':"nameA", 'a':7.6, 'b':99.9, 'c':45.5, 'd':43.7},
      {'user':"nameB", 'a':67.7, 'b':67.3, 'c':1.1, 'd':5.5},
      {'user':"nameC", 'a':89.9, 'b':77.3, 'c':2.2, 'd':6.5}]
import pprint
if __name__ == '__main__':
    pprint.pprint(special_diff(ldA, ldB, 'user'))

My approach: build a lookup based on ldA of values to exclude, then determine the result of excluding the appropriate values from each dict in ldB.

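lookup = dict((x['user'], dict(x)) for x in ldA)
# 'dict(x)' is used here to make a copy
for v in lookup.values():
    del v['user']

result = [
    dict(
        (k, v)
        for (k, v) in item.items()
        # keep a pair when the user is new, when the key is absent from the
        # lookup (always the case for 'user'), or when the value has changed
        if item['user'] not in lookup
        or k not in lookup[item['user']]
        or lookup[item['user']][k] != v
    )
    for item in ldB
]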

You should, however, be aware that comparing floating-point values for exact equality like that can't be relied upon: two values that look the same can differ by a tiny rounding error.
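
One possible way to make the comparison tolerant of rounding noise is math.isclose (Python 3.5 and later). The sketch below is only an illustration and not part of the answer above: the helper names values_differ and diff_with_tolerance, and the tolerance of 1e-9, are assumptions you would adapt to your data.

```python
import math
from numbers import Real

def values_differ(old, new, abs_tol=1e-9):
    """Return True when two values should be reported as different."""
    # Only numbers get the tolerant comparison; strings etc. use plain !=.
    if isinstance(old, Real) and isinstance(new, Real):
        return not math.isclose(old, new, abs_tol=abs_tol)
    return old != new

def diff_with_tolerance(list_a, list_b, key='user', abs_tol=1e-9):
    """For each dict in list_b, keep `key` plus the pairs that changed."""
    lookup = {d[key]: d for d in list_a}
    result = []
    for item in list_b:
        base = lookup.get(item[key], {})  # empty dict for a new user
        changed = {
            k: v for k, v in item.items()
            if k == key or k not in base or values_differ(base[k], v, abs_tol)
        }
        result.append(changed)
    return result
```

With this, a stored value of 0.1 + 0.2 and a new value of 0.3 are treated as unchanged, while 100.0 versus 99.9 is still reported.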

I am going to assume that the corresponding dicts are in the same order in both lists.

Under that assumption, you can use the following code:

def diffs(L1, L2):
    answer = []
    for i, d1 in enumerate(L1):
        d = {}
        d2 = L2[i]  # relies on both lists being in the same order
        for key in d1:
            if key not in d2:
                print("%s is in d1 but not in d2" % key)
            elif d1[key] != d2[key]:
                # record the value from the second list when it differs
                d[key] = d2[key]
        answer.append(d)
    return answer

Untested. Please comment if there are errors and I will fix them

One more solution. It is a bit unusual (sorry if I missed something), but it lets you plug in your own comparison check (you simply need to modify the isEqual lambda) and gives you two options for handling keys that are present in only one of the dicts:

ldA = [{'user':"nameA", 'a':7.6, 'b':100.0, 'c':45.5, 'd':48.9},
       {'user':"nameB", 'a':46.7, 'b':67.3, 'c':0.0, 'd':5.5}]


ldB =[{'user':"nameA", 'a':7.6, 'b':99.9, 'c':45.5, 'd':43.7},
      {'user':"nameB", 'a':67.7, 'b':67.3, 'c':1.1, 'd':5.5},
      {'user':"nameC", 'a':89.9, 'b':77.3, 'c':2.2, 'd':6.5}]

ldA.extend(ldB.pop() for _ in range(len(ldB)))  # move everything into one list (this empties ldB)

output = []

# Despite its name, this returns True when the values differ; only those pairs are kept.
# Add your custom check here, for example rounding values before comparison.
isEqual = lambda x, y: x != y

while len(ldA) > 0:  # iterate through list
    row = ldA.pop(0)  # get the first element in list and remove it from list
    for i, srow in enumerate(ldA):
        if row['user'] != srow['user']:
            continue
        res = {'user': srow['user']}
        # next line will ignore all keys of srow which are not in row
        res.update(dict((key, val) for key, val in ldA.pop(i).items() if key in row and isEqual(val, row[key])))
        # next line will include the srow.key and srow.value into the results even in a case when there is no such pair in a row
        #res.update(dict(filter(lambda d: isEqual(d[1], row[d[0]]) if d[0] in row else True, ldA.pop(i).items())))
        output.append(res)
        break
    else:
        output.append(row)  # no other dict has this user, so keep the whole row

print(output)
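
As an illustration of the custom check mentioned in the comment on isEqual, here is a sketch of a rounding-based predicate. The name differs_after_rounding and the choice of one decimal place are assumptions. Like isEqual in the loop above, it returns True when the two values should be reported as different.

```python
def differs_after_rounding(x, y, ndigits=1):
    """Return True when x and y still differ after rounding to `ndigits`."""
    if isinstance(x, float) and isinstance(y, float):
        return round(x, ndigits) != round(y, ndigits)
    # Non-float values (such as the user name) are compared exactly.
    return x != y

# Drop-in replacement for the lambda used in the loop above.
isEqual = differs_after_rounding
```

Rounding is a coarse filter: two values that sit on either side of a rounding boundary, such as 0.049 and 0.051 with one decimal place, are still reported as different.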

I wrote this tool a while back; it can currently cope with nested lists, dicts and sets. It gives you terser output (the leading . in . > i:1 > 'c' refers to the top level, and i:1 refers to index 1 of the list being compared):

compare(ldA, ldB)
. > i:0 > 'b' dict value is different:
100.0
99.9

. > i:0 > 'd' dict value is different:
48.9
43.7

. > i:1 > 'a' dict value is different:
46.7
67.7

. > i:1 > 'c' dict value is different:
0.0
1.1

. lists differed at positions: 2
['<not present>']
[{'c': 2.2, 'd': 6.5, 'a': 89.9, 'user': 'nameC', 'b': 77.3}]

This definitely relies on some assumptions drawn from your sample data, mainly that there will not be users in ldA that are not in ldB and that the dicts appear in the same order in both lists. If either is an invalid assumption, let me know.

You would call this like dict_diff(ldA, ldB, 'user').

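def dict_diff(ldA, ldB, key):
    for i, dA in enumerate(ldA):
        d = {key: dA[key]}
        # keep only the pairs from ldB whose value differs from (or is missing in) ldA
        d.update(dict((k, v) for k, v in ldB[i].items() if k not in dA or v != dA[k]))
        print("ldB -> " + str(d))
    # any remaining dicts in ldB have no counterpart in ldA
    for dB in ldB[len(ldA):]:
        print("ldB -> " + str(dB))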

 