Sum values grouped by key in list of dict

Question

I have a list of dicts, and I am trying to find the total number of jobs for each remote identifier.

For the data below, I expect id 64 -> 11 jobs and id 68 -> 0 jobs:

[{
    'jobs': {
        'count': 4
    },
    'remote_identifier': {
        'id': '64'
    }
}, {
    'jobs': {
        'count': 0
    },
    'remote_identifier': {
        'id': '68'
    }
}, {
    'jobs': {
        'count': 7
    },
    'remote_identifier': {
        'id': '64'
    }
}]

I already tried something like this, but I don't know how to adapt it to my needs, since that only counts the number of occurrences.

from collections import Counter
print(Counter(item['remote_identifier']['id'] for item in items))

Answer 1

Pretty straightforward with a defaultdict (data is your original list).

>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> 
>>> for d_inner in data:
...     id_ = d_inner['remote_identifier']['id']
...     d[int(id_)] += d_inner['jobs']['count']
... 
>>> d
defaultdict(<type 'int'>, {64: 11, 68: 0})
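If you would rather get a plain dict back than a defaultdict, the same loop can be wrapped in a small helper. This is only a sketch: the name sum_jobs_by_id is made up for illustration, and it assumes every entry has the shape shown in the question and that every id string converts cleanly with int().

```python
from collections import defaultdict

def sum_jobs_by_id(data):
    """Return {int id: total job count} for entries shaped like the question's."""
    totals = defaultdict(int)  # missing keys start at 0
    for entry in data:
        # Assumption: the id is always a numeric string such as '64'.
        totals[int(entry['remote_identifier']['id'])] += entry['jobs']['count']
    return dict(totals)  # convert to a plain dict for a cleaner repr

data = [
    {'jobs': {'count': 4}, 'remote_identifier': {'id': '64'}},
    {'jobs': {'count': 0}, 'remote_identifier': {'id': '68'}},
    {'jobs': {'count': 7}, 'remote_identifier': {'id': '64'}},
]

print(sum_jobs_by_id(data))  # {64: 11, 68: 0}
```

Drop the int() call if you want to keep the ids as strings.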

Answer 2

You can use a defaultdict to add up the counts:

from collections import defaultdict

jobs = [{
    'jobs': {
        'count': 4
    },
    'remote_identifier': {
        'id': '64'
    }
}, {
    'jobs': {
        'count': 0
    },
    'remote_identifier': {
        'id': '68'
    }
}, {
    'jobs': {
        'count': 7
    },
    'remote_identifier': {
        'id': '64'
    }
}]

counts = defaultdict(int)

for job in jobs:
    counts[job['remote_identifier']['id']] += job['jobs']['count']

print(counts)

Output:

defaultdict(<class 'int'>, {'64': 11, '68': 0})
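Since the question started from Counter, it may be worth noting that a Counter can accumulate sums the same way, because a missing key reads as 0. A minimal sketch on the question's data (the variable names are just for illustration):

```python
from collections import Counter

jobs = [
    {'jobs': {'count': 4}, 'remote_identifier': {'id': '64'}},
    {'jobs': {'count': 0}, 'remote_identifier': {'id': '68'}},
    {'jobs': {'count': 7}, 'remote_identifier': {'id': '64'}},
]

counts = Counter()
for job in jobs:
    # Add the job count as a weight instead of counting one per occurrence.
    counts[job['remote_identifier']['id']] += job['jobs']['count']

print(dict(counts))  # {'64': 11, '68': 0}
```

The key '68' is kept with a total of 0 because it is set by item assignment; Counter arithmetic such as counts + Counter() would drop it.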

Answer 3

A simple way is to use the itertools module, which provides the groupby function.

import itertools as it

def get_id(entry):
    return entry['remote_identifier']['id']

data.sort(key=get_id)
for key, group in it.groupby(data, get_id):
    print(key, sum(entry['jobs']['count'] for entry in group))

Note that groupby only groups consecutive elements that share a key, so the data must already be sorted by the key you are grouping on. That is why data.sort(key=get_id) comes first.
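To see why the sort matters, here is a small sketch that runs groupby on the question's data both unsorted and sorted. It uses sorted() rather than data.sort() so the original list is left untouched; that substitution is my choice, not part of the answer above.

```python
import itertools as it

def get_id(entry):
    return entry['remote_identifier']['id']

data = [
    {'jobs': {'count': 4}, 'remote_identifier': {'id': '64'}},
    {'jobs': {'count': 0}, 'remote_identifier': {'id': '68'}},
    {'jobs': {'count': 7}, 'remote_identifier': {'id': '64'}},
]

# Unsorted: the two '64' entries are not adjacent, so they land in separate groups.
unsorted_totals = [
    (key, sum(entry['jobs']['count'] for entry in group))
    for key, group in it.groupby(data, get_id)
]
print(unsorted_totals)  # [('64', 4), ('68', 0), ('64', 7)]

# Sorted: equal keys are adjacent, so each id yields exactly one group.
sorted_totals = [
    (key, sum(entry['jobs']['count'] for entry in group))
    for key, group in it.groupby(sorted(data, key=get_id), get_id)
]
print(sorted_totals)  # [('64', 11), ('68', 0)]
```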

Answer 4

This should do the trick:

result = {}
for i in items:
    ri = i['remote_identifier']['id']
    j = i['jobs']['count']
    if ri in result:
        result[ri] += j
    else:
        result[ri] = j
result
#{'68': 0, '64': 11}
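The if/else can also be folded into a single line with dict.get, which returns a default when the key is missing. A sketch of that variant, assuming items is the list from the question:

```python
items = [
    {'jobs': {'count': 4}, 'remote_identifier': {'id': '64'}},
    {'jobs': {'count': 0}, 'remote_identifier': {'id': '68'}},
    {'jobs': {'count': 7}, 'remote_identifier': {'id': '64'}},
]

result = {}
for i in items:
    ri = i['remote_identifier']['id']
    # get() returns 0 the first time an id is seen, so no membership test is needed.
    result[ri] = result.get(ri, 0) + i['jobs']['count']

print(result)
```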

Answer 5

Another solution is as follows:

input = [{
    'jobs': {
        'count': 4
    },
    'remote_identifier': {
        'id': '64'
    }
}, {
    'jobs': {
        'count': 0
    },
    'remote_identifier': {
        'id': '68'
    }
}, {
    'jobs': {
        'count': 7
    },
    'remote_identifier': {
        'id': '64'
    }
}]

res = dict()
for item in input:

    if item['remote_identifier']['id'] in res:
        total = res[item['remote_identifier']['id']] + item['jobs']['count']
    else:
        total = item['jobs']['count']
    res.update({item['remote_identifier']['id']: total})

print(res)

Output:

{'68': 0, '64': 11}

5 answers

solution1
3 ACCEPTED 2017-12-07 10:06:59

solution2
1 2017-12-07 10:03:24

solution3
1 2017-12-07 10:15:10

solution4
0 2017-12-07 10:03:46

solution5
0 2017-12-07 10:14:17
