I have a list of dicts, and I am trying to find the total number of jobs for each remote identifier.
In this case I expect id 64 -> 11 jobs and id 68 -> 0 jobs.
[{
    'jobs': {
        'count': 4
    },
    'remote_identifier': {
        'id': '64'
    }
}, {
    'jobs': {
        'count': 0
    },
    'remote_identifier': {
        'id': '68'
    }
}, {
    'jobs': {
        'count': 7
    },
    'remote_identifier': {
        'id': '64'
    }
}]
I already tried something like this, but I don't know how to adapt it to my needs, since that only counts the number of occurrences.
from collections import Counter
print(Counter(item['remote_identifier']['id'] for item in items))
Pretty straightforward with a defaultdict. (data is your original list.)
>>> from collections import defaultdict
>>> d = defaultdict(int)
>>>
>>> for d_inner in data:
...     id_ = d_inner['remote_identifier']['id']
...     d[int(id_)] += d_inner['jobs']['count']
...
>>> d
defaultdict(<type 'int'>, {64: 11, 68: 0})
You can use a defaultdict to add up the counts:
from collections import defaultdict

jobs = [{
    'jobs': {
        'count': 4
    },
    'remote_identifier': {
        'id': '64'
    }
}, {
    'jobs': {
        'count': 0
    },
    'remote_identifier': {
        'id': '68'
    }
}, {
    'jobs': {
        'count': 7
    },
    'remote_identifier': {
        'id': '64'
    }
}]

counts = defaultdict(int)
for job in jobs:
    counts[job['remote_identifier']['id']] += job['jobs']['count']
print(counts)
Output:
defaultdict(<class 'int'>, {'64': 11, '68': 0})
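Since the question started from Counter, here is a minimal sketch showing that a Counter can accumulate the job counts in the same way as the defaultdict. The names items and totals are only illustrative, and the data is the sample from the question.

```python
from collections import Counter

# Sample data from the question.
items = [
    {'jobs': {'count': 4}, 'remote_identifier': {'id': '64'}},
    {'jobs': {'count': 0}, 'remote_identifier': {'id': '68'}},
    {'jobs': {'count': 7}, 'remote_identifier': {'id': '64'}},
]

totals = Counter()
for item in items:
    # A missing key reads as 0, so += adds the job count
    # instead of counting how often the id occurs.
    totals[item['remote_identifier']['id']] += item['jobs']['count']

print(dict(totals))
```

The id '68' stays in the result with a total of 0, because assigning through += keeps zero entries.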
The simplest way is to use the itertools module, which provides the function groupby.
import itertools as it

def get_id(entry):
    return entry['remote_identifier']['id']

data.sort(key=get_id)
for key, group in it.groupby(data, get_id):
    print(key, sum(entry['jobs']['count'] for entry in group))
Note that groupby only groups consecutive elements that share a key, so the data must already be sorted by the key you are grouping on. That is why data.sort is called first.
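If reordering the original list is not acceptable, a variant along these lines may help: it sorts a copy with sorted() and collects the totals in a dict. This is only a sketch; the name totals is an assumption, and data is the sample list from the question.

```python
from itertools import groupby

def get_id(entry):
    return entry['remote_identifier']['id']

# Sample data from the question.
data = [
    {'jobs': {'count': 4}, 'remote_identifier': {'id': '64'}},
    {'jobs': {'count': 0}, 'remote_identifier': {'id': '68'}},
    {'jobs': {'count': 7}, 'remote_identifier': {'id': '64'}},
]

# sorted() returns a new list, so data keeps its original order.
totals = {
    key: sum(entry['jobs']['count'] for entry in group)
    for key, group in groupby(sorted(data, key=get_id), key=get_id)
}
print(totals)
```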
This should do the trick:
result = {}
for i in items:
    ri = i['remote_identifier']['id']
    j = i['jobs']['count']
    if ri in result:
        result[ri] += j
    else:
        result[ri] = j
result
#{'68': 0, '64': 11}
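The if/else branch can also be folded into a single line with dict.get, which returns a default when the key is missing. A minimal sketch, assuming items holds the sample list from the question:

```python
# Sample data from the question.
items = [
    {'jobs': {'count': 4}, 'remote_identifier': {'id': '64'}},
    {'jobs': {'count': 0}, 'remote_identifier': {'id': '68'}},
    {'jobs': {'count': 7}, 'remote_identifier': {'id': '64'}},
]

result = {}
for item in items:
    key = item['remote_identifier']['id']
    # get() falls back to 0 the first time an id is seen.
    result[key] = result.get(key, 0) + item['jobs']['count']

print(result)
```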
Another solution is as follows:
data = [{
    'jobs': {
        'count': 4
    },
    'remote_identifier': {
        'id': '64'
    }
}, {
    'jobs': {
        'count': 0
    },
    'remote_identifier': {
        'id': '68'
    }
}, {
    'jobs': {
        'count': 7
    },
    'remote_identifier': {
        'id': '64'
    }
}]

res = dict()
for item in data:
    if item['remote_identifier']['id'] in res:
        total = res[item['remote_identifier']['id']] + item['jobs']['count']
    else:
        total = item['jobs']['count']
    res.update({item['remote_identifier']['id']: total})
print(res)
Output:
{'68': 0, '64': 11}