I have a list of dictionaries as follows
rows = [
    {'sex': 2, 'newspaper_sheet__country': 'ML', 'n': 7},
    {'sex': 1, 'newspaper_sheet__country': 'ML', 'n': 5},
    {'sex': 2, 'newspaper_sheet__country': 'ML', 'n': 10}
]
I then have 2 Counters
from collections import Counter
counts = Counter()
counts1 = Counter()
I'm updating the two counters in the following two ways:
for row in rows:
    counts.update({(row['sex'], row['newspaper_sheet__country']): row['n']})
and
counts1.update({(row['sex'], row['newspaper_sheet__country']): row['n'] for row in rows})
I would expect the two counters to end up with the same values, since the only difference is that one uses a for loop and the other a dict comprehension.
Why are the two results different?
By calling Counter.update in each iteration of a for loop, the Counter object gets updated with the input dict on every call.
With a dict comprehension, the key-value pairs are aggregated into a dict first, before being passed to Counter.update. Since later values of duplicate keys in a dict comprehension override the preceding values of the same keys, the value 10 of the key (2, 'ML') overrides the value 7 of the same key, so the Counter object never accounts for the value 7.
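To make the difference concrete, here is a minimal sketch that runs both versions on the rows from the question and prints the resulting counters. It only uses the names already shown in the question.

```python
from collections import Counter

rows = [
    {'sex': 2, 'newspaper_sheet__country': 'ML', 'n': 7},
    {'sex': 1, 'newspaper_sheet__country': 'ML', 'n': 5},
    {'sex': 2, 'newspaper_sheet__country': 'ML', 'n': 10},
]

# Loop: one update() call per row, so values for a repeated key are summed.
counts = Counter()
for row in rows:
    counts.update({(row['sex'], row['newspaper_sheet__country']): row['n']})

# Comprehension: the dict is built first, so the later 10 replaces the 7
# before update() ever sees it.
counts1 = Counter()
counts1.update({(row['sex'], row['newspaper_sheet__country']): row['n'] for row in rows})

print(counts)   # Counter({(2, 'ML'): 17, (1, 'ML'): 5})
print(counts1)  # Counter({(2, 'ML'): 10, (1, 'ML'): 5})
```

The key (2, 'ML') ends up as 7 + 10 = 17 in the loop version, but only 10 in the comprehension version.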
Because calling .update in a loop like that is not equivalent to passing the result of that dictionary comprehension. Look at what that dictionary comprehension creates:
>>> rows = [
... {'sex': 2, 'newspaper_sheet__country': 'ML', 'n': 7},
... {'sex': 1, 'newspaper_sheet__country': 'ML', 'n': 5},
... {'sex': 2, 'newspaper_sheet__country': 'ML', 'n': 10}
... ]
>>> {(row['sex'], row['newspaper_sheet__country']): row['n'] for row in rows}
{(2, 'ML'): 10, (1, 'ML'): 5}
Dictionaries have unique keys, and the last item seen is kept.
The difference comes from how update() is called when a dict comprehension is used.
With the for loop, the counter is updated on every iteration and accumulates the counts for matching keys. With the dict comprehension, update() only ever receives a single dictionary with unique keys.
The dict comprehension approach can be broken down as:
dic = {(row['sex'], row['newspaper_sheet__country']): row['n'] for row in rows}
print(dic) # dic only contains unique key value pairs here
counts1.update(dic)
So, counts1 is updated just once, while counts is updated multiple times because of the loop.
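If the summed result is what you want, one option is to add to the key directly inside the loop, which avoids building a one-item dict per row. This is just a sketch of an alternative, not something from the question; the name totals is made up for illustration.

```python
from collections import Counter

rows = [
    {'sex': 2, 'newspaper_sheet__country': 'ML', 'n': 7},
    {'sex': 1, 'newspaper_sheet__country': 'ML', 'n': 5},
    {'sex': 2, 'newspaper_sheet__country': 'ML', 'n': 10},
]

# 'totals' is a hypothetical name used only for this example.
totals = Counter()
for row in rows:
    # Missing keys default to 0 in a Counter, so += works on first sight of a key.
    totals[(row['sex'], row['newspaper_sheet__country'])] += row['n']

print(totals)  # Counter({(2, 'ML'): 17, (1, 'ML'): 5})
```

This gives the same result as calling update() once per row.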