
Sum Python dictionary values by key

I am trying to summarize data from a health app by date. Each date has multiple entries, so I've created a single dictionary that has each unique date as a key (column index 1), and I want to add up the total amount of fat (column index 7) for each date as the value.

I am new to Python and trying to do this in pure Python rather than with NumPy etc. Any help is much appreciated.

['18600018', '05-31-2020', 'Dinner', 'salmon', '1 serving', '210.0000000005', '-0.0694999987329796', '14.000000004', '2.999999996', '', '', '', '54.9999999975', '469.9999999995', '', '', '', '', '', '', '20.9999999975', '', '', '', '4.799999997', '', '', '', '', '', '', '', '', '', '', '', '0.3599999985', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
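(For context, data is assumed here to be a list of such rows, for example read from the app's CSV export with the standard csv module; the filename below is just a placeholder.)

import csv

# Assumed setup: the export is a CSV whose rows look like the one above
# (date in column index 1, fat in column index 7). 'export.csv' is a placeholder.
with open('export.csv', newline='') as f:
    data = list(csv.reader(f))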

So far I have this for loop to build the dictionary, but I am getting the following error:

fat_dict = {}
for row in data:
    date = row[1]
    fat = row[7]
    if date in fat_dict:
        fat_dict[date] = fat
    else:
        fat_dict[date] += fat


KeyError                                  Traceback (most recent call last)
<ipython-input-3-dfbce568de95> in <module>
     80         fat_dict[date] = fat
     81     else:
---> 82         fat_dict[date] += fat
     83 
     84 

KeyError: '05-31-2020'

The ideal outcome would be each unique date (key) with sum of fat for that date (value).

Here it is, and it's working:

data=[['18600018', '05-31-2020', 'Dinner', 'salmon', '1 serving', '210.0000000005', '-0.0694999987329796', '14.000000004', '2.999999996', '', '', '', '54.9999999975', '469.9999999995', '', '', '', '', '', '', '20.9999999975', '', '', '', '4.799999997', '', '', '', '', '', '', '', '', '', '', '', '0.3599999985', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''],['18600018', '05-31-2020', 'Dinner', 'salmon', '1 serving', '210.0000000005', '-0.0694999987329796', '15.000000004', '2.999999996', '', '', '', '54.9999999975', '469.9999999995', '', '', '', '', '', '', '20.9999999975', '', '', '', '4.799999997', '', '', '', '', '', '', '', '', '', '', '', '0.3599999985', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']]
fat_dict = {}
for row in data:
    date = row[1]
    fat = row[7]
    if date in fat_dict:
        fat_dict[date] += float(fat)
    else:
        fat_dict[date] = float(fat)
print(fat_dict)

OUTPUT:

{'05-31-2020': 29.000000008}
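A roughly equivalent variant (not shown in the answers here, just a common alternative) uses dict.get with a default of 0.0, which avoids the in / not in branching altogether:

fat_dict = {}
for row in data:
    date = row[1]
    fat = float(row[7])
    # get() falls back to 0.0 the first time a date is seen
    fat_dict[date] = fat_dict.get(date, 0.0) + fat

print(fat_dict)  # {'05-31-2020': 29.000000008} with the sample data above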

The other answers have pointed out your mistake of in vs. not in. Instead of a plain dict, you could consider defaultdict or Counter from the collections module in the Python standard library to simplify your code.


Example using defaultdict:

from collections import defaultdict

fat_dict = defaultdict(float)  # missing keys default to float(), i.e. 0.0
for row in data:
    date = row[1]
    fat = float(row[7])

    fat_dict[date] += fat
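If you want a plain dict at the end (a defaultdict prints with its wrapper type), it can simply be converted:

print(dict(fat_dict))  # e.g. {'05-31-2020': 29.000000008} for the sample data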

Example using Counter:

from collections import Counter

fat_counter = Counter()
for row in data:
    date = row[1]
    fat = float(row[7])

    fat_counter[date] += fat
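A small bonus of Counter over a plain dict is that it can rank the totals directly: most_common() returns (date, total) pairs sorted from highest to lowest summed fat.

# Dates ordered by total fat, highest first (only one date in the sample data)
for date, total_fat in fat_counter.most_common():
    print(date, total_fat)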
