Averages across a dictionary with tuple keys

I am trying to find the average of the values in a dictionary, grouped by city. For the purposes of this exercise I cannot use numpy or pandas.

Here is some example data:

d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3
}

Here is the ideal output:

city_averages = {
    'Chicago': 48.4,
    'Dallas': 70.8,
    'Paris': 34.45
}

Here is the code I tried.

city_averages = {}

total = 0
for k, v in d.items():
    total += float(v)
    city_averages[k[0]] = total
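
For reference, here is a small sketch of what this attempt actually computes. Since `total` is never reset per city and nothing is divided by a count, each city ends up holding the running sum of everything seen up to its last entry. The name `city_totals` is used here only for illustration.

```python
d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}

city_totals = {}
total = 0
for k, v in d.items():
    total += float(v)          # never reset, so it keeps growing across cities
    city_totals[k[0]] = total  # overwritten with the running total so far

# Each value is a cumulative sum over all cities seen so far, not an average.
print(city_totals)
```

So two things are missing: a separate sum for each city, and a per-city count to divide by.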
     
    

There is a very similar question on here.

In your case, the code would be as follows:

from collections import defaultdict
import statistics

d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3
}

grouper = defaultdict(list)

for k, v in d.items():
    grouper[k[0]].append(v)

city_averages = {k: statistics.mean(v) for k,v in grouper.items()}
print(city_averages)
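
If this is needed in more than one place, the same grouping idea can be wrapped in a small helper. This is only a sketch: the function name `average_by_city` is made up here, and it assumes every key is a two-element tuple whose first element is the city.

```python
from collections import defaultdict
import statistics


def average_by_city(data):
    """Return {city: mean of values}, assuming keys look like (city, year)."""
    grouped = defaultdict(list)
    for (city, _year), value in data.items():
        grouped[city].append(value)  # collect all values for each city
    return {city: statistics.mean(values) for city, values in grouped.items()}


d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}

averages = average_by_city(d)
# Round only for display; the stored averages keep full float precision.
print({city: round(avg, 2) for city, avg in averages.items()})
```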

You can do something simpler, like this:

d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
    ('Paris', 2012): 100.4
}

dnew = {}
for k, v in d.items():
    # k[0] is the city; accumulate the sum of values per city
    if k[0] in dnew:
        dnew[k[0]] += v
    else:
        dnew[k[0]] = v

print(dnew)

You will get output along these lines (shown rounded here):

{'Chicago': 96.8, 'Dallas': 70.8, 'Paris': 169.3}

You will need to format the values before you print them, because raw float sums can show rounding artifacts.
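
As one possible way to do that formatting, the sums can be rounded with the built-in `round` or rendered with an f-string format spec. This is a sketch that assumes two decimal places are wanted; the names `totals` and `rounded` are chosen here for illustration, and the sample data uses the five entries from the question.

```python
d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}

# Per-city sums, built the same way as above.
totals = {}
for (city, _year), value in d.items():
    totals[city] = totals.get(city, 0) + value

# Option 1: keep numbers, rounded to two decimal places.
rounded = {city: round(total, 2) for city, total in totals.items()}
print(rounded)

# Option 2: format as text only at print time.
for city, total in totals.items():
    print(f"{city}: {total:.2f}")
```

Rounding only at display time keeps the stored sums at full precision for any later calculation.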

I will leave you to figure out the logic for finding the average. This should help you get closer to the full answer.

Answer with average calculation:

Here's the code that includes the average calculation. It does not use any complicated logic.

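dnew = {}  # running sum per city
dcnt = {}  # number of entries per city

# d is the dictionary defined above
for k, v in d.items():
    dnew[k[0]] = dnew.get(k[0], 0) + v
    dcnt[k[0]] = dcnt.get(k[0], 0) + 1

# turn each sum into an average (only values change, so this is safe while iterating)
for k in dnew:
    dnew[k] /= dcnt[k]

print(dnew)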

The output will be as follows:

{'Chicago': 48.400000000000006, 'Dallas': 70.8, 'Paris': 56.43333333333334}

Below are two one-liner versions: a simple one using itertools.groupby, and a more complex one that uses no extra modules.

import itertools

d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}

print({k : sum(e[1] for e in lg) / len(lg) for k, g in itertools.groupby(sorted(d.items()), lambda e: e[0][0]) for lg in (list(g),)})
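
For readers who find the one-liner dense, here is roughly the same logic unrolled into a loop. It is a sketch with variable names chosen for illustration. The sort matters because `itertools.groupby` only groups consecutive items that share a key.

```python
import itertools

d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}

# Sorting by the (city, year) key puts all entries for a city next to each other.
items = sorted(d.items())

city_averages = {}
for city, group in itertools.groupby(items, key=lambda item: item[0][0]):
    # group is a one-shot iterator, so materialize its values before using len()
    values = [value for _key, value in group]
    city_averages[city] = sum(values) / len(values)

print(city_averages)
```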

The next one-liner uses no modules at all (not even itertools), just plain Python. It has the same time complexity as the itertools.groupby version above, since both sort the items first. It is meant for recreational purposes, or for when you really need a one-liner with no imports:

d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}

print({k : sm / cnt  for sd in (sorted(d.items()),) for i in range(len(sd)) for k, cnt, sm in ((sd[i][0][0] if i + 1 >= len(sd) or sd[i][0][0] != sd[i + 1][0][0] else None,) + ((1, sd[i][1]) if i == 0 or sd[i - 1][0][0] != sd[i][0][0] else (cnt + 1, sm + sd[i][1])),) if k is not None})
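
The idea behind this one-liner is a single scan over the sorted items that keeps a running count and sum and emits an average whenever the city changes. The loop below sketches that idea in a more readable form. It is not a literal translation of the comprehension, and the variable names are chosen here for illustration.

```python
d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}

city_averages = {}
prev_city = None
count = 0
total = 0.0

for (city, _year), value in sorted(d.items()):
    if city != prev_city:
        # A new city starts: store the finished city's average, then reset.
        if prev_city is not None:
            city_averages[prev_city] = total / count
        prev_city, count, total = city, 0, 0.0
    count += 1
    total += value

# Store the last city, which has no following city to trigger the flush.
if prev_city is not None:
    city_averages[prev_city] = total / count

print(city_averages)
```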
