Python - Averaging items in a list of lists

I have a list of lists like so

[[name1, 10.10], [name2, 12.12], [name1, 9.90], [name3, 22.20], [name3, 7.70]]

I want to search through the outer list for the inner lists that share the same first element, average their second elements, and append each average to a new list, like so:

[[name1, 10.00], [name2, 12.12], [name3, 14.95]]

The problem is that I don't know how to search through the lists to do this. I'm very new to Python; can someone help?

You can use a dictionary to store each name with its corresponding values:

>>> from __future__ import division
>>> l=[['name1', 10.1], ['name2', 12.12], ['name1', 9.9], ['name3', 22.2], ['name3', 7.70]]
>>> d={}
>>> for i in l:
...     d.setdefault(i[0],[]).extend(i[1:])
... 
>>> d
{'name2': [12.12], 'name3': [22.2, 7.7], 'name1': [10.1, 9.9]}
>>> [[i,sum(j)/len(j)] for i,j in d.items()]
[['name2', 12.12], ['name3', 14.95], ['name1', 10.0]]

Note that this answer also works if you have more than one number in your sublists.
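
To illustrate that note, here is a minimal sketch with a hypothetical input, `rows`, in which some sublists carry more than one number. The names and values are made up for illustration; the grouping step is the same `setdefault(...).extend(...)` call used in the session above.

```python
# Hypothetical input: some sublists carry more than one number after the name.
rows = [['name1', 10.0, 20.0], ['name2', 12.0], ['name1', 30.0]]

groups = {}
for row in rows:
    # row[0] is the name; row[1:] is every number that follows it.
    groups.setdefault(row[0], []).extend(row[1:])

# Average all the numbers collected for each name.
averages = [[name, sum(values) / len(values)] for name, values in groups.items()]
print(averages)
```

Here name1 collects 10.0, 20.0 and 30.0 across its two sublists, so its average is 20.0, while name2 keeps its single value of 12.0.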

But for this case, as I wrote before the edit, you can simply do:

>>> from __future__ import division
>>> l=[['name1', 10.1], ['name2', 12.12], ['name1', 9.9], ['name3', 22.2], ['name3', 7.70]]
>>> d={}
>>> for i,j in l:
...     d.setdefault(i,[]).append(j)
... 
>>> d
{'name2': [12.12], 'name3': [22.2, 7.7], 'name1': [10.1, 9.9]}
>>> [[i,sum(j)/len(j)] for i,j in d.items()]
[['name2', 12.12], ['name3', 14.95], ['name1', 10.0]]

You can use a simple function to loop over the items:

def averageItems(items):
    averages = {}
    for name, data in items:
        averages.setdefault(name, []).append(data)
    for name, data in averages.items():
        averages[name] = sum(data) / len(data)
    return averages

Then use your list:

data = [['name1', 10.10], ['name2', 12.12], ['name1', 9.90], ['name3', 22.20], ['name3', 7.70]]
dataAverages = averageItems(data)  # {'name3': 14.95, 'name2': 12.12, 'name1': 10.0}

  1. Build a dictionary whose values are lists of numbers, using the dictionary's .setdefault() method.
  2. Build a list, using the built-ins sum and len to compute the mean.

Using the IPython interpreter:

In [1]: l = [['name1', 10.10], ['name2', 12.12], ['name1', 9.90], ['name3', 22.20], ['name3', 7.70]]
In [2]: d = {}
In [3]: for k, v in l: d[k]=d.setdefault(k,[])+[v]
In [4]: [[k,sum(d[k])/len(d[k])] for k in d]
Out[4]: [['name2', 12.12], ['name3', 14.95], ['name1', 10.0]]
In [5]: del d

Prompted by Kevin's comment on the question about a possible requirement to preserve the order of the labels in the original list, I'd suggest using an OrderedDict from the collections module:

In [19]: from collections import OrderedDict
In [20]: d = OrderedDict()
In [21]: for k, v in l: d[k]=d.setdefault(k,[])+[v]
In [22]: [[k,sum(d[k])/len(d[k])] for k in d]
Out[22]: [['name1', 10.0], ['name2', 12.12], ['name3', 14.95]]
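
As a side note, on Python 3.7 and later the built-in `dict` also preserves insertion order, so a plain dictionary should give the same ordering there. A minimal sketch, assuming such an interpreter (the variable names `pairs`, `grouped` and `ordered_averages` are illustrative and not from the session above):

```python
pairs = [['name1', 10.10], ['name2', 12.12], ['name1', 9.90],
         ['name3', 22.20], ['name3', 7.70]]

grouped = {}  # assumption: Python 3.7+, where a plain dict keeps insertion order
for name, value in pairs:
    grouped.setdefault(name, []).append(value)

ordered_averages = [[name, sum(values) / len(values)]
                    for name, values in grouped.items()]

# The names come out in the order they first appeared in pairs.
print([name for name, _ in ordered_averages])
```

On older interpreters the OrderedDict version above remains the safer choice.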

You can use a collections.defaultdict to collect all the scores for each name in a list and then, if you have Python >= 3.4, use statistics.mean to calculate the average:

from collections import defaultdict
from statistics import mean

l = [['name1', 10.10], ['name2', 12.12], ['name1', 9.90], ['name3', 22.20], ['name3', 7.70]]


details = defaultdict(list)

for name, score in l:
    details[name].append(score)

If you want to keep the dict structure just update the values:

for name, scores in details.items():
    details[name] = mean(scores)

print(details)
# defaultdict(<class 'list'>, {'name3': 14.95, 'name1': 10.0, 'name2': 12.12})

Or create a list using a list comprehension:

print([[name, mean(scores)] for name, scores in details.items()])
# [['name1', 10.0], ['name3', 14.95], ['name2', 12.12]]

Without mean, you can simply calculate it yourself:

print([[name, sum(scores) / len(scores)] for name, scores in details.items()])

If the order matters, use a collections.OrderedDict:

from collections import OrderedDict
details = OrderedDict()

for name, score in l:
    details.setdefault(name,[])
    details[name].append(score)

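print([[name, sum(scores) / len(scores)] for name, scores in details.items()])

from collections import defaultdict
from functools import reduce  # reduce is not a builtin in Python 3
from operator import add

d = defaultdict(list)
pairs = [['name1', 10.10], ['name2', 12.12], ['name1', 9.90], ['name3', 22.20], ['name3', 7.70]]

# Collect every value under its name.
for name, val in pairs:
    d[name].append(val)

# Sum each list with reduce, then divide by its length to get the mean.
print([(name, reduce(add, vals) / len(vals)) for name, vals in d.items()])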

I think this should work, and it's fairly clean too. We create a defaultdict and append each value for each name to a list, then reduce those down by adding, and then divide by length to get an average.
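
If storing every value is not needed, one alternative (not what the answer above does) is to keep only a running total and a count per name. A minimal sketch, where `running_averages` is a hypothetical helper name chosen for illustration:

```python
from collections import defaultdict


def running_averages(pairs):
    """Average the values per name without storing the individual values."""
    totals = defaultdict(float)  # name -> sum of the values seen so far
    counts = defaultdict(int)    # name -> number of values seen so far
    for name, value in pairs:
        totals[name] += value
        counts[name] += 1
    # Every name in totals was counted at least once, so the division is safe.
    return [[name, totals[name] / counts[name]] for name in totals]


pairs = [['name1', 10.10], ['name2', 12.12], ['name1', 9.90],
         ['name3', 22.20], ['name3', 7.70]]
print(running_averages(pairs))
```

This trades the per-name lists for two small dictionaries, which may matter when each name has many values.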


 