
Python - Averaging items in a list of lists

I have a list of lists like so:

[[name1, 10.10], [name2, 12.12], [name1, 9.90], [name3, 22.20], [name3, 7.70]]

I want to search the outer list for the inner lists that share the same first element, average their second elements, and append each average to a new list, like so:

[[name1, 10.00], [name2, 12.12], [name3, 14.95]]

The problem is that I don't know how to search through the lists to do this. I'm very new to Python; can someone help?

You can use a dictionary to store every name with its corresponding values:

>>> from __future__ import division
>>> l=[['name1', 10.1], ['name2', 12.12], ['name1', 9.9], ['name3', 22.2], ['name3', 7.70]]
>>> d={}
>>> for i in l:
...     d.setdefault(i[0],[]).extend(i[1:])
... 
>>> d
{'name2': [12.12], 'name3': [22.2, 7.7], 'name1': [10.1, 9.9]}
>>> [[i,sum(j)/len(j)] for i,j in d.items()]
[['name2', 12.12], ['name3', 14.95], ['name1', 10.0]]

Note that this answer works even if your sublists contain more than one number.

But for this case, as I wrote before the edit, you can simply do:

>>> from __future__ import division
>>> l=[['name1', 10.1], ['name2', 12.12], ['name1', 9.9], ['name3', 22.2], ['name3', 7.70]]
>>> d={}
>>> for i,j in l:
...     d.setdefault(i,[]).append(j)
... 
>>> d
{'name2': [12.12], 'name3': [22.2, 7.7], 'name1': [10.1, 9.9]}
>>> [[i,sum(j)/len(j)] for i,j in d.items()]
[['name2', 12.12], ['name3', 14.95], ['name1', 10.0]]

You can use a simple function to loop over the items:

def averageItems(items):
    averages = {}
    for name, data in items:
        averages.setdefault(name, []).append(data)
    for name, data in averages.items():
        averages[name] = sum(data) / len(data)
    return averages

Then use it with your list:

data = [['name1', 10.10], ['name2', 12.12], ['name1', 9.90], ['name3', 22.20], ['name3', 7.70]]
dataAverages = averageItems(data)  # {'name3': 14.95, 'name2': 12.12, 'name1': 10.0}

  1. Build a dictionary whose values are lists of numbers, using the .setdefault() method of dictionaries.
  2. Build a list, using the builtins sum and len to compute the mean.

Using the IPython interpreter:

In [1]: l = [['name1', 10.10], ['name2', 12.12], ['name1', 9.90], ['name3', 22.20], ['name3', 7.70]]
In [2]: d = {}
In [3]: for k, v in l: d[k]=d.setdefault(k,[])+[v]
In [4]: [[k,sum(d[k])/len(d[k])] for k in d]
Out[4]: [['name2', 12.12], ['name3', 14.95], ['name1', 10.0]]
In [5]: del d
In [6]: 

Prompted by Kevin's comment on the question about the eventual requirement to preserve the order of labels in the original list, I'd suggest using an OrderedDict from the collections module:

In [19]: from collections import OrderedDict
In [20]: d = OrderedDict()
In [21]: for k, v in l: d[k]=d.setdefault(k,[])+[v]
In [22]: [[k,sum(d[k])/len(d[k])] for k in d]
Out[22]: [['name1', 10.0], ['name2', 12.12], ['name3', 14.95]]
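
On Python 3.7 and later, plain dictionaries are guaranteed to preserve insertion order, so an OrderedDict may not be needed there. A minimal sketch, assuming such an interpreter (the variable names are illustrative):

```python
data = [['name1', 10.10], ['name2', 12.12], ['name1', 9.90], ['name3', 22.20], ['name3', 7.70]]

grouped = {}
for name, value in data:
    # Keys are created in the order the names first appear in data.
    grouped.setdefault(name, []).append(value)

# On Python 3.7+, iterating a dict yields keys in insertion order.
names_in_order = list(grouped)
```

On older interpreters, the OrderedDict version above remains the safe choice.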

You can use a collections.defaultdict to store all the scores for each name in a single list and then, if you have Python >= 3.4, use statistics.mean to calculate the average:

from collections import defaultdict
from statistics import mean

l = [['name1', 10.10], ['name2', 12.12], ['name1', 9.90], ['name3', 22.20], ['name3', 7.70]]


details = defaultdict(list)

for name, score in l:
    details[name].append(score)

If you want to keep the dict structure, just update the values:

for name, scores in details.items():
    details[name] = mean(scores)

print(details)
defaultdict(<class 'list'>, {'name3': 14.95, 'name1': 10.0, 'name2': 12.12})

Or create a list using a list comprehension:

print([[name ,mean(scores)] for name,scores in details.items()])
[['name1', 10.0], ['name3', 14.95], ['name2', 12.12]]

Obviously, without using mean you can simply calculate it yourself:

print([[name , sum(scores)/len(scores)] for name,scores in details.items()])
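
The two ways of computing the average should agree up to floating-point rounding. A minimal sketch for checking that on one group, assuming Python >= 3.5 for math.isclose (the variable names are illustrative, not from the answer):

```python
from math import isclose
from statistics import mean

scores = [22.20, 7.70]

manual = sum(scores) / len(scores)  # plain float arithmetic
library = mean(scores)              # statistics.mean from the standard library

# Compare with a tolerance rather than ==, since the two may round differently.
same = isclose(manual, library)
```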

If the order matters, then use a collections.OrderedDict:

from collections import OrderedDict
details = OrderedDict()

for name, score in l:
    details.setdefault(name,[])
    details[name].append(score)

print([[name , sum(scores)/len(scores)] for name,scores in details.items()])

from collections import defaultdict
from functools import reduce  # reduce is no longer a builtin in Python 3
from operator import add

d = defaultdict(list)
pairs = [['name1', 10.10], ['name2', 12.12], ['name1', 9.90], ['name3', 22.20], ['name3', 7.70]]

# Group the values by name
for name, val in pairs:
    d[name].append(val)
# Sum each group with reduce, then divide by its length
print([(name, reduce(add, vals)/len(vals)) for name, vals in d.items()])

I think this should work, and it's fairly clean too. We create a defaultdict and append each value for each name to a list, then reduce each list by adding its values, and divide by the length to get the average.
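
The reduce step described above is a running sum, so for a list of numbers it should give the same total as the builtin sum. A small sketch on one group (the `vals` list is taken from the example data; the other names are illustrative):

```python
from functools import reduce
from math import isclose
from operator import add

vals = [22.2, 7.7]

total_reduce = reduce(add, vals)  # add(22.2, 7.7), folding left to right
total_sum = sum(vals)             # builtin equivalent for numbers

average = total_reduce / len(vals)
totals_match = isclose(total_reduce, total_sum)
```

For plain numeric lists, sum is usually the more readable choice; reduce is more useful when the combining operation is something other than addition.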
