Get all elements in a list where the value is equal to a certain value
I have a list which looks like this:
[[3, 4.6575, 7.3725],
[3, 3.91, 5.694],
[2, 3.986666666666667, 6.6433333333333335],
[1, 3.9542857142857137, 5.674285714285714],....]
I would like to sum (in fact take the mean, but that is a detail) the values of all rows whose first elements are equal. In the example above, this means the first two rows would be summed together:
[[3, 8.5675, 13.0665],
[2, 3.986666666666667, 6.6433333333333335],
[1, 3.9542857142857137, 5.674285714285714],....]
This means the first values should be unique.
I thought of doing this by finding all the "rows" where the first value is equal to, for example, 1 and summing them together. My question is now: how can I find all the rows where the first value is equal to a certain value?
There are many ways to do something like this in Python. If your list is called a, you can use a list comprehension to get the indices of the rows whose first column is equal to value:
rows = [i for i in range(0,len(a)) if a[i][0]==value]
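As a minimal sketch of how those indices might then be used, the following picks out the matching rows and adds them up column by column. The sample data is the four rows shown in the question; the names matching and total are illustrative only.

```python
# Sample data from the question (only the four rows that are shown).
a = [[3, 4.6575, 7.3725],
     [3, 3.91, 5.694],
     [2, 3.986666666666667, 6.6433333333333335],
     [1, 3.9542857142857137, 5.674285714285714]]
value = 3

# Indices of the rows whose first element equals `value`.
rows = [i for i in range(len(a)) if a[i][0] == value]

# The matching rows themselves (hypothetical helper name).
matching = [a[i] for i in rows]

# Column-wise sum of everything after the first element;
# zip(*matching) turns the list of rows into a list of columns.
total = [value] + [sum(col) for col in list(zip(*matching))[1:]]

print(rows)
print(total)
```

Dividing each summed column by len(matching) would give the mean instead of the sum.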
However, I'm sure there are whole libraries out there that process arrays or lists of any dimension to retrieve all kinds of statistical data. The large number of libraries available is one of the many things that make developing with Python such a fantastic experience.
This should work:
lst = [[3, 4.6575, 7.3725],
[3, 3.91, 5.694],
[2, 3.986666666666667, 6.6433333333333335],
[1, 3.9542857142857137, 5.674285714285714]]
# group the values in a dictionary
import collections
d = collections.defaultdict(list)
for item in lst:
    d[item[0]].append(item)
# sum the columns of each group, skipping the key column
for key, value in sorted(d.items()):
    print([key] + list(map(sum, list(zip(*value))[1:])))
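Or, a bit cleaner, using itertools.groupby. Note that groupby only merges consecutive rows with equal keys, so rows with the same first element must be adjacent (for example, by sorting the list first):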
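import itertools
# sort by the first element so that equal keys are adjacent
first = lambda row: row[0]
groups = itertools.groupby(sorted(lst, key=first), first)
for key, value in groups:
    print([key] + list(map(sum, list(zip(*value))[1:])))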
Output, in both cases:
[1, 3.9542857142857137, 5.674285714285714]
[2, 3.986666666666667, 6.6433333333333335]
[3, 8.567499999999999, 13.0665]
If you want to calculate the mean instead of the sum, just define your own mean function and pass it to map instead of sum:
mean = lambda x: sum(x) / float(len(x))
map(mean, zip...)
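Putting the grouping and the mean together, a complete version might look like the sketch below. It assumes the same lst as above; group_means is a hypothetical helper name, not something from the answer.

```python
import collections

lst = [[3, 4.6575, 7.3725],
       [3, 3.91, 5.694],
       [2, 3.986666666666667, 6.6433333333333335],
       [1, 3.9542857142857137, 5.674285714285714]]


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / float(len(values))


def group_means(rows):
    """Average all columns after the first, grouped by the first element."""
    groups = collections.defaultdict(list)
    for row in rows:
        # Store only the value columns; the key is the dictionary key.
        groups[row[0]].append(row[1:])
    # zip(*members) turns the rows of one group into columns.
    return [[key] + [mean(col) for col in zip(*members)]
            for key, members in sorted(groups.items())]


for row in group_means(lst):
    print(row)
```

Groups with a single row come back unchanged, since the mean of one value is the value itself.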
>>> from functools import reduce  # required on Python 3; also available on Python 2.6+
>>> from itertools import groupby
>>> alist
[[3, 4.6575, 7.3725], [3, 3.91, 5.694], [2, 3.986666666666667, 6.6433333333333335], [1, 3.9542857142857137, 5.674285714285714]]
>>> [reduce(lambda x, y: [key, x[1]+y[1], x[2]+y[2]], group) for key, group in groupby(alist, lambda x:x[0])]
[[3, 8.567499999999999, 13.0665], [2, 3.986666666666667, 6.6433333333333335], [1, 3.9542857142857137, 5.674285714285714]]
I just offer another solution using a list comprehension, groupby and reduce. Note that reduce has to be imported from functools in Python 3.x.
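For reference, a Python 3 flavoured sketch of the same idea is shown below. Two things are assumptions rather than part of the answer: the sample rows are deliberately reordered so that equal keys are not adjacent, and the helper add_rows is a hypothetical name that generalises the lambda to any number of columns.

```python
from functools import reduce  # required on Python 3
from itertools import groupby
from operator import itemgetter

# Same rows as in the question, reordered so the two key-3 rows are apart.
alist = [[3, 4.6575, 7.3725],
         [2, 3.986666666666667, 6.6433333333333335],
         [3, 3.91, 5.694],
         [1, 3.9542857142857137, 5.674285714285714]]

first = itemgetter(0)


def add_rows(x, y):
    """Keep the shared key and add the remaining columns pairwise."""
    return [x[0]] + [p + q for p, q in zip(x[1:], y[1:])]


# groupby only merges adjacent rows, so sort by the key first.
summed = [reduce(add_rows, group)
          for _, group in groupby(sorted(alist, key=first), key=first)]

print(summed)
```

Without the sorted call, the two rows starting with 3 would end up in separate groups here, because groupby starts a new group every time the key changes.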