How to add values in one array according to repeated values in another array?
Suppose I have two arrays:
Values = np.array([0.221,0.35,25.9,54.212,0.0022])
Indices = np.array([22,10,11,22,10])
I would like to add together the elements of 'Values' that share the same number in 'Indices'.
In other words, my desired outputs are:
Total = np.array([0.221+54.212,0.35+0.0022,25.9])
Index = np.array([22,10,11])
I've been trying to use np.unique, to no avail. I can't quite figure this out!
We can use np.unique with its optional argument return_inverse to get IDs based on uniqueness within Indices, and then use those IDs with np.bincount to get binned (ID-based) summations, solving it like so:
Index,idx = np.unique(Indices, return_inverse=True)
Total = np.bincount(idx, Values)
Output for the given sample:
In [32]: Index
Out[32]: array([10, 11, 22])
In [33]: Total
Out[33]: array([ 0.3522, 25.9 , 54.433 ])
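Note that np.unique returns the labels sorted, whereas the desired output in the question lists them in order of first appearance (22, 10, 11). If that order matters, one possible way to recover it is to also request return_index and reorder by it. The following is a minimal, self-contained sketch; the names first_pos, order, index_first and total_first are illustrative and not part of the original answer.

```python
import numpy as np

Values = np.array([0.221, 0.35, 25.9, 54.212, 0.0022])
Indices = np.array([22, 10, 11, 22, 10])

# Index:     sorted unique labels
# first_pos: position of each label's first occurrence in Indices
# idx:       for every element of Indices, the ID (position in Index) of its label
Index, first_pos, idx = np.unique(Indices, return_index=True, return_inverse=True)

# Sum the entries of Values that share the same ID.
Total = np.bincount(idx, weights=Values)

# Reorder both results so labels appear in order of first occurrence.
order = np.argsort(first_pos)
index_first = Index[order]
total_first = Total[order]

print(index_first)
print(total_first)
```

Here Index and Total hold the sorted result shown above, while index_first and total_first hold the same sums rearranged to follow the order in which each label first occurs in Indices.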
Alternatively, we can use pandas.factorize to get the unique IDs and then apply bincount as shown earlier. So, the first step could be replaced by something like this:
import pandas as pd
idx,Index = pd.factorize(Indices)
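Put together with the bincount step, the factorize variant might look like the sketch below. By default pd.factorize does not sort, so the unique labels come back in order of first appearance, which happens to match the order asked for in the question.

```python
import numpy as np
import pandas as pd

Values = np.array([0.221, 0.35, 25.9, 54.212, 0.0022])
Indices = np.array([22, 10, 11, 22, 10])

# idx:   integer code per element (0 for the first distinct label seen, 1 for the next, ...)
# Index: the distinct labels, in order of first appearance
idx, Index = pd.factorize(Indices)

# Sum the entries of Values that share the same code.
Total = np.bincount(idx, weights=Values)

print(Index)
print(Total)
```

Passing sort=True to pd.factorize would give the labels in sorted order instead, in line with the np.unique approach.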
One possibility is to consider using Pandas:
In [14]: import pandas as pd
In [15]: pd.DataFrame({'Values': Values, 'Indices': Indices}).groupby('Indices').agg(sum)
Out[15]:
         Values
Indices
10       0.3522
11      25.9000
22      54.4330
This should be self-explanatory, though it doesn't preserve the order of indices (it's not entirely clear from the question whether you care about that).
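If the original order does matter, a possible variation is to pass sort=False to groupby, which keeps the groups in order of first appearance. This is a sketch rather than part of the original answer; the variable name result is illustrative.

```python
import numpy as np
import pandas as pd

Values = np.array([0.221, 0.35, 25.9, 54.212, 0.0022])
Indices = np.array([22, 10, 11, 22, 10])

df = pd.DataFrame({'Values': Values, 'Indices': Indices})

# sort=False keeps groups in the order their keys first appear in the data.
result = df.groupby('Indices', sort=False)['Values'].sum()

Index = result.index.to_numpy()   # group labels
Total = result.to_numpy()         # per-group sums

print(result)
```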