How to insert values into numpy array with groupby summation

Question

I have an empty numpy array a, an array of values v to be inserted, and an array of indices i giving the positions where those values should go. I want to insert the values from v into a at the indices i. When the values in i are unique, this is simply a[i] = v.

How can I do this when i contains duplicates and I want the values at duplicate indices to be summed?

With duplicate indices in i, only the last occurrence is used:

from numpy import *
a = zeros(5)
i = array([1, 1, 2, 3])
v = array([10, 20, 30, 40])
a[i] = v
print(a) # [ 0. 20. 30. 40.  0.]

A loop over i works, but it is slow:

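a[:] = 0  # reset a, otherwise the sums are added on top of the assignment above
for j1, j2 in enumerate(i):
    a[j2] += v[j1]  # accumulate each value at its target index
print(a) # [ 0. 30. 30. 40.  0.]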

An algorithm that iteratively finds, uses, and removes the unique values in i seems too complex for such a simple task.

How to do this summation without a loop?

Answer 1

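A similar problem was discussed here: Add multiple values to one numpy array index

The answer is to use the unbuffered ufunc method add.at, which accumulates every occurrence of a repeated index in place:

add.at(a, i, v)  # np.add.at(a, i, v) when numpy is imported as np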
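
As a minimal sketch of why add.at is needed here, the snippet below compares it with the buffered form a[i] += v on the arrays from the question. The name buffered is introduced only for this illustration.

```python
import numpy as np

a = np.zeros(5)
i = np.array([1, 1, 2, 3])
v = np.array([10, 20, 30, 40])

# Buffered fancy-index addition: for the repeated index 1,
# only the last value (20) survives.
buffered = a.copy()  # hypothetical name, used only for the comparison
buffered[i] += v

# Unbuffered ufunc.at: every occurrence of an index is accumulated in place.
np.add.at(a, i, v)

print(buffered)  # [ 0. 20. 30. 40.  0.]
print(a)         # [ 0. 30. 30. 40.  0.]
```

Because add.at modifies a in place, it also works when a already holds values that the sums should be added to.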

Answer 2

The answer proposed by @Anton is pretty good. You can also use np.bincount with the weights argument, a built-in function suited to this purpose. Note that it returns a new array rather than modifying a in place:

import numpy as np
a = np.bincount(i, weights=v, minlength=5)  # sum the weights v for each index in i
print(a) # [ 0. 30. 30. 40.  0.]
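
Since np.bincount builds a fresh array, a target that already holds values needs one extra addition. The following is a sketch under two assumptions that are not in the original question: a is non-empty (filled with ones here), and every index in i is a non-negative integer smaller than len(a). The name sums is illustrative.

```python
import numpy as np

a = np.ones(5)  # assumed non-empty target, for illustration only
i = np.array([1, 1, 2, 3])
v = np.array([10, 20, 30, 40])

# bincount sums the weights v per index; minlength pads the result to len(a).
sums = np.bincount(i, weights=v, minlength=len(a))

# Add the per-index sums onto the existing contents of a.
a += sums

print(sums)  # [ 0. 30. 30. 40.  0.]
print(a)     # [ 1. 31. 31. 41.  1.]
```

If i could contain an index of len(a) or larger, the result of bincount would be longer than a and the addition would fail, so that case would need a separate check.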

Equivalent pandas groupby solution:

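import pandas as pd
a = np.zeros(5)  # start again from an empty array
df = pd.DataFrame(v).groupby(i).sum()  # one row per unique index, holding the sum of its values
a[df.index] = df.to_numpy().flatten()
print(a) # [ 0. 30. 30. 40.  0.]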

np.diff or np.searchsorted can also be used to achieve this, but I find the solutions above more readable.
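
The answer does not spell out the np.diff / np.searchsorted variant, so the following is only one possible reading of it, not necessarily what the author had in mind: sort the values by index, take a running total, and difference that total at the boundaries of each index's run. The names order, i_sorted, csum and edges are illustrative.

```python
import numpy as np

a = np.zeros(5)
i = np.array([1, 1, 2, 3])
v = np.array([10, 20, 30, 40])

# Sort the values by their target index so equal indices are adjacent.
order = np.argsort(i, kind="stable")
i_sorted = i[order]

# Running total of the sorted values, with a leading 0.
csum = np.concatenate(([0], np.cumsum(v[order])))

# For each slot k of a, edges[k] is where the run of index k starts
# in i_sorted, and edges[k + 1] is where it ends.
edges = np.searchsorted(i_sorted, np.arange(len(a) + 1))

# The difference of the running total across each run is that run's sum.
a[:] = np.diff(csum[edges])

print(a)  # [ 0. 30. 30. 40.  0.]
```

This sketch overwrites a rather than adding to it, and it assumes all indices lie in the range 0 to len(a) - 1.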

2 answers

solution1
0 2020-10-21 08:39:20

solution2
0 2020-10-21 11:10:46
