Is there a way to vectorize applying the mean function to masked regions in an ndarray?

Question

Let's say I have two ndarrays defined as such:

import numpy as np
mask = np.array([[1,1],[1,2]])
values = np.array([[1., 3.],[2., 2.]])

My goal is to calculate the mean of the values within each mask region, where a region is the set of elements sharing the same integer in mask. Naturally, I would use a for-loop:

out = np.zeros(len(np.unique(mask)))
for j,i in enumerate(np.unique(mask)):
  out[j] = np.nanmean(values[mask==i])

However, this serialized solution becomes very slow for large, multidimensional arrays. Is there a way to vectorize this operation efficiently? Thank you for your help in advance!

Answer 1

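You can use np.bincount, with the inverse indices from np.unique as the bin labels and the values as weights:

unq, inv, cnt = np.unique(mask, return_inverse=True, return_counts=True)
# inv.ravel() keeps this working on NumPy versions where inv has mask's shape
np.bincount(inv.ravel(), weights=values.ravel()) / cnt
# array([2., 2.])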
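The loop in the question uses np.nanmean, while the bincount one-liner above would let a single NaN poison its whole group. A minimal sketch of a NaN-aware variant follows, assuming the labels are integers and the values are floats; the helper name grouped_nanmean is hypothetical and not part of NumPy.

```python
import numpy as np

def grouped_nanmean(mask, values):
    """Per-label mean of `values`, ignoring NaNs (hypothetical helper)."""
    labels, inv = np.unique(mask, return_inverse=True)
    inv = inv.ravel()  # flat group index for every element
    vals = np.asarray(values, dtype=float).ravel()
    valid = ~np.isnan(vals)  # exclude NaNs from both sums and counts
    sums = np.bincount(inv[valid], weights=vals[valid], minlength=labels.size)
    counts = np.bincount(inv[valid], minlength=labels.size)
    # A group with no valid entries divides 0 by 0 and comes out as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return labels, sums / counts

mask = np.array([[1, 1], [1, 2]])
values = np.array([[1.0, np.nan], [2.0, 2.0]])
labels, means = grouped_nanmean(mask, values)
```

Passing minlength keeps the output aligned with labels even when the highest-numbered groups contain only NaNs.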

Answer 1: score 2, accepted, 2020-07-14 00:20:52