
Is there a way to vectorize applying the mean function to masked regions in an ndarray?

Let's say I have two ndarrays defined as follows:

import numpy as np
mask = np.array([[1,1],[1,2]])
values = np.array([[1., 3.],[2., 2.]])

My goal is to calculate the mean of the values within each region of mask, where a region is the set of elements that share the same integer label. Naturally, I would use a for-loop:

labels = np.unique(mask)  # sorted region labels, computed once
out = np.zeros(len(labels))
for j, label in enumerate(labels):
    out[j] = np.nanmean(values[mask == label])

However, this loop becomes very slow for large, multidimensional arrays with many regions, because every iteration scans the whole mask. Is there a way to vectorize this operation efficiently? Thank you in advance for your help!

You can use np.bincount, weighted by the values, together with the inverse indices and counts returned by np.unique:

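# inv maps each element to the index of its label in unq; cnt holds each region's size
unq, inv, cnt = np.unique(mask, return_inverse=True, return_counts=True)
# Sum the values per region, then divide by the region sizes.
# inv is flattened because newer NumPy versions return it with the shape of mask.
np.bincount(inv.ravel(), weights=values.ravel()) / cnt
# array([2., 2.])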
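One difference from the loop in the question: np.nanmean skips NaN entries, while a plain weighted np.bincount propagates them into the region sum. The following is a minimal sketch of a NaN-ignoring variant, assuming that is the behavior wanted. The function name masked_region_means is hypothetical and not part of NumPy.

```python
import numpy as np

def masked_region_means(mask, values):
    """Return (labels, means): NaN-ignoring mean of `values` per label in `mask`.

    Hypothetical helper for illustration; not a NumPy function.
    """
    mask = np.asarray(mask)
    values = np.asarray(values, dtype=float)
    # Map every element to the index of its label among the sorted unique labels.
    labels, inv = np.unique(mask, return_inverse=True)
    inv = inv.ravel()      # np.bincount needs 1-D input
    flat = values.ravel()
    valid = ~np.isnan(flat)
    # Per-label sum and count over the non-NaN elements only.
    sums = np.bincount(inv[valid], weights=flat[valid], minlength=labels.size)
    counts = np.bincount(inv[valid], minlength=labels.size)
    # A label with no valid values gives 0/0 = NaN, matching np.nanmean's result.
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts
    return labels, means

mask = np.array([[1, 1], [1, 2]])
values = np.array([[1., np.nan], [2., 2.]])
labels, means = masked_region_means(mask, values)
# labels -> array([1, 2]); means -> array([1.5, 2. ])
```

Passing minlength keeps the output aligned with labels even when every value of the last region is NaN and therefore filtered out.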
