Let's say I have two ndarrays defined as such:
import numpy as np
mask = np.array([[1,1],[1,2]])
values = np.array([[1., 3.],[2., 2.]])
My goal is to calculate the mean of values within each region of mask, where a region is the set of elements sharing the same integer in mask. Naturally, I would use a for-loop:
out = np.zeros(len(np.unique(mask)))
for j, i in enumerate(np.unique(mask)):
    out[j] = np.nanmean(values[mask == i])
However, this serialized solution becomes very slow for large, multidimensional arrays. Is there a way to vectorize this operation efficiently? Thank you for your help in advance!
You can use np.bincount:
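unq, inv, cnt = np.unique(mask, return_inverse=True, return_counts=True)
# ravel() keeps this working on NumPy >= 2.0, where inv has the shape of mask
np.bincount(inv.ravel(), weights=values.ravel()) / cnt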
# array([2., 2.])
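One caveat: the loop in the question uses np.nanmean, while a plain np.bincount sum propagates NaN into any region that contains one. If the NaN-skipping behaviour matters, something along these lines may work. It is only a sketch, and the helper name region_nanmean is made up for illustration. It filters out NaN entries before both the weighted sum and the count, so each region's mean is taken over its valid elements only.

```python
import numpy as np

def region_nanmean(mask, values):
    """Mean of `values` per integer label in `mask`, ignoring NaNs.

    Hypothetical helper, not part of NumPy.
    """
    labels, inv = np.unique(mask, return_inverse=True)
    inv = inv.ravel()            # flat label index for each element
    flat = values.ravel()
    valid = ~np.isnan(flat)      # NaNs contribute to neither sum nor count
    sums = np.bincount(inv[valid], weights=flat[valid], minlength=labels.size)
    counts = np.bincount(inv[valid], minlength=labels.size)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts    # a region holding only NaNs yields NaN
    return labels, means

mask = np.array([[1, 1], [1, 2]])
values = np.array([[1.0, np.nan], [2.0, 2.0]])
labels, means = region_nanmean(mask, values)
# labels -> [1, 2], means -> [1.5, 2.0]
```

The minlength argument keeps the output aligned with labels even when the highest label has no valid elements left after filtering.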