I want to find the frequency of each element of a one-dimensional numpy array (arr1) in another one-dimensional numpy array (arr2). The array arr1 contains no repeated elements, and every element of arr1 is one of the unique elements of arr2.
Consider this example:
arr1 = np.array([1,2,6])
arr2 = np.array([2, 3, 6, 1, 2, 1, 2, 0, 2, 0])
At present, I am using the following:
freq = np.zeros(len(arr1))
for i in range(len(arr1)):
    # Indices in arr2 where the i-th element of arr1 occurs
    mark = np.where(arr2 == arr1[i])
    freq[i] = len(mark[0])
print(freq)
>>[2, 4, 1]
This method gives me the correct answer, but I want to know whether there is a better or more efficient approach than the one I am following.
Here's a vectorized solution based on np.searchsorted:
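# Position of each arr2 element within the sorted arr1
idx = np.searchsorted(arr1,arr2)
# Clamp out-of-range positions so that arr1[idx] stays valid
idx[idx==len(arr1)] = 0
# Keep only the arr2 elements that actually occur in arr1
mask = arr1[idx]==arr2
out = np.bincount(idx[mask])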
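As a quick check, the lines above can be run end to end on the sample arrays from the question. This is only a sketch: the `minlength` argument is an addition that is not part of the snippet above. It pads the result to one count per element of arr1, which would only matter if the trailing elements of arr1 never occurred in arr2.

```python
import numpy as np

arr1 = np.array([1, 2, 6])  # sorted, no repetitions
arr2 = np.array([2, 3, 6, 1, 2, 1, 2, 0, 2, 0])

# Left insertion position of every arr2 element in arr1
idx = np.searchsorted(arr1, arr2)
# Positions equal to len(arr1) would be out of bounds for arr1[idx]
idx[idx == len(arr1)] = 0
# True where the arr2 element really is present in arr1
mask = arr1[idx] == arr2
# minlength is an assumption added here, not part of the snippet above
out = np.bincount(idx[mask], minlength=len(arr1))

print(out)  # [2 4 1]
```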
This solution assumes arr1 is sorted. If it is not, there are two options:
1. Sort arr1 as a pre-processing step. Since arr1 is a subset of the unique elements of arr2, it should be a comparatively small array, so sorting it is inexpensive.
2. Use the sorter argument of searchsorted to compute idx:
sidx = arr1.argsort()
idx = sidx[np.searchsorted(arr1,arr2,sorter=sidx)]
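The sorter-based variant can be wrapped up as a small function, sketched below. The function name `count_occurrences` is hypothetical, and two details are assumptions added for robustness rather than part of the answer: the positions returned by searchsorted are clamped before they index `sidx` (otherwise an arr2 value larger than every element of arr1 would index past the end), and `minlength` pads the result to one count per element of arr1. The sketch assumes arr1 is non-empty.

```python
import numpy as np

def count_occurrences(arr1, arr2):
    """Count how often each element of arr1 occurs in arr2.

    Hypothetical helper: arr1 may be unsorted but must be non-empty
    and free of repetitions. Counts follow the original order of arr1.
    """
    arr1 = np.asarray(arr1)
    arr2 = np.asarray(arr2)
    sidx = arr1.argsort()  # permutation that sorts arr1
    pos = np.searchsorted(arr1, arr2, sorter=sidx)
    pos[pos == len(arr1)] = 0  # clamp before indexing sidx (added assumption)
    idx = sidx[pos]  # map sorted positions back to arr1's own order
    mask = arr1[idx] == arr2  # drop arr2 values that are absent from arr1
    return np.bincount(idx[mask], minlength=len(arr1))

# arr1 is deliberately unsorted; 9 is larger than every element of arr1
print(count_occurrences([6, 1, 2], [2, 3, 6, 1, 2, 1, 2, 0, 2, 0, 9]))  # [1 2 4]
```

Because idx refers to positions in the original arr1, the counts line up with arr1 as given, with no need to undo the sort afterwards.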