Counting occurrences of elements of one array in another array

I want to find the frequency of the elements of a given one-dimensional numpy array (arr1) in another one-dimensional numpy array (arr2). arr1 contains no repeated elements, and every element of arr1 is among the unique elements of arr2.

Consider this example:

import numpy as np

arr1 = np.array([1, 2, 6])
arr2 = np.array([2, 3, 6, 1, 2, 1, 2, 0, 2, 0])

At present, I am using the following:

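# Integer counts, one slot per element of arr1
freq = np.zeros(len(arr1), dtype=int)

for i in range(len(arr1)):
    # Indices in arr2 where the i-th element of arr1 occurs
    mark = np.where(arr2 == arr1[i])
    freq[i] = len(mark[0])

print(freq)
>>[2 4 1]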

This method gives the correct answer, but I want to know whether there is a better, more efficient approach.

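Here's a vectorized solution based on np.searchsorted:

# For each arr2 value, the position where it would be inserted into arr1
idx = np.searchsorted(arr1, arr2)
# Values larger than max(arr1) land past the end; point them at 0 (the mask rejects them)
idx[idx == len(arr1)] = 0
# Keep only the arr2 values that actually equal an element of arr1
mask = arr1[idx] == arr2
# Count how many matches each arr1 position received
out = np.bincount(idx[mask])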
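To make the steps concrete, here is a small walk-through on the arrays from the question. It is only a sketch for this particular input (arr1 already sorted), and the values in the comments are what those two arrays produce.

```python
import numpy as np

# Example arrays from the question; arr1 is already sorted here.
arr1 = np.array([1, 2, 6])
arr2 = np.array([2, 3, 6, 1, 2, 1, 2, 0, 2, 0])

# Insertion position of every arr2 value within arr1.
idx = np.searchsorted(arr1, arr2)
idx[idx == len(arr1)] = 0
print(idx)   # [1 2 2 0 1 0 1 0 1 0]

# True only where the arr2 value really equals arr1[idx];
# the 3 and the two 0s in arr2 are filtered out here.
mask = arr1[idx] == arr2

# One count per position of arr1.
out = np.bincount(idx[mask])
print(out)   # [2 4 1]
```

The result matches the loop from the question: 1 occurs twice, 2 four times, and 6 once.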

The searchsorted approach assumes arr1 is sorted. If it is not, there are two options:

  1. Sort arr1 as a pre-processing step. Since arr1 is a subset of the unique elements of arr2, it should be comparatively small, so sorting it is inexpensive. The resulting counts then follow the sorted order of arr1.

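  2. Use the sorter argument of searchsorted to compute idx, clipping the positions so that arr2 values larger than every element of arr1 do not index past the end of sidx:

    sidx = arr1.argsort()
    idx = sidx[np.searchsorted(arr1, arr2, sorter=sidx).clip(max=len(arr1) - 1)]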
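Putting the second option together, a minimal sketch might look like the following. The function name count_occurrences is hypothetical, and the use of minlength in np.bincount is my own addition (not part of the answer above) so that an arr1 element missing from arr2 still gets a count of 0. It assumes arr1 is non-empty and has no repeated elements.

```python
import numpy as np

def count_occurrences(arr1, arr2):
    """Count how often each element of arr1 appears in arr2.

    Hypothetical helper; arr1 may be unsorted but must be non-empty
    and free of duplicates. Counts are returned in arr1's own order.
    """
    arr1 = np.asarray(arr1)
    arr2 = np.asarray(arr2)

    # Indices that would sort arr1, so arr1 itself stays untouched.
    sidx = arr1.argsort()

    # Positions in the sorted view; clamp values past the end so that
    # arr2 entries larger than max(arr1) cannot index out of bounds.
    pos = np.searchsorted(arr1, arr2, sorter=sidx)
    pos = np.minimum(pos, len(arr1) - 1)

    # Map sorted positions back to positions in the original arr1.
    idx = sidx[pos]

    # Keep only genuine matches, then count per arr1 position.
    mask = arr1[idx] == arr2
    return np.bincount(idx[mask], minlength=len(arr1))

arr1 = np.array([6, 1, 2])                         # deliberately unsorted
arr2 = np.array([2, 3, 6, 1, 2, 1, 2, 0, 2, 0, 7])  # 7 exceeds max(arr1)
print(count_occurrences(arr1, arr2))               # [1 2 4]
```

Because idx refers to positions in the original arr1, the counts come back in the same order as arr1, with no need to reorder them afterwards.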
