numpy.unique sort based on counts

The numpy.unique function can return the counts of the unique elements if return_counts is True. The returned tuple then consists of two arrays, one containing the unique elements and the second containing their counts; both are ordered by the unique elements. Is there a way to have both sorted according to the counts array instead? I know how to do it the hard way, but is there a concise one-liner or lambda for such cases?

Current result:

import numpy as np

my_chr_list = ["a", "a", "a", "b", "c", "b", "d", "d"]
unique_els, counts = np.unique(my_chr_list, return_counts=True)
print(unique_els, counts)

Which returns something along the lines of this:

>>> (array(['a', 'b', 'c', 'd'], 
     dtype='<U1'), array([3, 2, 1, 2], dtype=int64))

However, what I would like to have:

>>> (array(['a', 'b', 'd', 'c'], 
     dtype='<U1'), array([3, 2, 2, 1], dtype=int64))

You can't do this directly with the unique function. Instead, as a Numpythonic approach, you can pass return_counts=True to get the count of each unique item, then use np.argsort on the negated counts to get the indices that order the counts from highest to lowest, and index both arrays with that result to list the items by frequency.

In [33]: arr = np.array(my_chr_list)

In [34]: u, count = np.unique(arr, return_counts=True)

In [35]: count_sort_ind = np.argsort(-count)

In [36]: u[count_sort_ind]
Out[36]: 
array(['a', 'b', 'd', 'c'], 
      dtype='<U1')

In [37]: count[count_sort_ind]
Out[37]: array([3, 2, 2, 1])
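
If this is needed in more than one place, the steps above could be wrapped in a small helper. The following is only a sketch: unique_by_count and its descending parameter are hypothetical names, not part of NumPy. It also passes kind="stable" to np.argsort, on the assumption that tied counts (here 'b' and 'd') should keep the order np.unique returned them in; the default sort kind does not promise that.

```python
import numpy as np


def unique_by_count(values, descending=True):
    """Return unique values and their counts, ordered by count.

    Hypothetical helper for illustration; not a NumPy function.
    """
    uniques, counts = np.unique(values, return_counts=True)
    # Negating the counts turns an ascending argsort into a descending one.
    key = -counts if descending else counts
    # A stable sort keeps tied counts in the order np.unique produced.
    order = np.argsort(key, kind="stable")
    return uniques[order], counts[order]


my_chr_list = ["a", "a", "a", "b", "c", "b", "d", "d"]
unique_els, counts = unique_by_count(my_chr_list)
print(unique_els, counts)
```

Calling unique_by_count(my_chr_list, descending=False) would give the least frequent items first instead.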
