
Get unique values in a list of numpy arrays

I have a list made up of arrays. All have shape (2,).

Minimal example: mylist = [np.array([1,2]),np.array([1,2]),np.array([3,4])]

I would like to get a unique list, e.g. [np.array([1,2]),np.array([3,4])]

or perhaps even better, a dict with counts, e.g. {np.array([1,2]) : 2, np.array([3,4]) : 1}

So far I tried list(set(mylist)), but the error is TypeError: unhashable type: 'numpy.ndarray'

As the error indicates, NumPy arrays aren't hashable. You can turn them into tuples, which are hashable, and build a collections.Counter from the result:

from collections import Counter

Counter(map(tuple,mylist))
# Counter({(1, 2): 2, (3, 4): 1})

If you only want the unique tuples, you can construct a set (wrap it in list() if you need a list):

set(map(tuple,mylist))
# {(1, 2), (3, 4)}
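
If the result should be a list of arrays again rather than tuples, the tuples can be converted back. This is only a minimal sketch using the mylist from the question. It uses dict.fromkeys instead of set purely to keep first-seen order, which is an assumption about what is wanted and not part of the answer above:

```python
import numpy as np

mylist = [np.array([1, 2]), np.array([1, 2]), np.array([3, 4])]

# Tuples are hashable, so they can serve as dict keys;
# dict.fromkeys drops duplicates while keeping first-seen order.
unique_tuples = list(dict.fromkeys(map(tuple, mylist)))

# Convert each unique tuple back into a NumPy array.
unique_arrays = [np.array(t) for t in unique_tuples]
print(unique_arrays)
```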

In general, the best option is to use the np.unique function with axis=0 and its optional return flags (here X is the list, or the equivalent 2-D array):

u, idx, counts = np.unique(X, axis=0, return_index=True, return_counts=True)

Then, according to the documentation:

  • u is an array of the unique rows
  • idx holds the indices of the first occurrence of each unique row in X
  • counts is the number of times each unique row appears in X
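
To make the three return values concrete, here is a small sketch applied to the list from the question. The name X follows the snippet above; stacking the list with np.array first is an assumption added for clarity, since np.unique also accepts the list directly:

```python
import numpy as np

mylist = [np.array([1, 2]), np.array([1, 2]), np.array([3, 4])]
X = np.array(mylist)  # shape (3, 2): one row per input array

u, idx, counts = np.unique(X, axis=0, return_index=True, return_counts=True)

print(u)       # unique rows, in sorted order
print(idx)     # index of the first occurrence of each unique row in X
print(counts)  # how many times each unique row occurs in X
```

Indexing X with idx gives back the same rows as u, which is a quick way to check what idx refers to.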

If you need a dictionary, you can't use unhashable values such as arrays as its keys, so you might store them as tuples, as in @yatu's answer, or like this:

dict(zip([tuple(n) for n in u], counts))
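
A self-contained variant of the line above, as a sketch. The .tolist() calls are an optional addition, used only so that the keys and counts are plain Python ints rather than NumPy scalars, and the name count_by_row is made up for illustration:

```python
import numpy as np

mylist = [np.array([1, 2]), np.array([1, 2]), np.array([3, 4])]
u, counts = np.unique(mylist, axis=0, return_counts=True)

# .tolist() converts NumPy scalars to plain Python ints, so the keys
# and values behave like ordinary Python objects.
count_by_row = {tuple(row): n for row, n in zip(u.tolist(), counts.tolist())}
print(count_by_row)  # {(1, 2): 2, (3, 4): 1}
```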

Use the following:使用以下内容:

import numpy as np
mylist = [np.array([1,2]),np.array([1,2]),np.array([3,4])]
np.unique(mylist, axis=0)

This gives the unique arrays as the rows of a 2-D array:

array([[1, 2],
       [3, 4]])

Source: https://numpy.org/devdocs/user/absolute_beginners.html#how-to-get-unique-items-and-counts

Pure numpy approach:

numpy.unique(mylist, axis=0)

which produces a 2-D array with your unique arrays as rows:

array([[1, 2],
       [3, 4]])

Works if all your arrays have the same length (as in your example). This solution can be useful depending on what you do earlier in your code: perhaps you do not need to drop into plain Python at all and can stick to numpy instead, which should be faster.
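
For inputs whose arrays differ in length, the tuple-based approach shown earlier on this page does not rely on stacking everything into one 2-D array. A minimal sketch, where the list ragged is made-up illustration data and not from the question:

```python
from collections import Counter

import numpy as np

# Hypothetical input whose arrays differ in length.
ragged = [np.array([1, 2]), np.array([1, 2, 3]), np.array([1, 2])]

# Each array becomes a tuple of plain Python ints, which is hashable.
counts = Counter(tuple(a.tolist()) for a in ragged)
print(counts)  # Counter({(1, 2): 2, (1, 2, 3): 1})
```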
