numpy.unique gives wrong output for list of sets

I have a list of sets given by:

sets1 = [{1},{2},{1}]

When I find the unique elements in this list using NumPy's unique, I get:

np.unique(sets1)
Out[18]: array([{1}, {2}, {1}], dtype=object)

As can be seen, the result is wrong, as {1} is repeated in the output.

When I change the order of the input so that equal elements are adjacent, this doesn't happen.

sets2 = [{1},{1},{2}]

np.unique(sets2)
Out[21]: array([{1}, {2}], dtype=object)

Why does this occur? Or is there something wrong with the way I have done it?

What happens here is that np.unique relies on the internal _unique1d function in NumPy (see the NumPy source code), which itself uses the .sort() method and then, as far as I can tell from the source, keeps only the elements that differ from their predecessor in the sorted array.
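The sort-then-compare idea can be sketched in plain Python. This is only a simplified illustration under my own assumptions, not NumPy's actual code: the helper name unique_via_sort is hypothetical, and it uses Python's built-in sorted rather than NumPy's sort.

```python
def unique_via_sort(items):
    """Hypothetical helper: sort, then drop items equal to their predecessor."""
    ordered = sorted(items)
    result = ordered[:1]  # the first element (if any) is always kept
    for prev, cur in zip(ordered, ordered[1:]):
        if cur != prev:  # only adjacent duplicates are detected
            result.append(cur)
    return result


print(unique_via_sort([3, 1, 3, 2]))     # [1, 2, 3]
print(unique_via_sort([{1}, {2}, {1}]))  # [{1}, {2}, {1}]
print(unique_via_sort([{1}, {1}, {2}]))  # [{1}, {2}]
```

With integers the sort brings duplicates together, so the adjacent comparison removes them. With sets the sort leaves the order untouched, so only duplicates that were already adjacent are removed, which matches both outputs in the question.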

Now, sorting a list of sets that each contain a single integer will not order the sets by the value of that integer, because < on sets tests for a proper subset rather than comparing values. So we will have (and that is not what we want):

sets = [{1},{2},{1}]
sets.sort()
print(sets)

# > [{1}, {2}, {1}]
# i.e. the list has not been "sorted" the way we want it to be
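The reason the sort leaves the list unchanged is that comparison operators on Python sets test subset relations rather than comparing values. A minimal check, using only built-in sets (the variable names are just for illustration):

```python
a, b = {1}, {2}

print(a < b)         # False: {1} is not a proper subset of {2}
print(b < a)         # False: {2} is not a proper subset of {1}
print({1} < {1, 2})  # True: {1} is a proper subset of {1, 2}

# Neither set is "less than" the other, so the sort sees nothing to reorder.
sets = [{1}, {2}, {1}]
print(sorted(sets))  # [{1}, {2}, {1}]
```

Since this is only a partial order, a sort cannot be relied on to bring equal sets next to each other.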

Now, as you have pointed out, if the list of sets is already ordered the way you want, np.unique will work (since the list is effectively sorted beforehand).

One specific solution (though please be aware that it only works for a list of sets that each contain a single integer) would then be:

np.unique(sorted(sets, key=lambda x: next(iter(x))))
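If the sets can contain more than one element, a different approach that avoids sorting altogether may be easier. The sketch below is an alternative I am suggesting, not part of the answer above: it converts each set to a frozenset (which is hashable) and uses dict.fromkeys to drop duplicates while keeping first-seen order. It assumes Python 3.7+ so that dicts preserve insertion order.

```python
sets = [{1}, {2}, {1}, {2, 3}, {3, 2}]

# frozenset is hashable and compares by content, so equal sets collapse
# into one dict key; dict.fromkeys keeps the first occurrence of each.
unique_sets = [set(fs) for fs in dict.fromkeys(frozenset(s) for s in sets)]

print(unique_sets)  # [{1}, {2}, {2, 3}]
```

The result is a plain Python list of sets rather than a NumPy array.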

That is because set is an unhashable type, and two equal sets are still distinct objects:

{1} is {1}  # will give False (is compares identity, not equality)

You can use Python's collections.Counter if you can convert each set to a tuple, like below:

from collections import Counter
sets1 = [{1},{2},{1}]
Counter([tuple(a) for a in sets1])
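One caveat with tuple(a): for sets with several elements, two equal sets are not guaranteed to iterate in the same order, so they could in principle produce different tuples. A variant of the same idea using frozenset avoids that. This is a sketch of my own rather than the code from the answer above.

```python
from collections import Counter

sets1 = [{1}, {2}, {1}]

# frozenset is hashable and compares by content, independent of iteration order
counts = Counter(frozenset(s) for s in sets1)

print(counts[frozenset({1})])  # 2
print(counts[frozenset({2})])  # 1

# The Counter keys are the unique sets, in first-seen order
unique_sets = [set(fs) for fs in counts]
print(unique_sets)  # [{1}, {2}]
```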
