Why aren't Python sets hashable?

I stumbled across a blog post detailing how to implement a powerset function in Python. So I went about trying my own way of doing it, and discovered that Python apparently cannot have a set of sets, since set is not hashable. This is irksome, since the definition of a powerset is that it is a set of sets, and I wanted to implement it using actual set operations.

>>> set([ set() ])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'

Is there a good reason Python sets are not hashable?

Generally, only immutable objects are hashable in Python. The immutable variant of set(), frozenset(), is hashable.
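For the powerset case in the question, a minimal sketch along these lines might work. The function name `powerset` and its iterative approach are my own illustration, not taken from the blog post mentioned above. It builds a set of frozensets using only set operations:

```python
def powerset(items):
    """Return the power set of `items` as a set of frozensets."""
    # Start with the set containing only the empty set.
    result = {frozenset()}
    for item in items:
        # frozenset | set yields a frozenset, so every subset stays hashable.
        result |= {subset | {item} for subset in result}
    return result


subsets = powerset({1, 2, 3})
print(len(subsets))                # 8
print(frozenset({1, 3}) in subsets)  # True
```

The outer container can stay a regular mutable set; only the members need to be frozensets.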

Because they're mutable.

If they were hashable, a hash could silently become "invalid", and that would pretty much make hashing pointless.
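To illustrate what "invalid" means here, consider a rough sketch with a hypothetical `HashableList` class (made up for this example, deliberately unsafe) that hashes by its current contents:

```python
class HashableList(list):
    """Hypothetical list subclass that hashes by content. Do not do this."""

    def __hash__(self):
        return hash(tuple(self))


key = HashableList([1, 2])
container = {key}
print(key in container)                   # True

key.append(3)                             # mutating the member changes its hash
print(key in container)                   # False: the stored hash is stale
print(HashableList([1, 2]) in container)  # False: the stored value no longer matches
print(list(container))                    # [[1, 2, 3]]
```

The object is still physically in the set, but it can no longer be found by its new value or by its old one. Built-in sets would have exactly this problem if they were hashable and then mutated.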

From the Python docs:

hashable
An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() or __cmp__() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

All of Python's immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal, and their hash value is their id().
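A small sketch of the last point, using hypothetical `Point` and `ValuePoint` classes that exist only for this example. Note that the quote above is from the Python 2 docs: in Python 3 the hash is derived from id() rather than equal to it, and a class that defines __eq__() without __hash__() becomes unhashable.

```python
class Point:
    """Hypothetical class relying on default identity-based equality and hashing."""

    def __init__(self, x, y):
        self.x, self.y = x, y


a, b = Point(1, 2), Point(1, 2)
print(a == b)       # False: default equality compares identity
print(len({a, b}))  # 2: both instances are hashable and distinct


class ValuePoint(Point):
    """Defines value equality but no __hash__, so Python 3 sets __hash__ to None."""

    def __eq__(self, other):
        return isinstance(other, ValuePoint) and (self.x, self.y) == (other.x, other.y)


try:
    hash(ValuePoint(1, 2))
    value_point_hashable = True
except TypeError:
    value_point_hashable = False
print(value_point_hashable)  # False
```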

In case this helps... if you really need to convert unhashable things into hashable equivalents for some reason, you might do something like this:

# Python 3: the abstract base classes live in collections.abc
from collections.abc import Hashable, MutableSet, MutableSequence, MutableMapping

def make_hashdict(value):
    """
    Inspired by https://stackoverflow.com/questions/1151658/python-hashable-dicts
     - with the added bonus that it inherits from the dict type of value
       so OrderedDicts maintain their order and other subclasses of dict() maintain their attributes
    """
    map_type = type(value)

    class HashableDict(map_type):
        def __hash__(self):
            # Hash the sorted items so that insertion order does not affect the hash.
            # This requires mutually orderable keys and hashable values.
            return hash(tuple(sorted(self.items())))

    return HashableDict(value)


def make_hashable(value):
    # Only convert values that are not already hashable.
    if not isinstance(value, Hashable):
        if isinstance(value, MutableSet):
            value = frozenset(value)
        elif isinstance(value, MutableSequence):
            value = tuple(value)
        elif isinstance(value, MutableMapping):
            value = make_hashdict(value)

    # Return outside the `if`, so already-hashable values pass through unchanged
    # instead of the function returning None for them.
    return value

my_set = set()
my_set.add(make_hashable(['a', 'list']))
my_set.add(make_hashable({'a': 1, 'dict': 2}))
my_set.add(make_hashable({'a', 'new', 'set'}))

print(my_set)

My HashableDict implementation is the simplest and least rigorous example from here. If you need a more advanced HashableDict that supports pickling and other things, check the many other implementations. In my version above I wanted to preserve the original dict class, thus preserving the order of OrderedDicts. I also use AttrDict from here for attribute-like access.

My example above is not in any way authoritative, just my solution to a similar problem where I needed to store some things in a set and needed to "hashify" them first.
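One caveat: the conversion above is shallow, so a list that contains another list becomes a tuple that still holds an unhashable list. If nested containers matter, a recursive variant could look roughly like this. `deep_freeze` is a hypothetical name, and unlike the version above it does not preserve the original dict class; it is only a sketch for built-in containers:

```python
def deep_freeze(value):
    """Recursively convert built-in containers into hashable equivalents."""
    if isinstance(value, dict):
        # A frozenset of (key, frozen value) pairs is order-insensitive,
        # so no sorting (and no orderable keys) is required.
        return frozenset((key, deep_freeze(item)) for key, item in value.items())
    if isinstance(value, (set, frozenset)):
        return frozenset(deep_freeze(item) for item in value)
    if isinstance(value, (list, tuple)):
        return tuple(deep_freeze(item) for item in value)
    # Anything else is assumed to be hashable already.
    return value


nested = [[1, 2], {'a': {3, 4}}]
frozen = deep_freeze(nested)
print(frozen in {frozen})  # True
```

The trade-off is that a frozen dict is no longer a mapping, so this only suits cases where the result is used purely as a key or set member.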
