Keys are not unique for a python dictionary!

A stupid newbie question here. For a Python dictionary q, len(set(q.keys())) != len(q.keys()). Is that even possible?

This can happen if you violate a requirement of dict and change a key's hash.

When an object is used as a key in a dict, its hash value must not change, and its equality to other objects must not change. Other properties may change, as long as they don't affect how it appears to the dict.

(This does not mean that a hash value is never allowed to change. That's a common misconception. Hash values themselves may change. It's only dict that requires the hashes of its keys to stay fixed, not __hash__ itself.)
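As a rough illustration of the two points above, here is a minimal sketch. The class Item and its attributes key and label are hypothetical names made up for this example; only key feeds __hash__ and __eq__. Changing key before the object goes into a dict is harmless, and changing label while it is a key is harmless too.

```python
class Item(object):
    """Hypothetical example class: hash and equality depend only on `key`."""
    def __init__(self, key, label):
        self.key = key
        self.label = label
    def __eq__(self, other):
        return isinstance(other, Item) and self.key == other.key
    def __ne__(self, other):
        # Needed on Python 2, harmless on Python 3.
        return not self == other
    def __hash__(self):
        return hash(self.key)

p = Item(1, "old")

# p is not a key in any dict yet, so changing its hash breaks nothing.
hash_before = hash(p)
p.key = 2
hash_changed = hash(p) != hash_before

d = {p: "value"}

# p is now a key, but label affects neither __hash__ nor __eq__,
# so the dict still finds it.
p.label = "new"
still_found = p in d and d[p] == "value"

print(hash_changed)
print(still_found)
```

What would not be safe is assigning to p.key after the line d = {p: "value"}, because that changes the hash of an object the dict already holds.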

The following code adds an object to a dict, then changes its hash out from under the dict. q[a] = 2 then adds a as a new key in the dict, even though it's already present; since the hash value changed, the dict doesn't find the old entry. This reproduces the peculiarity you saw.

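class Test(object):
    def __init__(self, h):
        self.h = h
    def __hash__(self):
        # The hash depends on a mutable attribute.
        return self.h

a = Test(1)
q = {}
q[a] = 1   # stored under hash 1
a.h = 2    # the hash changes while a is still a key in q
q[a] = 2   # looked up under hash 2, not found, so a is added again

print(q)

# True:
print(len(set(q.keys())) != len(q.keys()))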
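To see what the dict looks like afterwards, the sketch below repeats the same setup and then pokes at it. The variable names keys, same_object, value_now and value_restored are mine, and the behaviour described is what I would expect on CPython with these particular hash values; treat it as an illustration rather than a guarantee.

```python
class Test(object):
    def __init__(self, h):
        self.h = h
    def __hash__(self):
        return self.h

a = Test(1)
q = {}
q[a] = 1
a.h = 2
q[a] = 2

# The dict holds two entries, and both keys are the very same object.
keys = list(q.keys())
same_object = keys[0] is keys[1]

# With the hash at 2, a lookup reaches the second entry.
value_now = q[a]

# Putting the hash back to 1 makes the first entry reachable again.
a.h = 1
value_restored = q[a]

print(same_object)
print(value_now)
print(value_restored)
```

So the first entry is not lost; it is just unreachable for as long as the key's current hash differs from the one the dict recorded at insertion time.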

The underlying code for dictionaries and sets is substantially the same, so you can usually expect that len(set(d.keys())) == len(d.keys()) is an invariant.

That said, both sets and dicts depend on __eq__ and __hash__ to identify unique values and to organize them for efficient search. So, if those return inconsistent results (or violate the rule that "a == b implies hash(a) == hash(b)"), then there is no way to enforce the invariant:

>>> from random import randrange
>>> class A():
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        return bool(randrange(2))
    def __hash__(self):
        return randrange(8)
    def __repr__(self):
        return '|%d|' % self.x


>>> s = [A(i) for i in range(100)]
>>> d = dict.fromkeys(s)
>>> len(d.keys())
29
>>> len(set(d.keys()))
12
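For contrast, here is a sketch of a well-behaved counterpart. The class B is a hypothetical variant of A in which __eq__ and __hash__ are both derived from the same attribute x, so equal objects always hash alike. With that in place the invariant should hold.

```python
class B(object):
    """Hypothetical counterpart to A with consistent __eq__ and __hash__."""
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        return isinstance(other, B) and self.x == other.x
    def __ne__(self, other):
        # Needed on Python 2, harmless on Python 3.
        return not self == other
    def __hash__(self):
        # Derived from the same attribute that __eq__ compares.
        return hash(self.x)
    def __repr__(self):
        return '|%d|' % self.x

# 100 objects, but only 10 distinct values of x.
s = [B(i % 10) for i in range(100)]
d = dict.fromkeys(s)

print(len(d.keys()))
print(len(set(d.keys())))
```

Both lengths come out the same here because duplicates collapse identically in the dict and in the set. This only stays true as long as x is not changed while the objects are keys.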
