繁体   English   中英

“冻结的字典”会是什么?

[英]What would a "frozen dict" be?

  • 冻结集是冻结集。
  • 冻结列表可能是一个元组。
  • 冻结的字典会是什么? 一个不可变的、可散列的字典。

我想它可能类似于collections.namedtuple ,但这更像是一个冻结键字典(一个半冻结的字典)。 不是吗?

“frozendict”应该是一个冻结的字典,它应该有keysvaluesget等,并支持infor等。

更新 :
* 它是: https ://www.python.org/dev/peps/pep-0603

Python 没有内置的 frozendict 类型。 事实证明这不会经常有用(尽管它仍然可能比frozenset更经常有用)。

需要这种类型的最常见原因是在记忆函数调用具有未知参数的函数时。 存储 dict 的可散列等效项(其中值是可散列的)的最常见解决方案类似于tuple(sorted(kwargs.items()))

这取决于排序是否有点疯狂。 Python 不能肯定地承诺排序会在这里产生一些合理的结果。 (但它不能承诺太多其他东西,所以不要太多汗水。)


你可以很容易地制作某种类似于 dict 的包装器。 它可能看起来像

import collections

class FrozenDict(collections.Mapping):
    """Don't forget the docstrings!!"""
    
    def __init__(self, *args, **kwargs):
        self._d = dict(*args, **kwargs)
        self._hash = None

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

    def __getitem__(self, key):
        return self._d[key]

    def __hash__(self):
        # It would have been simpler and maybe more obvious to 
        # use hash(tuple(sorted(self._d.iteritems()))) from this discussion
        # so far, but this solution is O(n). I don't know what kind of 
        # n we are going to run into, but sometimes it's hard to resist the 
        # urge to optimize when it will gain improved algorithmic performance.
        if self._hash is None:
            hash_ = 0
            for pair in self.items():
                hash_ ^= hash(pair)
            self._hash = hash_
        return self._hash

它应该很好用:

>>> x = FrozenDict(a=1, b=2)
>>> y = FrozenDict(a=1, b=2)
>>> x is y
False
>>> x == y
True
>>> x == {'a': 1, 'b': 2}
True
>>> d = {x: 'foo'}
>>> d[y]
'foo'

奇怪的是,虽然我们有很少有用的frozenset ,但仍然没有冻结映射。 这个想法在PEP 416 中被拒绝 - 添加一个 frozendict 内置类型 这个想法可能会在以后的 Python 版本中重新讨论,请参阅PEP 603 - 向集合添加 freezemap 类型

所以 Python 2 解决方案:

def foo(config={'a': 1}):
    ...

似乎仍然是通常的:

def foo(config=None):
    if config is None:
        config = {'a': 1}  # default config
    ...

在 Python 3 中,您可以选择

from types import MappingProxyType

default_config = {'a': 1}
DEFAULTS = MappingProxyType(default_config)

def foo(config=DEFAULTS):
    ...

现在默认配置可以动态更新,但在您希望它通过代理传递而不可变的地方保持不变。

因此, default_config中的更改将按预期更新DEFAULTS ,但您不能写入映射代理对象本身。

诚然,它与“不可变的、可散列的 dict”并不是一回事,但它可能是 freezeddict 某些用例的不错替代品。

假设字典的键和值本身是不可变的(例如字符串),那么:

>>> d
{'forever': 'atones', 'minks': 'cards', 'overhands': 'warranted', 
 'hardhearted': 'tartly', 'gradations': 'snorkeled'}
>>> t = tuple((k, d[k]) for k in sorted(d.keys()))
>>> hash(t)
1524953596

没有fronzedict ,但您可以使用通过 Python 3.3 添加到标准库中的MappingProxyType

>>> from types import MappingProxyType
>>> foo = MappingProxyType({'a': 1})
>>> foo
mappingproxy({'a': 1})
>>> foo['a'] = 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'mappingproxy' object does not support item assignment
>>> foo
mappingproxy({'a': 1})

每次编写这样的函数时,我都会想到frozendict:

def do_something(blah, optional_dict_parm=None):
    if optional_dict_parm is None:
        optional_dict_parm = {}

这是我一直在使用的代码。 我将frozenset子类化。 这样做的好处如下。

  1. 这是一个真正不可变的对象。 不依赖未来用户和开发人员的良好行为。
  2. 在常规字典和冻结字典之间来回转换很容易。 FrozenDict(orig_dict) --> 冻结字典。 dict(frozen_dict) --> 普通字典。

2015 年 1 月 21 日更新:我在 2014 年发布的原始代码使用 for 循环来查找匹配的键。 那速度非常慢。 现在我整理了一个利用frozenset 的散列特性的实现。 键值对存储在特殊容器中,其中__hash____eq__函数仅基于键。 与我在 2014 年 8 月在这里发布的代码不同,这段代码也经过了正式的单元测试。

MIT 风格的许可证。

if 3 / 2 == 1:
    version = 2
elif 3 / 2 == 1.5:
    version = 3

def col(i):
    ''' For binding named attributes to spots inside subclasses of tuple.'''
    g = tuple.__getitem__
    @property
    def _col(self):
        return g(self,i)
    return _col

class Item(tuple):
    ''' Designed for storing key-value pairs inside
        a FrozenDict, which itself is a subclass of frozenset.
        The __hash__ is overloaded to return the hash of only the key.
        __eq__ is overloaded so that normally it only checks whether the Item's
        key is equal to the other object, HOWEVER, if the other object itself
        is an instance of Item, it checks BOTH the key and value for equality.

        WARNING: Do not use this class for any purpose other than to contain
        key value pairs inside FrozenDict!!!!

        The __eq__ operator is overloaded in such a way that it violates a
        fundamental property of mathematics. That property, which says that
        a == b and b == c implies a == c, does not hold for this object.
        Here's a demonstration:
            [in]  >>> x = Item(('a',4))
            [in]  >>> y = Item(('a',5))
            [in]  >>> hash('a')
            [out] >>> 194817700
            [in]  >>> hash(x)
            [out] >>> 194817700
            [in]  >>> hash(y)
            [out] >>> 194817700
            [in]  >>> 'a' == x
            [out] >>> True
            [in]  >>> 'a' == y
            [out] >>> True
            [in]  >>> x == y
            [out] >>> False
    '''

    __slots__ = ()
    key, value = col(0), col(1)
    def __hash__(self):
        return hash(self.key)
    def __eq__(self, other):
        if isinstance(other, Item):
            return tuple.__eq__(self, other)
        return self.key == other
    def __ne__(self, other):
        return not self.__eq__(other)
    def __str__(self):
        return '%r: %r' % self
    def __repr__(self):
        return 'Item((%r, %r))' % self

class FrozenDict(frozenset):
    ''' Behaves in most ways like a regular dictionary, except that it's immutable.
        It differs from other implementations because it doesn't subclass "dict".
        Instead it subclasses "frozenset" which guarantees immutability.
        FrozenDict instances are created with the same arguments used to initialize
        regular dictionaries, and has all the same methods.
            [in]  >>> f = FrozenDict(x=3,y=4,z=5)
            [in]  >>> f['x']
            [out] >>> 3
            [in]  >>> f['a'] = 0
            [out] >>> TypeError: 'FrozenDict' object does not support item assignment

        FrozenDict can accept un-hashable values, but FrozenDict is only hashable if its values are hashable.
            [in]  >>> f = FrozenDict(x=3,y=4,z=5)
            [in]  >>> hash(f)
            [out] >>> 646626455
            [in]  >>> g = FrozenDict(x=3,y=4,z=[])
            [in]  >>> hash(g)
            [out] >>> TypeError: unhashable type: 'list'

        FrozenDict interacts with dictionary objects as though it were a dict itself.
            [in]  >>> original = dict(x=3,y=4,z=5)
            [in]  >>> frozen = FrozenDict(x=3,y=4,z=5)
            [in]  >>> original == frozen
            [out] >>> True

        FrozenDict supports bi-directional conversions with regular dictionaries.
            [in]  >>> original = {'x': 3, 'y': 4, 'z': 5}
            [in]  >>> FrozenDict(original)
            [out] >>> FrozenDict({'x': 3, 'y': 4, 'z': 5})
            [in]  >>> dict(FrozenDict(original))
            [out] >>> {'x': 3, 'y': 4, 'z': 5}   '''

    __slots__ = ()
    def __new__(cls, orig={}, **kw):
        if kw:
            d = dict(orig, **kw)
            items = map(Item, d.items())
        else:
            try:
                items = map(Item, orig.items())
            except AttributeError:
                items = map(Item, orig)
        return frozenset.__new__(cls, items)

    def __repr__(self):
        cls = self.__class__.__name__
        items = frozenset.__iter__(self)
        _repr = ', '.join(map(str,items))
        return '%s({%s})' % (cls, _repr)

    def __getitem__(self, key):
        if key not in self:
            raise KeyError(key)
        diff = self.difference
        item = diff(diff({key}))
        key, value = set(item).pop()
        return value

    def get(self, key, default=None):
        if key not in self:
            return default
        return self[key]

    def __iter__(self):
        items = frozenset.__iter__(self)
        return map(lambda i: i.key, items)

    def keys(self):
        items = frozenset.__iter__(self)
        return map(lambda i: i.key, items)

    def values(self):
        items = frozenset.__iter__(self)
        return map(lambda i: i.value, items)

    def items(self):
        items = frozenset.__iter__(self)
        return map(tuple, items)

    def copy(self):
        cls = self.__class__
        items = frozenset.copy(self)
        dupl = frozenset.__new__(cls, items)
        return dupl

    @classmethod
    def fromkeys(cls, keys, value):
        d = dict.fromkeys(keys,value)
        return cls(d)

    def __hash__(self):
        kv = tuple.__hash__
        items = frozenset.__iter__(self)
        return hash(frozenset(map(kv, items)))

    def __eq__(self, other):
        if not isinstance(other, FrozenDict):
            try:
                other = FrozenDict(other)
            except Exception:
                return False
        return frozenset.__eq__(self, other)

    def __ne__(self, other):
        return not self.__eq__(other)


if version == 2:
    #Here are the Python2 modifications
    class Python2(FrozenDict):
        def __iter__(self):
            items = frozenset.__iter__(self)
            for i in items:
                yield i.key

        def iterkeys(self):
            items = frozenset.__iter__(self)
            for i in items:
                yield i.key

        def itervalues(self):
            items = frozenset.__iter__(self)
            for i in items:
                yield i.value

        def iteritems(self):
            items = frozenset.__iter__(self)
            for i in items:
                yield (i.key, i.value)

        def has_key(self, key):
            return key in self

        def viewkeys(self):
            return dict(self).viewkeys()

        def viewvalues(self):
            return dict(self).viewvalues()

        def viewitems(self):
            return dict(self).viewitems()

    #If this is Python2, rebuild the class
    #from scratch rather than use a subclass
    py3 = FrozenDict.__dict__
    py3 = {k: py3[k] for k in py3}
    py2 = {}
    py2.update(py3)
    dct = Python2.__dict__
    py2.update({k: dct[k] for k in dct})

    FrozenDict = type('FrozenDict', (frozenset,), py2)

安装frozendict

pip install frozendict

用它!

from frozendict import frozendict

def smth(param = frozendict({})):
    pass

您可以使用frozendict包中的utilspie作为:

>>> from utilspie.collectionsutils import frozendict

>>> my_dict = frozendict({1: 3, 4: 5})
>>> my_dict  # object of `frozendict` type
frozendict({1: 3, 4: 5})

# Hashable
>>> {my_dict: 4}
{frozendict({1: 3, 4: 5}): 4}

# Immutable
>>> my_dict[1] = 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mquadri/workspace/utilspie/utilspie/collectionsutils/collections_utils.py", line 44, in __setitem__
    self.__setitem__.__name__, type(self).__name__))
AttributeError: You can not call '__setitem__()' for 'frozendict' object

根据文件

freezedict(dict_obj) :接受 dict 类型的 obj 并返回一个可散列且不可变的 dict

是的,这是我的第二个答案,但这是一种完全不同的方法。 第一个实现是纯python。 这个在 Cython。 如果您知道如何使用和编译 Cython 模块,这与普通字典一样快。 检索单个值大约需要 0.04 到 0.06 微秒。

这是文件“frozen_dict.pyx”

import cython
from collections import Mapping

cdef class dict_wrapper:
    cdef object d
    cdef int h

    def __init__(self, *args, **kw):
        self.d = dict(*args, **kw)
        self.h = -1

    def __len__(self):
        return len(self.d)

    def __iter__(self):
        return iter(self.d)

    def __getitem__(self, key):
        return self.d[key]

    def __hash__(self):
        if self.h == -1:
            self.h = hash(frozenset(self.d.iteritems()))
        return self.h

class FrozenDict(dict_wrapper, Mapping):
    def __repr__(self):
        c = type(self).__name__
        r = ', '.join('%r: %r' % (k,self[k]) for k in self)
        return '%s({%s})' % (c, r)

__all__ = ['FrozenDict']

这是文件“setup.py”

from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize('frozen_dict.pyx')
)

如果您安装了 Cython,请将上面的两个文件保存到同一目录中。 在命令行中移动到该目录。

python setup.py build_ext --inplace
python setup.py install

你应该完成了。

namedtuple的主要缺点是在使用前需要指定,所以对于单用例来说不太方便。

但是,有一种实用的解决方法可用于处理许多此类情况。 假设您希望拥有以下 dict 的不可变等价物:

MY_CONSTANT = {
    'something': 123,
    'something_else': 456
}

可以这样模拟:

from collections import namedtuple

MY_CONSTANT = namedtuple('MyConstant', 'something something_else')(123, 456)

甚至可以编写一个辅助函数来自动执行此操作:

def freeze_dict(data):
    from collections import namedtuple
    keys = sorted(data.keys())
    frozen_type = namedtuple(''.join(keys), keys)
    return frozen_type(**data)

a = {'foo':'bar', 'x':'y'}
fa = freeze_dict(data)
assert a['foo'] == fa.foo

当然,这只适用于平面字典,但实现递归版本应该不会太难。

子类dict

我在野外看到了这种模式(github)并想提一下:

class FrozenDict(dict):
    def __init__(self, *args, **kwargs):
        self._hash = None
        super(FrozenDict, self).__init__(*args, **kwargs)

    def __hash__(self):
        if self._hash is None:
            self._hash = hash(tuple(sorted(self.items())))  # iteritems() on py2
        return self._hash

    def _immutable(self, *args, **kws):
        raise TypeError('cannot change object - object is immutable')

    # makes (deep)copy alot more efficient
    def __copy__(self):
        return self

    def __deepcopy__(self, memo=None):
        if memo is not None:
            memo[id(self)] = self
        return self

    __setitem__ = _immutable
    __delitem__ = _immutable
    pop = _immutable
    popitem = _immutable
    clear = _immutable
    update = _immutable
    setdefault = _immutable

示例用法:

d1 = FrozenDict({'a': 1, 'b': 2})
d2 = FrozenDict({'a': 1, 'b': 2})
d1.keys() 
assert isinstance(d1, dict)
assert len(set([d1, d2])) == 1  # hashable

优点

  • 支持get()keys()items()iteritems() )和所有来自dict的好东西,开箱即用,无需显式实现它们
  • 使用内部dict这意味着性能( dict在 CPython 中用 c 编写)
  • 优雅简约没有黑魔法
  • isinstance(my_frozen_dict, dict)返回 True - 尽管 python 鼓励使用isinstance()对许多包进行鸭式输入,但这可以节省许多调整和自定义

缺点

  • 任何子类都可以覆盖它或在内部访问它(你不能真正 100% 保护 python 中的某些东西,你应该信任你的用户并提供良好的文档)。
  • 如果你关心速度,你可能想让__hash__快一点。

另一个选项是multidict包中的MultiDictProxy类。

您可以使用一个枚举:

import enum

KeyDict1 = enum.Enum('KeyDict1', {'InnerDictKey1':'bla', 'InnerDictKey2 ':2})

d = { KeyDict1: 'whatever', KeyDict2: 1, ...}

您可以像访问字典一样访问枚举:

KeyDict1['InnerDictKey2'].value  # This is 2

您可以遍历名称,并获取其值...它可以完成您期望的一切。

freeze实现了可散列、类型提示的冻结集合(dict、list 和 set),并将递归地冻结您为您提供的数据(如果可能)。

pip install frz

用法:

from freeze import FDict

a_mutable_dict = {
    "list": [1, 2],
    "set": {3, 4},
}

a_frozen_dict = FDict(a_mutable_dict)

print(repr(a_frozen_dict)) 
# FDict: {'list': FList: (1, 2), 'set': FSet: {3, 4}}

在没有母语支持的情况下,您可以自己做,也可以使用现有的解决方案。 幸运的是,Python 让扩展其基本实现变得非常简单。

class frozen_dict(dict):
    def __setitem__(self, key, value):
        raise Exception('Frozen dictionaries cannot be mutated')

frozen_dict = frozen_dict({'foo': 'FOO' })
print(frozen['foo']) # FOO
frozen['foo'] = 'NEWFOO' # Exception: Frozen dictionaries cannot be mutated

# OR

from types import MappingProxyType

frozen_dict = MappingProxyType({'foo': 'FOO'})
print(frozen_dict['foo']) # FOO
frozen_dict['foo'] = 'NEWFOO' # TypeError: 'mappingproxy' object does not support item assignment

我需要在某个时刻访问固定键,以获得某种全局恒定的东西,我决定这样做:

class MyFrozenDict:
    def __getitem__(self, key):
        if key == 'mykey1':
            return 0
        if key == 'mykey2':
            return "another value"
        raise KeyError(key)

像这样使用它

a = MyFrozenDict()
print(a['mykey1'])

警告:对于大多数用例,我不建议这样做,因为它会带来一些非常严重的权衡。

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM