How to implement insert for OrderedDict in Python 3

I want to insert an item into an OrderedDict at a certain position. Using the gist of this SO answer, I have the problem that it doesn't work on Python 3.

This is the implementation used:

from collections import OrderedDict

class ListDict(OrderedDict):

    def __init__(self, *args, **kwargs):
        super(ListDict, self).__init__(*args, **kwargs)

    def __insertion(self, link_prev, key_value):
        key, value = key_value
        if link_prev[2] != key:
            if key in self:
                del self[key]
            link_next = link_prev[1]
            self._OrderedDict__map[key] = link_prev[1] = link_next[0] = [link_prev, link_next, key]
        dict.__setitem__(self, key, value)

    def insert_after(self, existing_key, key_value):
        self.__insertion(self._OrderedDict__map[existing_key], key_value)

    def insert_before(self, existing_key, key_value):
        self.__insertion(self._OrderedDict__map[existing_key][0], key_value)

Using it like this:

ld = ListDict([(1,1), (2,2), (3,3)])
ld.insert_before(2, (1.5, 1.5))

gives

File "...", line 35, in insert_before
    self.__insertion(self._OrderedDict__map[existing_key][0], key_value)
AttributeError: 'ListDict' object has no attribute '_OrderedDict__map'

It works with Python 2.7. What is the reason that it fails in Python 3? Checking the source code of the OrderedDict implementation shows that self.__map is used instead of self._OrderedDict__map. Changing the code to use self.__map gives

AttributeError: 'ListDict' object has no attribute '_ListDict__map'

How come? And how can I make this work in Python 3? OrderedDict uses the internal __map attribute to store a doubly linked list. So how can I access this attribute properly?
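The second error is consistent with Python's name mangling: a name of the form __map written inside a class body is rewritten to include the name of the class it is written in, so inside ListDict it becomes _ListDict__map. The first error most likely comes from the C implementation of OrderedDict in newer CPython versions, which does not keep its linked list in a Python-level attribute at all. The sketch below only demonstrates the mangling rule; Base and Child are hypothetical classes, not part of collections.

```python
class Base:
    def __init__(self):
        # Inside Base, __map is compiled to _Base__map
        self.__map = {"a": 1}


class Child(Base):
    def read_mangled(self):
        # Inside Child, the same spelling is compiled to _Child__map,
        # which was never set, so this raises AttributeError
        return self.__map

    def read_explicit(self):
        # Spelling out the mangled name of the defining class works,
        # but only while that class really stores the attribute
        return self._Base__map


child = Child()
print(child.read_explicit())
try:
    child.read_mangled()
except AttributeError as exc:
    print(exc)
```

This is why switching to self.__map inside ListDict cannot help: the lookup goes to _ListDict__map, an attribute nothing ever created.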

I'm not sure you wouldn't be better served just keeping up with a separate list and dict in your code, but here is a stab at a pure Python implementation of such an object. This will be an order of magnitude slower than an actual OrderedDict in Python 3.5, which, as I pointed out in my comment, has been rewritten in C.

"""
A list/dict hybrid; like OrderedDict with insert_before and insert_after
"""
import collections.abc


class MutableOrderingDict(collections.abc.MutableMapping):
    def __init__(self, iterable_or_mapping=None, **kw):
        # This mimics dict's initialization and accepts the same arguments
        # Of course, you have to pass an ordered iterable or mapping unless you
        # want the order to be arbitrary. Garbage in, garbage out and all :)
        self.__data = {}
        self.__keys = []
        if iterable_or_mapping is not None:
            try:
                iterable = iterable_or_mapping.items()
            except AttributeError:
                iterable = iterable_or_mapping
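            for key, value in iterable:
                # Record each key's position only once, even if it repeats
                if key not in self.__data:
                    self.__keys.append(key)
                self.__data[key] = value
        for key, value in kw.items():
            if key not in self.__data:
                self.__keys.append(key)
            self.__data[key] = value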

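    def insert_before(self, key, new_key, value):
        # Refuse a key that is already present; inserting it again would
        # leave a duplicate entry in the ordering list
        if new_key in self.__data:
            raise KeyError('%r already exists' % (new_key,))
        try:
            self.__keys.insert(self.__keys.index(key), new_key)
        except ValueError:
            raise KeyError(key) from None
        else:
            self.__data[new_key] = value

    def insert_after(self, key, new_key, value):
        if new_key in self.__data:
            raise KeyError('%r already exists' % (new_key,))
        try:
            self.__keys.insert(self.__keys.index(key) + 1, new_key)
        except ValueError:
            raise KeyError(key) from None
        else:
            self.__data[new_key] = value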

    def __getitem__(self, key):
        return self.__data[key]

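    def __setitem__(self, key, value):
        # Only record the position of keys that are new, so reassigning
        # an existing key does not duplicate it in the ordering
        if key not in self.__data:
            self.__keys.append(key)
        self.__data[key] = value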

    def __delitem__(self, key):
        del self.__data[key]
        self.__keys.remove(key)

    def __iter__(self):
        return iter(self.__keys)

    def __len__(self):
        return len(self.__keys)

    def __contains__(self, key):
        # The dict lookup is O(1); scanning the key list would be O(n)
        return key in self.__data

    def __eq__(self, other):
        try:
            return (self.__data == dict(other.items()) and
                    self.__keys == list(other.keys()))
        except AttributeError:
            return False

    def keys(self):
        for key in self.__keys:
            yield key

    def items(self):
        for key in self.__keys:
            yield key, self.__data[key]

    def values(self):
        for key in self.__keys:
            yield self.__data[key]

    def get(self, key, default=None):
        try:
            return self.__data[key]
        except KeyError:
            return default

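    def pop(self, key, default=None):
        # Return default for a missing key instead of failing on the delete
        if key not in self.__data:
            return default
        value = self.__data[key]
        self.__delitem__(key)
        return value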

    def popitem(self):
        # Remove the last key and return it with its value, like dict.popitem
        try:
            key = self.__keys.pop()
        except IndexError:
            raise KeyError('%s is empty' % self.__class__.__name__) from None
        return key, self.__data.pop(key)

    def clear(self):
        self.__keys = []
        self.__data = {}

    def update(self, mapping):
        for key, value in mapping.items():
            # Keep the position of keys that already exist
            if key not in self.__data:
                self.__keys.append(key)
            self.__data[key] = value

    def setdefault(self, key, default):
        try:
            return self[key]
        except KeyError:
            self[key] = default
            return self[key]

    def __repr__(self):
        return 'MutableOrderingDict(%s)' % ', '.join(('%r: %r' % (k, v)
                                                      for k, v in self.items()))

I ended up implementing the whole collections.abc.MutableMapping contract because none of the methods were very long, but you probably won't use all of them. In particular, __eq__ and popitem are a little arbitrary. I changed your signature on the insert_* methods to a 4-argument one that feels a little more natural to me. Final note: Only tested on Python 3.5. Certainly will not work on Python 2 without some (minor) changes.
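For comparison, the "separate list and dict" route mentioned at the top of this answer could look roughly like the sketch below. The free function and its argument names are made up for illustration, and rejecting a key that already exists is an assumption about the desired behavior, not something the question specifies.

```python
def insert_before(keys, data, existing_key, new_key, value):
    """Insert new_key before existing_key; keys holds the order, data the values."""
    if new_key in data:
        # Assumption: treat an already-present key as an error
        raise KeyError("%r already exists" % (new_key,))
    try:
        position = keys.index(existing_key)
    except ValueError:
        raise KeyError(existing_key) from None
    keys.insert(position, new_key)
    data[new_key] = value


keys = [1, 2, 3]
data = {1: 1, 2: 2, 3: 3}
insert_before(keys, data, 2, 1.5, 1.5)
print([(k, data[k]) for k in keys])  # [(1, 1), (1.5, 1.5), (2, 2), (3, 3)]
```

The cost is the same as in the class: a linear scan for the position plus a linear list insert.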

I was trying out the new dict object in 3.7 and thought I'd implement what Two-Bit Alchemist did in his answer, but by just subclassing the native dict class, because in 3.7 dicts are ordered.

''' Script that extends python3.7 dictionary to include insert_before and insert_after methods. '''


class MutableDict(dict):
    ''' Class that extends python3.7 dictionary to include insert_before and insert_after methods. '''

    def insert_before(self, key, newKey, val):
        ''' Insert newKey:value into dict before key'''
        keys = list(self.keys())
        vals = list(self.values())

        try:
            insertAt = keys.index(key)
        except ValueError:
            # Raise instead of exiting the interpreter when key is missing
            raise KeyError(key) from None

        keys.insert(insertAt, newKey)
        vals.insert(insertAt, val)

        # Rebuild the dict so that the items follow the new order
        self.clear()
        self.update(zip(keys, vals))

    def insert_after(self, key, newKey, val):
        ''' Insert newKey:value into dict after key'''
        keys = list(self.keys())
        vals = list(self.values())

        try:
            insertAt = keys.index(key) + 1
        except ValueError:
            raise KeyError(key) from None

        if keys[-1] != key:
            keys.insert(insertAt, newKey)
            vals.insert(insertAt, val)
            self.clear()
            self.update(zip(keys, vals))
        else:
            # key is the last item, so a plain update appends after it
            self.update({newKey: val})

A little testing:

 In: v = MutableDict([('a', 1), ('b', 2), ('c', 3)])
Out: {'a': 1, 'b': 2, 'c': 3}

 In: v.insert_before('a', 'g', 5)
Out: {'g': 5, 'a': 1, 'b': 2, 'c': 3}

 In: v.insert_after('b', 't', 5)
Out: {'g': 5, 'a': 1, 'b': 2, 't': 5, 'c': 3}
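If subclassing dict is not needed, the same rebuild-in-order idea can be sketched as a plain function that returns a new dict. The name inserted_before is made up for this example, and it relies on the insertion-order guarantee of Python 3.7+ dicts; an existing new_key is moved to the new position rather than rejected, which is an assumption about the desired behavior.

```python
def inserted_before(mapping, key, new_key, value):
    """Return a new dict with new_key: value placed just before key."""
    if key not in mapping:
        raise KeyError(key)
    result = {}
    for k, v in mapping.items():
        if k == key:
            # Place the new item right before the anchor key
            result[new_key] = value
        if k != new_key:
            # Skip any old entry for new_key so it only appears once
            result[k] = v
    return result


v = {'a': 1, 'b': 2, 'c': 3}
print(inserted_before(v, 'a', 'g', 5))  # {'g': 5, 'a': 1, 'b': 2, 'c': 3}
```

Unlike the methods above, this leaves the original dict untouched.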

Edit: I decided to do a little benchmark to see what kind of performance hit this would take. I will use from timeit import timeit.

Get a baseline. Create a dict with arbitrary values.

 In: timeit('{x: ord(x) for x in string.ascii_lowercase[:27]}', setup='import string', number=1000000)
Out: 1.8214202160015702

See how much longer it would take to initialize the MutableDict with the same arbitrary values as before.

 In: timeit('MD({x: ord(x) for x in string.ascii_lowercase[:27]})', setup='import string; from MutableDict import MutableDict as MD', number=1000000)
Out: 2.382507269998314

2.38 / 1.82 ≈ 1.31, so if I'm thinking about this right, creating a MutableDict takes about 31% longer than creating a plain dict.

Let's see how long it takes to do an insert. For this test I'll use the insert_after method as it is slightly bigger. I will also look for a key close to the end for insertion, 't' in this case.

 In: timeit('v.insert_after("t", "zzrr", ord("z"))', setup='import string; from MutableDict import MutableDict as MD; v = MD({x: ord(x) for x in string.ascii_lowercase[:27]})' ,number=1000000)
Out: 3.9161406760104

3.92 / 2.38 ≈ 1.64, so a single insert_after takes about 64% longer than initialization. Not bad on a small test of 1 million loops. For a sense of scale, we'll time this:

 In: timeit('"-".join(map(str, range(100)))', number=1000000)
Out: 10.342204540997045

Not quite an apples-to-apples comparison, but I hope these tests will aid you (the reader, not necessarily the OP) in deciding whether to use this class in your 3.7 projects.

Since Python 3.2, move_to_end can be used to move items around in an OrderedDict. The following code implements the insert functionality by moving all items after the provided index to the end.

Note that this isn't very efficient and should be used sparingly (if at all).

def ordered_dict_insert(ordered_dict, index, key, value):
    if key in ordered_dict:
        raise KeyError("Key already exists")
    if index < 0 or index > len(ordered_dict):
        raise IndexError("Index out of range")

    keys = list(ordered_dict.keys())[index:]
    ordered_dict[key] = value
    for k in keys:
        ordered_dict.move_to_end(k)

There are obvious optimizations and improvements that could be made, but that's the general idea.
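A quick check of the move_to_end approach might look like this. The function is repeated (with the temporary list renamed to tail) only so the snippet runs on its own; the sample data is made up.

```python
from collections import OrderedDict


def ordered_dict_insert(ordered_dict, index, key, value):
    # Same approach as above, repeated so this snippet runs on its own
    if key in ordered_dict:
        raise KeyError("Key already exists")
    if index < 0 or index > len(ordered_dict):
        raise IndexError("Index out of range")
    # Keys that must end up after the new item
    tail = list(ordered_dict.keys())[index:]
    ordered_dict[key] = value
    for k in tail:
        ordered_dict.move_to_end(k)


od = OrderedDict([("a", 1), ("b", 2), ("d", 4)])
ordered_dict_insert(od, 2, "c", 3)
print(list(od.items()))  # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
```

Each call moves every item after the insertion point, so inserting near the front of a large OrderedDict touches almost all of its keys.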

from collections import OrderedDict

od1 = OrderedDict([
    ('a', 1),
    ('b', 2),
    ('d', 4),
])

items = list(od1.items())  # items() returns a view in Python 3, so copy it into a list
items.insert(2, ('c', 3))
od2 = OrderedDict(items)

print(od2)  # OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
