
Accessing items in a collections.OrderedDict by index

Let's say I have the following code:

import collections
d = collections.OrderedDict()
d['foo'] = 'python'
d['bar'] = 'spam'

Is there a way I can access the items in a numbered manner, like:

d(0) #foo's Output
d(1) #bar's Output

If it's an OrderedDict(), you can access elements by position by indexing the list of (key, value) tuples that items() returns in Python 2:

>>> import collections
>>> d = collections.OrderedDict()
>>> d['foo'] = 'python'
>>> d['bar'] = 'spam'
>>> d.items()
[('foo', 'python'), ('bar', 'spam')]
>>> d.items()[0]
('foo', 'python')
>>> d.items()[1]
('bar', 'spam')

Note for Python 3.X

In Python 3, dict.items returns an iterable dict view object rather than a list. We need to wrap the call in list() to make indexing possible:

>>> items = list(d.items())
>>> items
[('foo', 'python'), ('bar', 'spam')]
>>> items[0]
('foo', 'python')
>>> items[1]
('bar', 'spam')

Do you have to use an OrderedDict, or do you specifically want a map-like type that is ordered in some way and supports fast positional indexing? If the latter, consider one of the many sorted dict types available for Python, which order key-value pairs by key sort order. Some implementations also support fast indexing. For example, the sortedcontainers project has a SortedDict type for just this purpose.

>>> from sortedcontainers import SortedDict
>>> sd = SortedDict()
>>> sd['foo'] = 'python'
>>> sd['bar'] = 'spam'
>>> sd.iloc[0] # Note that 'bar' comes before 'foo' in sort order.
'bar'
>>> # If you want the value, then simply do a key lookup:
>>> sd[sd.iloc[1]]
'python'

Here is a special case for when you want the first entry (or one close to it) in an OrderedDict without creating a list. (This has been updated for Python 3):

>>> from collections import OrderedDict
>>> 
>>> d = OrderedDict()
>>> d["foo"] = "one"
>>> d["bar"] = "two"
>>> d["baz"] = "three"
>>> next(iter(d.items()))
('foo', 'one')
>>> next(iter(d.values()))
'one'

(The first time you say "next()", it really means "first.")

In my informal test, next(iter(d.items())) with a small OrderedDict is only a tiny bit faster than items()[0]. With an OrderedDict of 10,000 entries, next(iter(d.items())) was about 200 times faster than items()[0].

But if you save the items() list once and then use the list a lot, that could be faster. Or if you repeatedly create an items() iterator and step through it to the position you want, that could be slower.
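To make this trade-off concrete, here is a rough sketch (not the original author's benchmark) that times the approaches with the standard timeit module. The helper names first_via_list, first_via_iter and nth_via_iter, and the size of 10,000, are assumptions chosen for illustration; absolute numbers will vary by machine and Python version.

```python
from collections import OrderedDict
from itertools import islice
import timeit

# Hypothetical test data: 10,000 integer keys mapped to themselves.
d = OrderedDict((i, i) for i in range(10000))

def first_via_list(od):
    # Builds a full list of items, then indexes it: O(n) work per call.
    return list(od.items())[0]

def first_via_iter(od):
    # Creates an iterator and takes a single step: O(1) work per call.
    return next(iter(od.items()))

def nth_via_iter(od, n):
    # Steps an iterator forward n times: O(n) work, but no list is built.
    return next(islice(od.items(), n, None))

# Build the list once and reuse it when many lookups are needed.
cached_items = list(d.items())

for fn in (first_via_list, first_via_iter):
    seconds = timeit.timeit(lambda: fn(d), number=1000)
    print("%s: %.6f s for 1000 calls" % (fn.__name__, seconds))

seconds = timeit.timeit(lambda: nth_via_iter(d, 5000), number=1000)
print("nth_via_iter(5000): %.6f s for 1000 calls" % seconds)

seconds = timeit.timeit(lambda: cached_items[5000], number=1000)
print("cached list lookup: %.6f s for 1000 calls" % seconds)
```

The cached list only stays valid while the dictionary is not modified, so it suits read-heavy code.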

It is dramatically more efficient to use IndexedOrderedDict from the indexed package.

Following Niklas's comment, I have done a benchmark on OrderedDict and IndexedOrderedDict with 1000 entries.

In [1]: from numpy import *
In [2]: from indexed import IndexedOrderedDict
In [3]: id=IndexedOrderedDict(zip(arange(1000),random.random(1000)))
In [4]: timeit id.keys()[56]
1000000 loops, best of 3: 969 ns per loop

In [8]: from collections import OrderedDict
In [9]: od=OrderedDict(zip(arange(1000),random.random(1000)))
In [10]: timeit od.keys()[56]
10000 loops, best of 3: 104 µs per loop

IndexedOrderedDict is ~100 times faster at indexing elements at a specific position in this specific case.

The other solutions listed require an extra step. IndexedOrderedDict is a drop-in replacement for OrderedDict, except that it's indexable.

This community wiki attempts to collect existing answers.

Python 2.7

In Python 2, the keys(), values(), and items() methods of OrderedDict return lists. Using values as an example, the simplest way is:

d.values()[0]  # "python"
d.values()[1]  # "spam"

For large collections where you only care about a single index, you can avoid creating the full list by using the iterator versions, iterkeys, itervalues and iteritems:

import itertools
next(itertools.islice(d.itervalues(), 0, 1))  # "python"
next(itertools.islice(d.itervalues(), 1, 2))  # "spam"

The indexed.py package provides IndexedOrderedDict, which is designed for this use case and will be the fastest option.

from indexed import IndexedOrderedDict
d = IndexedOrderedDict([('foo', 'python'), ('bar', 'spam')])  # a list of pairs keeps the order; a Python 2 dict literal would not
d.values()[0]  # "python"
d.values()[1]  # "spam"

Using itervalues can be considerably faster for random access into large dictionaries:

$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 1000;   d = OrderedDict({i:i for i in range(size)})'  'i = randint(0, size-1); d.values()[i:i+1]'
1000 loops, best of 3: 259 usec per loop
$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 10000;  d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i:i+1]'
100 loops, best of 3: 2.3 msec per loop
$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 100000; d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i:i+1]'
10 loops, best of 3: 24.5 msec per loop

$ python2 -m timeit -s 'import itertools; from collections import OrderedDict; from random import randint; size = 1000;   d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); next(itertools.islice(d.itervalues(), i, i+1))'
10000 loops, best of 3: 118 usec per loop
$ python2 -m timeit -s 'import itertools; from collections import OrderedDict; from random import randint; size = 10000;  d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); next(itertools.islice(d.itervalues(), i, i+1))'
1000 loops, best of 3: 1.26 msec per loop
$ python2 -m timeit -s 'import itertools; from collections import OrderedDict; from random import randint; size = 100000; d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); next(itertools.islice(d.itervalues(), i, i+1))'
100 loops, best of 3: 10.9 msec per loop

$ python2 -m timeit -s 'from indexed import IndexedOrderedDict; from random import randint; size = 1000;   d = IndexedOrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i]'
100000 loops, best of 3: 2.19 usec per loop
$ python2 -m timeit -s 'from indexed import IndexedOrderedDict; from random import randint; size = 10000;  d = IndexedOrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i]'
100000 loops, best of 3: 2.24 usec per loop
$ python2 -m timeit -s 'from indexed import IndexedOrderedDict; from random import randint; size = 100000; d = IndexedOrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i]'
100000 loops, best of 3: 2.61 usec per loop

+--------+-----------+----------------+--------------+
|  size  | list (ms) | generator (ms) | indexed (ms) |
+--------+-----------+----------------+--------------+
|   1000 | .259      | .118           | .00219       |
|  10000 | 2.3       | 1.26           | .00224       |
| 100000 | 24.5      | 10.9           | .00261       |
+--------+-----------+----------------+--------------+

Python 3.6

Python 3 has the same two basic options (list vs generator), but the dict methods return lazy view objects rather than lists by default.

List method:

list(d.values())[0]  # "python"
list(d.values())[1]  # "spam"

Generator method:

import itertools
next(itertools.islice(d.values(), 0, 1))  # "python"
next(itertools.islice(d.values(), 1, 2))  # "spam"
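The islice pattern can be wrapped in a small helper so that out-of-range positions fail clearly instead of raising a bare StopIteration. This is only a sketch: the name nth, the _MISSING sentinel and the default parameter are assumptions for illustration, not part of any library mentioned here.

```python
from collections import OrderedDict
from itertools import islice

_MISSING = object()  # sentinel so that None can be a legitimate default

def nth(iterable, n, default=_MISSING):
    """Return the item at position n of iterable without building a list."""
    if n < 0:
        raise ValueError("n must be non-negative; islice cannot count from the end")
    try:
        return next(islice(iterable, n, None))
    except StopIteration:
        if default is _MISSING:
            raise IndexError("position %d is out of range" % n)
        return default

d = OrderedDict([("foo", "python"), ("bar", "spam")])
print(nth(d.values(), 1))       # spam
print(nth(d.items(), 0))        # ('foo', 'python')
print(nth(d, 5, default=None))  # None
```

Each call still walks the iterator from the start, so the cost grows with n; the helper saves memory, not lookup time.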

In these benchmarks, Python 3 dictionaries are roughly an order of magnitude faster than Python 2 and show similar speedups when using generators.

+--------+-----------+----------------+--------------+
|  size  | list (ms) | generator (ms) | indexed (ms) |
+--------+-----------+----------------+--------------+
|   1000 | .0316     | .0165          | .00262       |
|  10000 | .288      | .166           | .00294       |
| 100000 | 3.53      | 1.48           | .00332       |
+--------+-----------+----------------+--------------+

It's a new era, and as of Python 3.6.1 dictionaries retain their insertion order. These semantics aren't explicit because that would require BDFL approval. But Raymond Hettinger is the next best thing (and funnier), and he makes a pretty strong case that dictionaries will be ordered for a very long time.

So now it's easy to create slices of a dictionary:

test_dict = {
    'first':  1,
    'second': 2,
    'third':  3,
    'fourth': 4
}

list(test_dict.items())[:2]

Note: Dictionary insertion-order preservation is now official in Python 3.7.
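For a large dictionary, list(test_dict.items())[:2] copies every item into a list before slicing. A possible alternative, sketched here on the assumption of Python 3.7+ where a plain dict preserves insertion order, is itertools.islice, which stops once it has produced the requested items:

```python
from itertools import islice

test_dict = {"first": 1, "second": 2, "third": 3, "fourth": 4}

# Take the first two (key, value) pairs without copying the whole dict.
first_two = list(islice(test_dict.items(), 2))
print(first_two)  # [('first', 1), ('second', 2)]

# Rebuild a smaller dict from positions 1 and 2 (the second and third items).
middle = dict(islice(test_dict.items(), 1, 3))
print(middle)  # {'second': 2, 'third': 3}
```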

If you have pandas installed, you can convert the ordered dict to a pandas Series. This allows random access to the dictionary elements, by key or by position.

>>> import collections
>>> import pandas as pd
>>> d = collections.OrderedDict()
>>> d['foo'] = 'python'
>>> d['bar'] = 'spam'

>>> s = pd.Series(d)

>>> s['bar']
'spam'
>>> s.iloc[1]
'spam'
>>> s.index[1]
'bar'

For an OrderedDict(), you can access elements by index either by getting the (key, value) tuples with items() or by using values(), wrapping the result in list() on Python 3:

>>> import collections
>>> d = collections.OrderedDict()
>>> d['foo'] = 'python'
>>> d['bar'] = 'spam'
>>> d.items()
odict_items([('foo', 'python'), ('bar', 'spam')])
>>> d.values()
odict_values(['python', 'spam'])
>>> list(d.values())
['python', 'spam']

If you're dealing with a fixed number of keys that you know in advance, use the standard library's namedtuple instead. A possible use case is when you want to store some constant data and access it throughout the program both by index and by key.

import collections
ordered_keys = ['foo', 'bar']
D = collections.namedtuple('D', ordered_keys)
d = D(foo='python', bar='spam')

Access by indexing:

d[0] # result: python
d[1] # result: spam

Access by specifying keys:

d.foo # result: python
d.bar # result: spam

Or better:

getattr(d, 'foo') # result: python
getattr(d, 'bar') # result: spam
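If the data already lives in an OrderedDict, one way to build the namedtuple from it is sketched below. The type name Record is an assumption for illustration, and this only works when every key is a valid Python identifier.

```python
import collections

od = collections.OrderedDict([("foo", "python"), ("bar", "spam")])

# Field order follows the dict's insertion order.
Record = collections.namedtuple("Record", list(od))
record = Record(**od)

print(record[0])         # python
print(record.bar)        # spam
print(record._fields)    # ('foo', 'bar')
print(record._asdict())  # maps field names back to their values
```

Unlike the dict, the resulting tuple is immutable, which fits the constant-data use case described above.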
