Is it bad practice to use collections.OrderedDict?

I like to use collections.OrderedDict sometimes when I need an associative array where the order of the keys should be retained. The best example I have of this is parsing or creating csv files, where it's useful to have the order of columns retained implicitly in the object.

But I'm worried that this is bad practice, since it seems to me that the whole concept of an associative array is that the order of the keys should never matter, and that any operations which rely on ordering should just use lists because that's why lists exist (this can be done for the csv example above). I don't have data on this, but I'm willing to bet that the performance for lists is universally better than OrderedDict.

So my question is: Are there any really compelling use cases for OrderedDict? Is the csv use case a good example of where it should be used or a bad one?

"But I'm worried that this is bad practice, since it seems to me that the whole concept of an associative array is that the order of the keys should never matter,"

Nonsense. That's not the "whole concept of an associative array". It's just that the order rarely matters, so we default to surrendering the order to get a conceptually simpler (and more efficient) data structure.

"and that any operations which rely on ordering should just use lists because that's why lists exist"

Stop it right there! Think a second. How would you use lists? As a list of (key, value) pairs, with unique keys, right? Well, congratulations, my friend, you just re-invented OrderedDict, only with an awful API and really slow lookups. Any conceptual objections to an ordered mapping would apply to this ad hoc data structure as well. Luckily, those objections are nonsense. Ordered mappings are perfectly fine; they're just different from unordered mappings. Giving them an aptly named, dedicated implementation with a good API and good performance improves people's code.
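To make that concrete, here is a minimal sketch of the list-of-pairs approach next to an OrderedDict. The record contents and the helpers pairs_get and pairs_set are made up for illustration; they are not part of any library.

```python
from collections import OrderedDict

# Hypothetical CSV-like record; the column names are illustrative only.
pairs = [("id", 1), ("name", "Ada"), ("email", "ada@example.com")]

def pairs_get(items, key):
    """Look up a key in a list of (key, value) pairs by scanning it: O(n)."""
    for k, v in items:
        if k == key:
            return v
    raise KeyError(key)

def pairs_set(items, key, value):
    """Replace the value of an existing key in place, or append a new pair."""
    for i, (k, _) in enumerate(items):
        if k == key:
            items[i] = (key, value)
            return
    items.append((key, value))

record = OrderedDict(pairs)

# Same result, but the mapping uses a hash lookup and a built-in API.
assert pairs_get(pairs, "name") == record["name"] == "Ada"

pairs_set(pairs, "name", "Grace")
record["name"] = "Grace"  # the key keeps its original position
assert list(record.items()) == pairs
```

Every mapping operation on the list needs a hand-written linear scan like these two, which is exactly the "awful API and really slow" part.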

Aside from that: Lists are only one kind of ordered data structure. And while they are somewhat universal, in that you can build virtually all data structures out of some combination of lists (if you bend over backwards), that doesn't mean you should always use lists.

"I don't have data on this, but I'm willing to bet that the performance for lists is universally better than OrderedDict."

Data (structures) doesn't (don't) have performance. Operations on data (structures) do. So it depends on which operations you're interested in. If you just need a list of pairs, a list is obviously correct, and iterating over it or indexing it is quite efficient. However, if you want a mapping that's also ordered, or even a tiny subset of mapping functionality (such as handling duplicate keys), then a list alone is pretty awful, as I already explained above.

For your specific use case (writing csv files) an ordered dict is not necessary. Instead, use a DictWriter.
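A short sketch of what that looks like with csv.DictWriter from the standard library. The rows and column names are invented for the example, and an in-memory buffer stands in for a real file.

```python
import csv
import io

# Hypothetical rows; plain dicts whose key order does not matter here.
rows = [
    {"name": "Ada", "id": 1},
    {"id": 2, "name": "Grace"},
]

buffer = io.StringIO()  # stands in for a real file opened with newline=""
writer = csv.DictWriter(buffer, fieldnames=["id", "name"])
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

The column order comes from fieldnames, so the dicts themselves don't have to carry it.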

Personally I use OrderedDict when I need some LIFO/FIFO access, for which it even has the popitem method. I honestly couldn't think of a good use case, but the one mentioned in PEP 372 for attribute order is a good one:

"XML/HTML processing libraries currently drop the ordering of attributes, use a list instead of a dict which makes filtering cumbersome, or implement their own ordered dictionary. This affects ElementTree, html5lib, Genshi and many more libraries."

If you are ever wondering why some feature exists in Python, the PEP is a good place to start, because that's where the justification for including the feature is laid out in detail.
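As for the LIFO/FIFO access via popitem mentioned above, a minimal sketch (the job names are hypothetical):

```python
from collections import OrderedDict

# Hypothetical job queue keyed by job name.
jobs = OrderedDict()
jobs["parse"] = 1
jobs["transform"] = 2
jobs["write"] = 3

newest = jobs.popitem()            # LIFO: removes the last inserted item
oldest = jobs.popitem(last=False)  # FIFO: removes the first inserted item

assert newest == ("write", 3)
assert oldest == ("parse", 1)
assert list(jobs) == ["transform"]
```

The last=False argument is specific to OrderedDict; the built-in dict's popitem takes no arguments.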

Probably a comment would suffice...

I think it would be questionable if you used it in places where you don't need it (where order is irrelevant and an ordinary dict would suffice). Otherwise the code will probably be simpler than with lists.

This is valid for any language construct/library: if it makes your code simpler, use the higher-level abstraction/implementation.

As long as you feel comfortable with this data structure and it fits your needs, why worry? Perhaps it is not the most efficient one (in terms of speed, etc.), but if it's there, it's obviously because it's useful in certain cases (or nobody would have thought of writing it).

You can basically use three types of associative arrays in Python:

  1. the classic hash table (no order at all)
  2. the OrderedDict (keys are kept in the order in which they were inserted)
  3. and binary trees (not in the standard library), which keep their keys sorted exactly as you want, in a custom order (not necessarily alphabetical).

So, in fact, the order of the keys can matter. Just choose the structure that you think is the most appropriate for the job.
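A rough sketch of how these orderings differ in practice. Two caveats: since Python 3.7 the built-in dict also preserves insertion order, so what sets OrderedDict apart today is mainly order-sensitive equality and reordering methods such as move_to_end. And because a tree-based mapping is not in the standard library, sorting the keys on demand is used here as a stand-in for it. The inventory data is made up.

```python
from collections import OrderedDict

# Hypothetical inventory counts.
stock = {"pear": 4, "apple": 7, "fig": 2}

# Insertion-ordered mapping; move_to_end adjusts the order explicitly.
ordered = OrderedDict(stock)
ordered.move_to_end("pear")
assert list(ordered) == ["apple", "fig", "pear"]

# Order matters when comparing two OrderedDicts, but not two plain dicts.
assert OrderedDict(a=1, b=2) != OrderedDict(b=2, a=1)
assert dict(a=1, b=2) == dict(b=2, a=1)

# Stand-in for a tree-based sorted mapping: re-sort the keys on demand
# with a custom key function (here: by count, descending).
by_count = sorted(stock, key=stock.get, reverse=True)
assert by_count == ["apple", "pear", "fig"]
```

Re-sorting costs O(n log n) on every call, whereas a real tree keeps the keys sorted as they are inserted, so this only illustrates the ordering, not the performance.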

For CSV and similar constructs with repeated keys, use a namedtuple. It is the best of both worlds.
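One way this could look, as a sketch: the header line supplies the field names, and every row becomes a namedtuple. The CSV content and the Row name are invented for the example.

```python
import csv
import io
from collections import namedtuple

# Hypothetical CSV content; in practice this would come from a file.
data = io.StringIO("id,name\n1,Ada\n2,Grace\n")

reader = csv.reader(data)
Row = namedtuple("Row", next(reader))  # header line supplies the field names
rows = [Row(*line) for line in reader]

assert rows[0].name == "Ada"  # access by field name
assert rows[1][0] == "2"      # or by position; values stay strings
assert Row._fields == ("id", "name")
```

This assumes the header names are valid Python identifiers; if they may not be, passing rename=True to namedtuple replaces invalid names with positional ones.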

