Pickle serialization order mystery

Update 6/8/17

Although three years have passed, my PR, a temporary fix that enforces the output order, is still pending. Stream-Framework might reconsider its design of using content as the key for notifications. GitHub issue #153 references this.

Question

See the following sample:

import pickle
x = {'order_number': 'X', 'deal_url': 'J'}

pickle.dumps(x)
pickle.dumps(pickle.loads(pickle.dumps(x)))
pickle.dumps(pickle.loads(pickle.dumps(pickle.loads(pickle.dumps(x)))))

Results:

(dp0\nS'deal_url'\np1\nS'J'\np2\nsS'order_number'\np3\nS'X'\np4\ns.
(dp0\nS'order_number'\np1\nS'X'\np2\nsS'deal_url'\np3\nS'J'\np4\ns.
(dp0\nS'deal_url'\np1\nS'J'\np2\nsS'order_number'\np3\nS'X'\np4\ns.

Clearly, the serialized output changes after every dump/load round trip. When I remove a character from any of the keys, this doesn't happen. I discovered this because Stream-Framework uses the pickled output as the key for storing notifications in its k/v store. I will open a pull request once we better understand what is going on here. I have found two ways to prevent it:

A - Convert to a dictionary after sorting (yes, this somehow provides the intended side effect)

import operator
sorted_x = dict(sorted(x.iteritems(), key=operator.itemgetter(1)))

B - Remove the underscores (but I am not sure this always works)

So what causes this mysterious dictionary ordering under pickle?

Proof that sorting the dict before each dump makes dumps produce the same result:

import operator
x = dict(sorted(x.iteritems(), key=operator.itemgetter(1)))

pickle.dumps(x)
"(dp0\nS'order_number'\np1\nS'X'\np2\nsS'deal_url'\np3\nS'J'\np4\ns."

x = pickle.loads(pickle.dumps(x))
x = dict(sorted(x.iteritems(), key=operator.itemgetter(1)))

pickle.dumps(x)
"(dp0\nS'order_number'\np1\nS'X'\np2\nsS'deal_url'\np3\nS'J'\np4\ns."

Answer

Dictionaries are unordered data structures. This means that the order is arbitrary, and pickle will store the items in whatever order the dictionary yields them. You can use collections.OrderedDict if you want a dictionary that remembers the order in which keys were inserted.
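A minimal sketch of that point, assuming Python 3.7 or later (the question itself uses Python 2), where a plain dict iterates in insertion order rather than hash order. The variable names and the fixed protocol=2 are choices made for this illustration only. Two dicts with the same items compare equal, yet pickle writes them out in iteration order, so the bytes differ:

```python
import pickle
from collections import OrderedDict

# Same items, different insertion order (Python 3.7+ keeps that order).
a = {'order_number': 'X', 'deal_url': 'J'}
b = {'deal_url': 'J', 'order_number': 'X'}

# Dict equality ignores order...
equal_as_dicts = (a == b)

# ...but pickle serializes items in iteration order, so the bytes differ.
# The protocol is pinned so the comparison does not depend on the default.
equal_as_bytes = (pickle.dumps(a, protocol=2) == pickle.dumps(b, protocol=2))

# An OrderedDict carries its key order through a pickle round trip.
ordered = OrderedDict([('order_number', 'X'), ('deal_url', 'J')])
restored = pickle.loads(pickle.dumps(ordered, protocol=2))
order_kept = (list(restored.keys()) == ['order_number', 'deal_url'])

print(equal_as_dicts, equal_as_bytes, order_kept)
```

The takeaway is that equality of dicts says nothing about equality of their pickled bytes, which is why pickled output is a fragile choice for a storage key.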

Any order you think you see when you're playing around in the interpreter is just the interpreter playing nice with you.

From the documentation of dict:

It is best to think of a dictionary as an unordered set of key: value pairs, with the requirement that the keys are unique (within one dictionary)

Remember that dict.keys(), dict.values() and dict.items() also return their contents in arbitrary order.
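Since the order of items() cannot be relied on, one way to get a repeatable serialization is to impose an order yourself before pickling. The sketch below is written for Python 3; stable_pickle is a hypothetical helper named for this example, not part of pickle or Stream-Framework, and it assumes the keys can be compared with each other:

```python
import pickle

def stable_pickle(d, protocol=2):
    """Pickle a dict as a list of (key, value) pairs sorted by key.

    Hypothetical helper: the keys must be mutually orderable, and the
    protocol is pinned so the bytes do not change with the default.
    """
    return pickle.dumps(sorted(d.items()), protocol=protocol)

x = {'order_number': 'X', 'deal_url': 'J'}
y = {'deal_url': 'J', 'order_number': 'X'}  # same items, other order

key_x = stable_pickle(x)
key_y = stable_pickle(y)

# A dump/load round trip of the original dict yields the same key.
key_round_trip = stable_pickle(pickle.loads(pickle.dumps(x)))

print(key_x == key_y == key_round_trip)

# The original mapping can be rebuilt from the sorted pairs.
rebuilt = dict(pickle.loads(key_x))
```

Sorting by key rather than by value keeps the result well defined even when two entries share the same value.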
