
pickle.dumps returns a different output on each call

I have a simple Python script which pickles an object and prints it.

import pickle

o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}
d = pickle.dumps(o) 
print(d)

These are the outputs I get when I execute the same script multiple times:

  • b'\x80\x03}q\x00(X\x05\x00\x00\x00firstq\x01K\x01X\x05\x00\x00\x00thirdq\x02K\x03X\x06\x00\x00\x00secondq\x03K\x02X\x02\x00\x00\x00lsq\x04]q\x05(K\x01K\x02K\x03eu.'

  • b'\x80\x03}q\x00(X\x05\x00\x00\x00thirdq\x01K\x03X\x02\x00\x00\x00lsq\x02]q\x03(K\x01K\x02K\x03eX\x05\x00\x00\x00firstq\x04K\x01X\x06\x00\x00\x00secondq\x05K\x02u.'

  • b'\x80\x03}q\x00(X\x05\x00\x00\x00firstq\x01K\x01X\x06\x00\x00\x00secondq\x02K\x02X\x02\x00\x00\x00lsq\x03]q\x04(K\x01K\x02K\x03eX\x05\x00\x00\x00thirdq\x05K\x03u.'

  • b'\x80\x03}q\x00(X\x05\x00\x00\x00thirdq\x01K\x03X\x05\x00\x00\x00firstq\x02K\x01X\x02\x00\x00\x00lsq\x03]q\x04(K\x01K\x02K\x03eX\x06\x00\x00\x00secondq\x05K\x02u.'

Is it just a difference in the ordering of the object's properties, or is there more to it?

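The outputs differ only in the order in which the dictionary items are written. In Python 3.3 through 3.5, dictionary order depends on hash randomisation: each time you start your interpreter, a different, random hash seed is used. (CPython 3.6 started preserving insertion order, and Python 3.7 made that a language guarantee, so on those versions this effect no longer applies to dicts, although set ordering still depends on the hash seed.) If you were to print the dictionary, you'd see the different ordering too: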

$ bin/python -c "o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}; print(o)"
{'first': 1, 'ls': [1, 2, 3], 'second': 2, 'third': 3}
$ bin/python -c "o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}; print(o)"
{'ls': [1, 2, 3], 'third': 3, 'first': 1, 'second': 2}
$ bin/python -c "o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}; print(o)"
{'second': 2, 'ls': [1, 2, 3], 'third': 3, 'first': 1}
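
To check that only the ordering differs, one can unpickle two of the byte strings from the question and compare the results. This is a minimal sketch; the variable names are illustrative, and the byte strings are copied from the first two outputs listed in the question.

```python
import pickle

# Two of the byte strings printed in the question (pickle protocol 3).
first_run = b'\x80\x03}q\x00(X\x05\x00\x00\x00firstq\x01K\x01X\x05\x00\x00\x00thirdq\x02K\x03X\x06\x00\x00\x00secondq\x03K\x02X\x02\x00\x00\x00lsq\x04]q\x05(K\x01K\x02K\x03eu.'
second_run = b'\x80\x03}q\x00(X\x05\x00\x00\x00thirdq\x01K\x03X\x02\x00\x00\x00lsq\x02]q\x03(K\x01K\x02K\x03eX\x05\x00\x00\x00firstq\x04K\x01X\x06\x00\x00\x00secondq\x05K\x02u.'

a = pickle.loads(first_run)
b = pickle.loads(second_run)

print(first_run == second_run)  # False: the raw bytes differ
print(a == b)                   # True: the unpickled dictionaries are equal
```

Dictionary equality ignores key order, so both payloads round-trip to the same object even though the serialized bytes are not identical.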

Python uses a random seed to prevent certain types of Denial of Service attacks against programs that parse incoming user data into dictionaries, such as web servers. Without it, an attacker could predict when two strings would cause a hash collision in a dictionary and feed Python values that do nothing but create collisions, slowing a Python program down to a crawl.
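
The cost of collisions can be illustrated without crafting real colliding strings. The sketch below is not an attack; it uses a hypothetical `Key` class whose hash can be forced to a constant, and counts how many equality comparisons the dictionary performs while inserting keys.

```python
class Key:
    """Hypothetical dictionary key that counts how often it is compared."""

    eq_calls = 0

    def __init__(self, value, collide):
        self.value = value
        self.collide = collide

    def __hash__(self):
        # With collide=True every key reports the same hash value.
        return 0 if self.collide else hash(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return self.value == other.value


def count_comparisons(n, collide):
    """Insert n distinct keys into a dict and return the number of __eq__ calls."""
    Key.eq_calls = 0
    table = {}
    for i in range(n):
        table[Key(i, collide)] = i
    return Key.eq_calls


spread = count_comparisons(500, collide=False)
colliding = count_comparisons(500, collide=True)
print(spread, colliding)
```

When every key shares one hash, each insertion has to be compared against all keys already stored, so the work grows quadratically with the number of keys instead of linearly.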

You can set the seed to a fixed value with the PYTHONHASHSEED environment variable, or you can disable hash randomisation altogether:

The integer must be a decimal number in the range [0,4294967295]. Specifying the value 0 will disable hash randomization.

$ PYTHONHASHSEED=0 bin/python -c "o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}; print(o)"
{'third': 3, 'ls': [1, 2, 3], 'first': 1, 'second': 2}
$ PYTHONHASHSEED=0 bin/python -c "o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}; print(o)"
{'third': 3, 'ls': [1, 2, 3], 'first': 1, 'second': 2}
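
The same check can be scripted from Python. This is a sketch under a few assumptions: the helper name `pickle_in_fresh_interpreter` is made up for illustration, and the child process pickles a set rather than a dict, because dicts keep insertion order on Python 3.7 and later and would not show the effect there.

```python
import os
import subprocess
import sys

# Code run in the child interpreter: pickle a set of strings and print it as hex.
CHILD = "import pickle; print(pickle.dumps({'first', 'second', 'third', 'ls'}).hex())"


def pickle_in_fresh_interpreter(seed):
    """Start a new interpreter with PYTHONHASHSEED=seed and return its pickle output."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


first = pickle_in_fresh_interpreter(0)
second = pickle_in_fresh_interpreter(0)
print(first == second)
```

With the seed pinned, string hashes are the same in every run, so the set is iterated and pickled in the same order each time.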

Also see: Why is the order in dictionaries and sets arbitrary?
