Python + JSON serialization for MD5 hash - how can I guarantee that two equivalent objects will serialize to exactly the same string?

I need to take an md5 hash of the contents of a dict or list and I want to ensure that two equivalent structures will give me the same hash result.

My approach thus far has been to carefully define the order of the structures and to sort the various lists and dictionaries that they contain prior to running them through json.dumps().

As my structures get more complex, however, this is becoming laborious and error prone, and in any case I was never sure it was working 100% of the time or just 98% of the time.

Just curious if anyone has a quick solution for this? Is there an option I can set in the json module to sort objects completely? Or some other trick I can use to do a complete comparison of the information in two structures and return a hash guaranteed to be unique to it?

I only need the strings (and then the md5) to come out the same when I serialize the objects -- I'm not concerned about deserializing for this use case.

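By default, json.dumps writes dict keys in the dict's own iteration order, which is not a canonical order. In Python 3.3 through 3.5 that order could change from run to run, because the results of __hash__ are salted for str (the key type of typical JSON objects) to prevent a DoS vector (see the notes on __hash__ in the documentation). Since Python 3.7 dicts preserve insertion order, so the output is stable for a given dict, but two equal dicts built in different orders still serialize differently. Either way you need to call json.dumps with sort_keys set to True. The unsorted output in the session below comes from an interpreter where the order was arbitrary.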

>>> import json
>>> d = {'this': 'This word', 'that': 'That other word', 'other': 'foo'}
>>> json.dumps(d)
'{"this": "This word", "other": "foo", "that": "That other word"}'
>>> json.dumps(d, sort_keys=True)
'{"other": "foo", "that": "That other word", "this": "This word"}'
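To connect this back to the MD5 in the question, here is a minimal sketch of hashing the sorted output. The helper name stable_md5 is made up for illustration, and the fixed separators and UTF-8 encoding are assumptions added so that the hashed bytes do not depend on whitespace or on the platform's default encoding.

```python
import hashlib
import json

def stable_md5(obj):
    """Return the MD5 hex digest of a key-sorted JSON encoding of obj.

    Hypothetical helper, not part of the json module.
    """
    canonical = json.dumps(
        obj,
        sort_keys=True,         # sort dict keys at every nesting level
        separators=(",", ":"),  # fixed separators, no optional whitespace
    )
    # Hash bytes, not str; encode explicitly so the result is the same everywhere.
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()

a = {"this": 1, "that": {"x": [1, 2], "y": None}}
b = {"that": {"y": None, "x": [1, 2]}, "this": 1}
assert stable_md5(a) == stable_md5(b)
```

sort_keys applies to nested dicts as well, so only the top-level call needs it. List order is left untouched, so [1, 2] and [2, 1] still hash differently.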

For objects that end up serialized as a JSON array (i.e. list, tuple) you will need to put the elements in the expected order yourself. sort_keys only affects dict keys; json.dumps writes the elements of a list in exactly the order in which your program placed them, so two lists holding the same elements in a different order serialize to different strings.
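If the order of list elements carries no meaning in your data, one possible approach is to normalize the structure before dumping it. The sketch below rests on that assumption. The function name canonicalize is hypothetical, and sorting elements by their own JSON text is just one way to get a total order over mixed types.

```python
import json

def canonicalize(obj):
    """Recursively sort the contents of lists, tuples and sets.

    Assumption: element order is not significant anywhere in obj.
    """
    if isinstance(obj, dict):
        return {k: canonicalize(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set, frozenset)):
        items = [canonicalize(v) for v in obj]
        # Compare elements by their key-sorted JSON text so that dicts,
        # numbers and strings can all be sorted together without a TypeError.
        return sorted(items, key=lambda v: json.dumps(v, sort_keys=True))
    return obj

x = {"tags": ["b", "a"], "ids": (3, 1, 2)}
y = {"ids": [1, 2, 3], "tags": ["a", "b"]}
same = (json.dumps(canonicalize(x), sort_keys=True)
        == json.dumps(canonicalize(y), sort_keys=True))
assert same
```

Do not apply this to lists whose order is part of the information (for example a sequence of steps), since it would make genuinely different structures hash the same.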
