
Python json module generates non-unique keys

According to the JSON specification (https://tools.ietf.org/html/rfc8259), an object's keys should be unique:

  4. Objects

    An object structure is represented as a pair of curly brackets
    surrounding zero or more name/value pairs (or members). A name is a
    string. A single colon comes after each name, separating the name
    from the value. A single comma separates a value from a following
    name. The names within an object SHOULD be unique.

But it's possible to create a JSON object with two identical keys:

Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
[GCC 5.4.0 20160609] on linux
>>> import json
>>> json.dumps({1: 'value1', "1": 'value2'})
'{"1": "value1", "1": "value2"}'

Is it an error?

In the JSON spec, the object (like a dict) is:

An object structure is represented as a pair of curly brackets surrounding zero or more name/value pairs (or members). A name is a string.

Emphasis mine, on "A name is a string." Python's json.dumps is very forgiving with the input object. It implicitly converts integer keys to strings, and this can result in data loss through key collisions, just as you've seen here. It also breaks the round trip loads(dumps(d)).
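To make the broken round trip concrete, here is a small sketch using only the stdlib json module and the dict from the question. The variable names are just for illustration.

```python
import json

d = {1: "value1", "1": "value2"}

encoded = json.dumps(d)        # both keys are written out as the string "1"
decoded = json.loads(encoded)  # the parser keeps only the last duplicate

print(encoded)       # {"1": "value1", "1": "value2"}
print(decoded)       # {'1': 'value2'}
print(decoded == d)  # False
```

The two-entry dict comes back with a single entry, so 'value1' is silently lost.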

If data loss is a concern in your context, consider using a stricter JSON library, e.g.:

>>> import demjson  # pip install demjson
>>> demjson.encode({1: 'value1', "1": 'value2'}, strict=True)
# JSONEncodeError: ('object properties (dictionary keys) must be strings in strict JSON', 1)

Is it an error?

In my opinion, yes. I've seen a lot of bugs caused by this, and would prefer if the stdlib json.dumps was strict by default, with an opt-in keyword argument for enabling any implicit conversions. However, the chances of this getting changed in Python are approximately zero.
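If you want strict behaviour without an extra dependency, one option is a thin wrapper around json.dumps. This is only a sketch: strict_dumps is a hypothetical helper name, not part of the stdlib, and it assumes containers are plain dicts, lists and tuples.

```python
import json

def strict_dumps(obj, **kwargs):
    """Hypothetical helper: like json.dumps, but refuses non-string dict keys."""
    def check(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if not isinstance(key, str):
                    raise TypeError(
                        f"non-string key {key!r} would be converted implicitly"
                    )
                check(value)  # nested objects are checked too
        elif isinstance(node, (list, tuple)):
            for item in node:
                check(item)

    check(obj)
    return json.dumps(obj, **kwargs)

print(strict_dumps({"a": [1, {"b": 2}]}))  # {"a": [1, {"b": 2}]}

try:
    strict_dumps({1: "value1", "1": "value2"})
except TypeError as exc:
    print("rejected:", exc)
```

Because string keys pass through unchanged, a dict accepted by this wrapper cannot collide with itself on output.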

The JSON RFC says "keys SHOULD be unique." RFCs have a very specific meaning for "SHOULD." From https://tools.ietf.org/html/rfc2119:

SHOULD: This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

So JSON is not forbidden from having duplicate keys, though it is undesirable, and I would recommend against it.
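On the reading side, json.loads accepts duplicate keys and keeps the last value. If you would rather fail loudly, the object_pairs_hook parameter of json.loads lets you inspect every key/value pair before a dict is built. A minimal sketch follows; reject_duplicates is a hypothetical name.

```python
import json

def reject_duplicates(pairs):
    """Hypothetical hook: build a dict from (key, value) pairs, failing on repeats."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key in JSON object: {key!r}")
        result[key] = value
    return result

text = '{"1": "value1", "1": "value2"}'

print(json.loads(text))  # {'1': 'value2'}  (default: last value wins)

try:
    json.loads(text, object_pairs_hook=reject_duplicates)
except ValueError as exc:
    print(exc)  # duplicate key in JSON object: '1'
```

The hook is called for every object in the document, including nested ones, so duplicates at any depth are caught.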

If you want to check that your dict's keys stay unique once they are converted to strings, you can use this test:

def keys_are_unique(d):
    return len(d) == len(set(str(k) for k in d))
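One caveat with the str(k)-based test: json.dumps does not convert every key the way str() does. True, False and None become "true", "false" and "null", so {True: ..., "true": ...} passes the test above but still collides in the output. A variant that leans on json's own conversion might look like this (keys_survive_json is a hypothetical name, and it only checks the top-level dict):

```python
import json

def keys_survive_json(d):
    """Hypothetical variant of keys_are_unique using json's own key conversion."""
    # dict.fromkeys keeps only the keys (values become None),
    # so the values themselves need not be serializable.
    round_tripped = json.loads(json.dumps(dict.fromkeys(d)))
    return len(round_tripped) == len(d)

print(keys_survive_json({1: "value1", "1": "value2"}))  # False
print(keys_survive_json({True: "a", "true": "b"}))      # False
print(keys_survive_json({1: "a", 2: "b"}))              # True
```

Keys of a type that json.dumps cannot convert at all (a tuple, for example) raise TypeError here rather than returning False.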
