
Python JSONDecoder custom translation of null type

In Python, the JSONDecoder performs the translation of null to None by default, as seen below. How can I change that translation of null -> None to something different, i.e. null -> 'Cat'?

class json.JSONDecoder([encoding[, object_hook[, parse_float[, parse_int[, parse_constant[, strict[, object_pairs_hook]]]]]]])

Simple JSON decoder.

Performs the following translations in decoding by default:
  JSON            Python
  object          dict
  array           list
  string          unicode
  number (int)    int, long
  number (real)   float
  true            True
  false           False
  null            None
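The default mapping can be checked directly. This is a minimal sketch that assumes Python 3, where JSON strings come back as str and integers as int (rather than unicode and int/long as in the Python 2 table above); the sample document is made up for illustration.

```python
import json

# Hypothetical sample covering most rows of the translation table
decoded = json.loads(
    '{"obj": {"k": null}, "arr": [1, 2.5], "s": "text", "t": true, "f": false}'
)

# Show the Python type each JSON value was translated to
for key, value in decoded.items():
    print(key, type(value).__name__, value)
```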

I would like json.loads('{"field1":null, "field2": "data!"}')

to return {u'field2': u'data!', u'field1': u'Cat'}

UPDATE 12/30/2014

The easiest way to achieve this would be to use the object_hook callback of the JSONDecoder as described in my old answer below. But since this requires an extra function call for every decoded JSON object, plus a loop over that object's keys, it might have an impact on performance.

So, if you truly want to change how json handles null, you need to dig a little deeper. The JSONDecoder uses a scanner to find certain tokens in the JSON input. Unfortunately, the scanner is a function and not a class, so it cannot simply be subclassed. The scanner function is called py_make_scanner and can be found in json/scanner.py. It takes a JSONDecoder as an argument and returns a scan_once function. The scan_once function receives a string and the index of the current scanner position, and returns the decoded value together with the index just past it.
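To make the scan_once contract concrete, here is a small sketch. It assumes CPython 3, where py_make_scanner is still present in json/scanner.py; the function is an implementation detail rather than documented API, so treat this as illustrative only.

```python
import json
import json.scanner

# py_make_scanner takes a decoder (the "context") and returns scan_once
decoder = json.JSONDecoder()
scan_once = json.scanner.py_make_scanner(decoder)

# scan_once(string, idx) returns (decoded value, index just past the value)
null_value, null_end = scan_once('null', 0)
list_value, list_end = scan_once('[1, null]', 0)

print(null_value, null_end)   # None 4
print(list_value, list_end)   # [1, None] 9
```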

A simple customized scanner function could look like this:

import json

def make_my_scanner(context):
    # reference to actual scanner
    internal_scanner = json.scanner.py_make_scanner(context)

    # some references for the _scan_once function below
    parse_object = context.parse_object
    parse_array = context.parse_array
    parse_string = context.parse_string
    encoding = context.encoding
    strict = context.strict
    object_hook = context.object_hook
    object_pairs_hook = context.object_pairs_hook

    # customized _scan_once
    def _scan_once(string, idx):
        try:
            nextchar = string[idx]
        except IndexError:
            raise StopIteration

        # override some parse_** calls with the correct _scan_once,
        # so that nested objects and arrays also use this scanner
        if nextchar == '"':
            return parse_string(string, idx + 1, encoding, strict)
        elif nextchar == '{':
            return parse_object((string, idx + 1), encoding, strict,
                _scan_once, object_hook, object_pairs_hook)
        elif nextchar == '[':
            return parse_array((string, idx + 1), _scan_once)
        elif nextchar == 'n' and string[idx:idx + 4] == 'null':
            # the actual customization: null becomes 'Cat' instead of None
            return 'Cat', idx + 4

        # invoke default scanner for everything else (numbers, true, false, ...)
        return internal_scanner(string, idx)

    return _scan_once

Now we just need a JSONDecoder subclass that will use our scanner instead of the default scanner:

class MyJSONDecoder(json.JSONDecoder):
    def __init__(self, encoding=None, object_hook=None, parse_float=None,
            parse_int=None, parse_constant=None, strict=True,
            object_pairs_hook=None):

        json.JSONDecoder.__init__(self, encoding, object_hook, parse_float, parse_int, parse_constant, strict, object_pairs_hook)

        # override scanner
        self.scan_once = make_my_scanner(self)

And then use it like this:

decoder = MyJSONDecoder()
print decoder.decode('{"field1":null, "field2": "data!"}')
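The code above targets Python 2 (encoding argument, print statement). For readers on Python 3, the following is a hedged sketch of the same idea. It assumes CPython 3's json internals, where parse_object takes (s_and_end, strict, scan_once, object_hook, object_pairs_hook, memo) and there is no encoding parameter. The names make_null_scanner, NullAsCatDecoder and NULL_REPLACEMENT are made up for this example, and since these are undocumented internals the signatures may change between versions.

```python
import json
import json.scanner

NULL_REPLACEMENT = 'Cat'  # hypothetical replacement for JSON null


def make_null_scanner(context):
    # Default pure-Python scanner, used for everything we do not override
    internal_scanner = json.scanner.py_make_scanner(context)

    parse_object = context.parse_object
    parse_array = context.parse_array
    strict = context.strict
    object_hook = context.object_hook
    object_pairs_hook = context.object_pairs_hook
    memo = context.memo

    def _scan_once(string, idx):
        try:
            nextchar = string[idx]
        except IndexError:
            # Python 3's decoder expects the failing index on StopIteration
            raise StopIteration(idx) from None

        # Route containers through this scanner so nested nulls are caught
        if nextchar == '{':
            return parse_object((string, idx + 1), strict, _scan_once,
                                object_hook, object_pairs_hook, memo)
        elif nextchar == '[':
            return parse_array((string, idx + 1), _scan_once)
        elif nextchar == 'n' and string[idx:idx + 4] == 'null':
            return NULL_REPLACEMENT, idx + 4

        # Strings, numbers, true/false, NaN etc. use the default scanner
        return internal_scanner(string, idx)

    def scan_once(string, idx):
        try:
            return _scan_once(string, idx)
        finally:
            memo.clear()  # mirror the default scanner's key-cache cleanup

    return scan_once


class NullAsCatDecoder(json.JSONDecoder):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Replace the default scanner with the customized one
        self.scan_once = make_null_scanner(self)


result = NullAsCatDecoder().decode('{"field1":null, "field2": "data!"}')
print(result)
```

Because json.loads accepts a cls argument, the same decoder can also be used as json.loads(text, cls=NullAsCatDecoder). Note that this path bypasses the C-accelerated scanner, so it may well be slower than the default decoder on large inputs.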

Old answer, but still valid if you do not care about the performance impact of another function call:

You need to create a JSONDecoder object with a custom object_hook function:

import json

def parse_object(o):
    for key in o:
        if o[key] is None:
            o[key] = 'Cat'
    return o

decoder = json.JSONDecoder(object_hook=parse_object)

print decoder.decode('{"field1":null, "field2": "data!"}')
# that will print: {u'field2': u'data!', u'field1': u'Cat'}

According to the Python documentation of the json module:

object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict.

So parse_object receives each decoded dictionary and replaces every None value in it with 'Cat'. The returned dictionary is then used in the output.
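One limitation worth noting: object_hook only ever sees dicts, so a null inside an array (or a bare top-level null) stays None. A different technique, not part of the original answer, is to walk the decoded result afterwards. The sketch below assumes Python 3, and replace_none is a hypothetical helper name.

```python
import json


def replace_none(value, replacement='Cat'):
    """Return a copy of a decoded JSON value with every None replaced."""
    if value is None:
        return replacement
    if isinstance(value, list):
        return [replace_none(item, replacement) for item in value]
    if isinstance(value, dict):
        return {key: replace_none(item, replacement)
                for key, item in value.items()}
    return value  # str, int, float, bool are returned unchanged


raw = json.loads('{"field1": null, "items": [null, {"x": null}], "n": 1}')
cleaned = replace_none(raw)
print(cleaned)
```

This costs an extra pass over the data, but it does not depend on any decoder internals.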
