Create reversible YAML from JSON in Python

I am using the PyYAML library to create YAML from JSON in Python, and the conversion itself works. However, in the resulting YAML all of the JSON brackets are unfolded, whereas I want to retain a few of the square brackets (lists) in the converted YAML. How can I tell the library not to unfold these lists, but to keep them as bracketed lists?

A snapshot of my issue follows:

import yaml
import json

original_json = {'a': {'next': ['b'], 'prev': []},
 'b': {'next': ['c'], 'prev': ['a']},
 'c': {'next': ['d', 'e'], 'prev': ['b']},
 'd': {'next': [], 'prev': ['c']},
 'e': {'next': ['f'], 'prev': ['c']},
 'f': {'next': [], 'prev': ['e']}}

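# yaml.safe_load is used because recent PyYAML versions require an explicit Loader for yaml.load
obtained_yaml = yaml.dump(yaml.safe_load(json.dumps(original_json)), default_flow_style=False)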

# obtained_yaml looks like
#
# a:
#   next:
#   - b
#   prev: []
# b:
#   next:
#   - c
#   prev:
#   - a
# c:
#   next:
#   - d
#   - e
#   prev:
#   - b
# d:
#   next: []
#   prev:
#   - c
# e:
#   next:
#   - f
#   prev:
#   - c
# f:
#   next: []
#   prev:
#   - e

# expected yaml should look like
#
# a:
#   next:
#   - ["b"]
#   prev: []
# b:
#   next:
#   - ["c"]
#   prev:
#   - ["a"]
# c:
#   next:
#   - ["d"]
#   - ["e"]
#   prev:
#   - ["b"]
# d:
#   next: []
#   prev:
#   - ["c"]
# e:
#   next:
#   - ["f"]
#   prev:
#   - ["c"]
# f:
#   next: []
#   prev:
#   - ["e"]

I tried a few ways to solve this, but none of them produced the expected output. I need suggestions on how to get this done.

YAML syntax defines a different list structure, where the members of a list are lines at the same indentation level, each starting with "- " (a dash and a space). If you want to keep the brackets, you will need to cast your list into a str, but then you will lose the ability to reverse the YAML into JSON.
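To make the loss of reversibility concrete, here is a minimal sketch using PyYAML. It assumes the lists are turned into text with json.dumps (one possible way of "casting to str", chosen here so the brackets and double quotes survive); the variable names are only illustrative.

```python
import json
import yaml  # PyYAML

node = {"next": ["d", "e"], "prev": ["b"]}

# Illustrative workaround: replace every list with its JSON text so that
# the brackets survive as part of a string scalar.
stringified = {key: json.dumps(value) for key, value in node.items()}

dumped = yaml.safe_dump(stringified, default_flow_style=False)
print(dumped)

# Loading the YAML back yields strings, not lists, so the original
# structure is only recovered by decoding each value a second time.
reloaded = yaml.safe_load(dumped)
print(type(reloaded["next"]).__name__)  # str
print(json.loads(reloaded["next"]))
```

The brackets are preserved only as characters inside a quoted string, so any consumer of the YAML has to know that these values need a second parsing step.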

Here's an example where you can see that even if you turn ["a"] into [["a"]], YAML transforms it into a doubly indented list:

In [4]: import yaml
   ...: import json
   ...: import collections
   ...: original_json = {'a': {'next': ['b'], 'prev': []},
   ...:  'b': {'next': ['c'], 'prev': ['a']},
   ...:  'c': {'next': ['d', 'e'], 'prev': ['b']},
   ...:  'd': {'next': [], 'prev': ['c']},
   ...:  'e': {'next': ['f'], 'prev': ['c']},
   ...:  'f': {'next': [], 'prev': ['e']}}
   ...:
   ...: mod_json = collections.defaultdict(dict)
   ...: for k, v in original_json.items():
   ...:     mod_json[k]["next"] = [v["next"]]
   ...:     mod_json[k]["prev"] = [v["prev"]]
   ...: obtained_yaml = yaml.dump(yaml.safe_load(json.dumps(mod_json)), default_flow_style=False)
   ...:
   ...:

In [5]: obtained_yaml
Out[5]: 'a:\n  next:\n  - - b\n  prev:\n  - []\nb:\n  next:\n  - - c\n  prev:\n  - - a\nc:\n  next:\n  - - d\n    - e\n  prev:\n  - - b\nd:\n  next:\n  - []\n  prev:\n  - - c\ne:\n  next:\n  - - f\n  prev:\n  - - c\nf:\n  next:\n  - []\n  prev:\n  - - e\n'

Only YAML 1.2 is a superset of JSON; YAML 1.1 is not. Although YAML 1.2 was released in 2009, PyYAML has unfortunately not been updated to support it. Your example is a JSON subset that is compatible with YAML 1.1, but in general it is not a good idea to use PyYAML for this.

There are other native YAML libraries for Python. One is ruamel.yaml (disclaimer: I am the author of that package), which implements YAML 1.2 and gives you full control over block vs. flow style when dumping individual collections. Of course, you still have the general YAML restriction that you cannot have a block-style collection within a flow-style collection.

PyYAML, and ruamel.yaml in non-round-trip mode, only allow you to have all block, all flow, or all block with leaf nodes in flow style. But the default round-trip mode allows finer-grained control using the .fa attribute on collections:

import sys
import json
import ruamel.yaml


original_json = {'a': {'next': ['b'], 'prev': []},
 'b': {'next': ['c'], 'prev': ['a']},
 'c': {'next': ['d', 'e'], 'prev': ['b']},
 'd': {'next': [], 'prev': ['c']},
 'e': {'next': ['f'], 'prev': ['c']},
 'f': {'next': [], 'prev': ['e']}}

json_string = json.dumps(original_json)

yaml = ruamel.yaml.YAML()
# yaml.indent(mapping=4, sequence=4, offset=2)
# yaml.preserve_quotes = True
data = yaml.load(json_string)

# the following sets block style for the root-level mapping only;
# the nested collections keep the flow style they were loaded with from JSON
data.fa.set_block_style()
yaml.dump(data, sys.stdout)

which gives:

a: {next: [b], prev: []}
b: {next: [c], prev: [a]}
c: {next: [d, e], prev: [b]}
d: {next: [], prev: [c]}
e: {next: [f], prev: [c]}
f: {next: [], prev: [e]}

You can of course recursively traverse your data structure and call .fa.set_block_style() depending on any criteria you want.
