
Python "yaml" module produces unexpected YAML when converting from JSON

I'm trying to convert JSON data to YAML, but I'm getting unexpected YAML output.

Online tools that convert JSON to YAML give the output I expect. But when I use the same JSON in the Python code below, I get a different, unexpected result.

import yaml                                                                     

job_template = [                                                                
  {                                                                             
    "job-template": {                                                           
      "name": "{name}_job",                                                     
      "description": "job description",                                         
      "project-type": "multibranch",                                            
      "number-to-keep": 30,                                                     
      "days-to-keep": 30,                                                       
      "scm": [                                                                  
        {                                                                       
          "git": {                                                              
            "url": "{git_url}"                                                  
          }                                                                     
        }                                                                       
      ]                                                                         
    }                                                                           
  }                                                                             
]                                                                               

yaml.dump(job_template, open("job_template.yaml", "w"))   

Expected YAML output:

- job-template:
    name: "{name}_job"
    description: job description
    project-type: multibranch
    number-to-keep: 30
    days-to-keep: 30
    scm:
    - git:
        url: "{git_url}"

Actual YAML output:

 - job-template:
     days-to-keep: 30
     description: job description
     name: '{name}_job'
     number-to-keep: 30
     project-type: multibranch
     scm:
     - git: {url: '{git_url}'}

Use default_flow_style=False.

Example:

import yaml                                                                     

job_template = [                                                                
  {                                                                             
    "job-template": {                                                           
      "name": "{name}_job",                                                     
      "description": "job description",                                         
      "project-type": "multibranch",                                            
      "number-to-keep": 30,                                                     
      "days-to-keep": 30,                                                       
      "scm": [                                                                  
        {                                                                       
          "git": {                                                              
            "url": "{git_url}"                                                  
          }                                                                     
        }                                                                       
      ]                                                                         
    }                                                                           
  }                                                                             
]                                                                               

# Use a context manager so the file is flushed and closed after dumping.
with open("job_template.yaml", "w") as fp:
    yaml.dump(job_template, fp, default_flow_style=False)

The problem is in the Python code: a dict does not guarantee key order (in Python versions before 3.7), and PyYAML sorts mapping keys by default when dumping. pprint shows the same order as your YAML output:

>>> pprint.pprint(job_template)
[{'job-template': {'days-to-keep': 30,
                   'description': 'job description',
                   'name': '{name}_job',
                   'number-to-keep': 30,
                   'project-type': 'multibranch',
                   'scm': [{'git': {'url': '{git_url}'}}]}}]

If the question is about how the innermost dict {"url": "{git_url}"} is represented (flow style versus block style), the answer has already been given by @Rakesh.
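
The flow-style and key-order issues can both be addressed in PyYAML itself. The following is a minimal sketch, assuming PyYAML 5.1 or later (where yaml.dump accepts sort_keys) and a Python version whose dicts keep insertion order (3.7+). It dumps to a string instead of a file so the result is easy to inspect. Note that strings containing braces still come out single-quoted.

```python
import yaml

job_template = [
    {
        "job-template": {
            "name": "{name}_job",
            "description": "job description",
            "project-type": "multibranch",
            "number-to-keep": 30,
            "days-to-keep": 30,
            "scm": [{"git": {"url": "{git_url}"}}],
        }
    }
]

# default_flow_style=False forces block style for nested collections.
# sort_keys=False (assumption: PyYAML >= 5.1) keeps the dict's own key
# order instead of sorting the keys alphabetically.
text = yaml.dump(job_template, default_flow_style=False, sort_keys=False)
print(text)
```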

The change of ordering in PyYAML is an impediment to round-trip edits of YAML files, and a number of other parsers have sought to fix that.

One worth looking at is ruamel.yaml, which says on its overview page:

block style and key ordering are kept, so you can diff the round-tripped source

A code example adapted from the author's demonstrates this:

import sys
from ruamel.yaml import YAML

yaml_str = """\
3: abc
conf:
    10: def
    3: gij     # h is missing
more:
- what
- else
"""

# YAML() defaults to round-trip mode, which keeps key order and comments.
# (The older module-level load()/dump() calls with RoundTripLoader and
# RoundTripDumper are deprecated in newer ruamel.yaml releases.)
yaml = YAML()
data = yaml.load(yaml_str)
data['conf'][10] = 'klm'
data['conf'][3] = 'jig'
yaml.dump(data, sys.stdout)

will give you:

3: abc
conf:
  10: klm
  3: jig       # h is missing
more:
- what
- else

This is discussed more fully here. It is described as a drop-in replacement for PyYAML, so it should be easy to experiment with in your environment.

First of all, you should just leave your job template in a JSON file, e.g. input.json:

[                                                                
  {                                                                             
    "job-template": {                                                           
      "name": "{name}_job",                                                     
      "description": "job description",                                         
      "project-type": "multibranch",                                            
      "number-to-keep": 30,                                                     
      "days-to-keep": 30,                                                       
      "scm": [                                                                  
        {                                                                       
          "git": {                                                              
            "url": "{git_url}"                                                  
          }                                                                     
        }                                                                       
      ]                                                                         
    }                                                                           
  }                                                                             
]

That way you can more easily adapt your script to process different files. Doing so also guarantees that the keys in your JSON objects stay ordered, something that is not guaranteed when you include the JSON as dicts and lists in your code, at least not for all current versions of Python.
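
As a small standard-library illustration of key order being taken from the JSON text itself rather than from a dict literal, here is a sketch that parses a JSON string with object_pairs_hook=OrderedDict. The json_text value is a shortened, illustrative stand-in for the contents of input.json, and this is not the ruamel.yaml route the rest of this answer uses.

```python
import json
from collections import OrderedDict

# Illustrative stand-in for the text read from input.json.
json_text = (
    '{"name": "{name}_job", '
    '"description": "job description", '
    '"days-to-keep": 30}'
)

# object_pairs_hook receives the key/value pairs in document order,
# so an OrderedDict keeps that order on every Python version.
data = json.loads(json_text, object_pairs_hook=OrderedDict)
print(list(data))
```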

Then, because YAML 1.2 (spec issued in 2009) is a superset of JSON, you can just use a YAML 1.2 library that preserves key order when loading and dumping to convert this to the format you want. Since PyYAML is still stuck at the YAML 1.1 specification issued in 2005, you cannot use that, but you can use ruamel.yaml (disclaimer: I am the author of that package).

The only "problem" is that ruamel.yaml will also preserve the (flow) style of your input. That is exactly what you don't want.

So you have to recursively walk over the data structure and change the attribute containing that information:

import sys
import ruamel.yaml

def block_style(d):
    if isinstance(d, dict):
        d.fa.set_block_style()
        for key, value in d.items():
            try:
                if '{' in value:
                    d[key] = ruamel.yaml.scalarstring.DoubleQuotedScalarString(value)
            except TypeError:
                pass
            block_style(value)
    elif isinstance(d, list):
        d.fa.set_block_style()
        for elem in d:
            block_style(elem)

yaml = ruamel.yaml.YAML()

with open('input.json') as fp:
    data = yaml.load(fp)

block_style(data)

yaml.dump(data, sys.stdout)

which gives:

- job-template:
    name: "{name}_job"
    description: job description
    project-type: multibranch
    number-to-keep: 30
    days-to-keep: 30
    scm:
    - git:
        url: "{git_url}"

The above works equally well for Python 2 and Python 3.

The extra code testing for '{' enforces double quotes around the strings that cannot be represented as plain scalars. By default, ruamel.yaml would use single-quoted scalars if the extra escape sequences available in YAML double-quoted scalars are not needed to represent the string.
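
For comparison only, a similar quoting effect can be sketched in PyYAML with a custom representer. This is not the ruamel.yaml mechanism described above; the DoubleQuoted class and the represent_double_quoted function are hypothetical names introduced here for illustration.

```python
import yaml


class DoubleQuoted(str):
    """Marker type (hypothetical): dumped as a double-quoted scalar."""


def represent_double_quoted(dumper, data):
    # style='"' asks the emitter to write a double-quoted scalar.
    return dumper.represent_scalar("tag:yaml.org,2002:str", str(data), style='"')


# Register the representer on PyYAML's default Dumper.
yaml.add_representer(DoubleQuoted, represent_double_quoted)

plain = yaml.dump({"url": "{git_url}"}, default_flow_style=False)
quoted = yaml.dump({"url": DoubleQuoted("{git_url}")}, default_flow_style=False)
print(plain, quoted, sep="")
```

A plain str containing '{' is emitted with single quotes, while the marked value is emitted with double quotes.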
