How can I parse a YAML file in Python?

The easiest and purest method without relying on C headers is PyYAML (documentation), which can be installed via pip install pyyaml:

#!/usr/bin/env python

import yaml

with open("example.yaml", 'r') as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)

And that's it. A plain yaml.load() function also exists, but yaml.safe_load() should always be preferred, because it avoids introducing the possibility of arbitrary code execution. Use yaml.load() only if you explicitly need the arbitrary object serialization/deserialization it provides.
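To make the difference concrete, here is a minimal sketch, assuming PyYAML is installed. The input strings are made up for illustration. safe_load() builds only plain Python types and refuses YAML tags that would construct arbitrary objects:

```python
import yaml

# Hypothetical untrusted input: a tag that asks the loader to call os.system.
untrusted = "!!python/object/apply:os.system ['echo unsafe']"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError:
    # safe_load() has no constructor for python/* tags, so it raises
    # instead of executing anything.
    rejected = True

# Ordinary data still loads into plain dicts, lists, and scalars.
plain = yaml.safe_load("a: 1\nb: [x, y]")

print(rejected)
print(plain)
```

Catching yaml.YAMLError here mirrors the error handling in the example above; the command is never run because the safe loader stops at the tag.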

Note that the PyYAML project supports versions up through the YAML 1.1 specification. If YAML 1.2 specification support is needed, see ruamel.yaml as noted in this answer.
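One practical consequence of YAML 1.1 semantics is worth knowing: PyYAML resolves unquoted words such as yes, no, on, and off to booleans, whereas the YAML 1.2 core schema treats only true and false that way. A small sketch, assuming PyYAML is installed; the keys are invented for illustration:

```python
import yaml

# Unquoted yes/NO are YAML 1.1 booleans; the quoted 'yes' stays a string.
text = "enabled: yes\nmode: 'yes'\ncountry: NO\n"

loaded = yaml.safe_load(text)
print(loaded)
```

Quoting the value is the usual way to keep strings such as a country code from being turned into False.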

Read & Write YAML files with Python 2+3 (and unicode)

# -*- coding: utf-8 -*-
import yaml
import io

# Define data
data = {
    'a list': [
        1, 
        42, 
        3.141, 
        1337, 
        'help', 
        u'€'
    ],
    'a string': 'bla',
    'another dict': {
        'foo': 'bar',
        'key': 'value',
        'the answer': 42
    }
}

# Write YAML file
with io.open('data.yaml', 'w', encoding='utf8') as outfile:
    yaml.dump(data, outfile, default_flow_style=False, allow_unicode=True)

# Read YAML file (same explicit encoding as the write, so '€' round-trips on every platform)
with io.open('data.yaml', 'r', encoding='utf8') as stream:
    data_loaded = yaml.safe_load(stream)

print(data == data_loaded)

Created YAML file

a list:
- 1
- 42
- 3.141
- 1337
- help
- €
a string: bla
another dict:
  foo: bar
  key: value
  the answer: 42
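The keys in the created file appear in alphabetical order because PyYAML's dump functions sort mapping keys by default. A small sketch, assuming PyYAML 5.1 or newer (where the sort_keys parameter is available) and made-up data, shows how to keep insertion order instead:

```python
import yaml

# Illustrative data whose insertion order is not alphabetical.
data = {"zebra": 1, "apple": {"b": 2, "a": 1}}

# Default behaviour: keys are sorted at every nesting level.
sorted_text = yaml.safe_dump(data, default_flow_style=False)

# sort_keys=False keeps the dict's insertion order.
ordered_text = yaml.safe_dump(data, default_flow_style=False, sort_keys=False)

print(sorted_text)
print(ordered_text)
```

Both strings load back to the same dictionary; only the order of the lines in the file differs.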

Common file endings

.yml and .yaml

Alternatives

For your application, the following might be important:

  • Support by other programming languages
  • Reading / writing performance
  • Compactness (file size)

See also: Comparison of data serialization formats

If you are instead looking for a way to make configuration files, you might want to read my short article Configuration files in Python.

If you have YAML that conforms to the YAML 1.2 specification (released 2009), then you should use ruamel.yaml (disclaimer: I am the author of that package). It is essentially a superset of PyYAML, which supports most of YAML 1.1 (from 2005).

If you want to be able to preserve your comments when round-tripping, you certainly should use ruamel.yaml.

Upgrading @Jon's example is easy:

from ruamel.yaml import YAML
from ruamel.yaml.error import YAMLError

yaml = YAML(typ='safe')  # replaces the deprecated module-level safe_load()

with open("example.yaml") as stream:
    try:
        print(yaml.load(stream))
    except YAMLError as exc:
        print(exc)

Use the safe loader (typ='safe') unless you really have full control over the input, need the full loader (seldom the case), and know what you are doing.

If you are using pathlib Path for manipulating files, you are better off using the new API that ruamel.yaml provides:

from ruamel.yaml import YAML
from pathlib import Path

path = Path('example.yaml')
yaml = YAML(typ='safe')
data = yaml.load(path)

First install pyyaml using pip3.

Then import the yaml module and load the file into a dictionary called 'my_dict':

import yaml
with open('filename.yaml') as f:
    my_dict = yaml.safe_load(f)

That's all you need. Now the entire YAML file is in the 'my_dict' dictionary.
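One caveat is worth a quick sketch: the result is a dictionary only when the top level of the YAML document is a mapping. Assuming PyYAML is installed, and using made-up inline documents instead of a file:

```python
import yaml

# A top-level mapping becomes a dict.
mapping = yaml.safe_load("name: demo\nports: [80, 443]")

# A top-level sequence becomes a list.
sequence = yaml.safe_load("- 1\n- 2")

# An empty document yields None, not an empty dict.
empty = yaml.safe_load("")

print(type(mapping).__name__, type(sequence).__name__, empty)
```

Code that indexes the result by key may therefore want to check for None or a list first when the file's contents are not under your control.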

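Example:

defaults.yaml

url: https://www.google.com

environment.py

from ruamel.yaml import YAML

yaml = YAML(typ='safe')  # replaces the deprecated module-level safe_load()
with open('defaults.yaml') as f:  # the context manager closes the file after loading
    data = yaml.load(f)
print(data['url'])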

To access a nested value in a YAML file like this:

global:
  registry:
    url: dtr-:5000/
    repoPath:
  dbConnectionString: jdbc:oracle:thin:@x.x.x.x:1521:abcd

You can use the following Python script:

import yaml

with open("/some/path/to/yaml.file", 'r') as f:
    valuesYaml = yaml.load(f, Loader=yaml.FullLoader)

print(valuesYaml['global']['dbConnectionString'])
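The same lookup can be sketched with safe_load() on an inline copy of the document, assuming PyYAML is installed. The sketch also shows two details of nested access: a key with no value (repoPath) loads as None, and dict.get() supplies a default for a key that is absent. The timeout key and its default of 30 are hypothetical and not part of the original file:

```python
import yaml

text = """\
global:
  registry:
    url: dtr-:5000/
    repoPath:
  dbConnectionString: jdbc:oracle:thin:@x.x.x.x:1521:abcd
"""

values = yaml.safe_load(text)
registry = values["global"]["registry"]

url = registry["url"]
repo_path = registry["repoPath"]  # key present but empty: loads as None

# Hypothetical optional key: .get() avoids a KeyError when it is missing.
timeout = values["global"].get("timeout", 30)

print(url, repo_path, timeout)
```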

I use ruamel.yaml. Details & debate here.

from ruamel import yaml

with open(filename, 'r') as fp:
    read_data = yaml.load(fp)

Usage of ruamel.yaml is compatible (apart from a few easily solved problems) with old usages of PyYAML through its PyYAML-style module-level functions, which newer ruamel.yaml releases deprecate in favor of the YAML() class. As stated in the link I provided, use

from ruamel import yaml

instead of

import yaml

and it will fix most of your problems.

EDIT: PyYAML is not dead as it turns out, it's just maintained in a different place.

#!/usr/bin/env python

import sys
import yaml

def main(argv):

    with open(argv[0]) as stream:
        try:
            print(yaml.safe_load(stream))
            return 0
        except yaml.YAMLError as exc:
            print(exc)
            return 1

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))

A read_yaml_file function that returns all data as a dictionary.

import yaml

def read_yaml_file(full_path=None, relative_path=None):
    # ProjectPaths is a project-specific helper from the original answer; it is not defined here.
    if relative_path is not None:
        resource_file_location_local = ProjectPaths.get_project_root_path() + relative_path
    else:
        resource_file_location_local = full_path

    with open(resource_file_location_local, 'r') as stream:
        try:
            file_artifacts = yaml.safe_load(stream)
        except yaml.YAMLError as exc:
            print(exc)
            return {}  # nothing could be parsed, so return an empty dictionary
    return dict(file_artifacts.items())

In addition to the answers above, all of which are good, there is a Python package that smartly constructs objects from YAML/JSON/dicts and is actively being developed and expanded. (Full disclosure: I am a co-author of this package; see here.)

Install:

pip install pickle-rick

Use:

Define a YAML or JSON string (or file).

BASIC:
 text: test
 dictionary:
   one: 1
   two: 2
 number: 2
 list:
   - one
   - two
   - four
   - name: John
     age: 20
 USERNAME:
   type: env
   load: USERNAME
 callable_lambda:
   type: lambda
   load: "lambda: print('hell world!')"
 datenow:
   type: lambda
   import:
     - "from datetime import datetime as dd"
   load: "lambda: print(dd.utcnow().strftime('%Y-%m-%d'))"
 test_function:
   type: function
   name: test_function
   args:
     x: 7
     y: null
     s: hello world
     any:
       - 1
       - hello
   import:
     - "math"
   load: >
     def test(x, y, s, any):
       print(math.e)
       iii = 111
       print(iii)
       print(x,s)
       if y:
         print(type(y))
       else:
         print(y)
       for i in any:
         print(i)

Then use it as an object.

>> from pickle_rick import PickleRick

>> config = PickleRick('./config.yaml', deep=True, load_lambda=True)

>> config.BASIC.dictionary
{'one' : 1, 'two' : 2}

>> config.BASIC.callable_lambda()
hell world!

You can define Python functions, load additional data from other files or REST APIs, read environment variables, and then write everything out to YAML or JSON again.

This works especially well when building systems that require structured configuration files, or in notebooks as interactive structures.

There is a security note to using this. Only load files that are trusted, because any code they contain can be executed; do not load anything without knowing its complete contents.

The package is called PickleRick and is available here:

Suggestion: Use yq

I'm not sure why it wasn't suggested before, but I would highly recommend using yq, a jq-like command-line processor for YAML.

yq uses jq-like syntax but works with YAML files as well as JSON.

Examples:

1) Read a value:

yq e '.a.b[0].c' file.yaml

2) Pipe from STDIN:

cat file.yaml | yq e '.a.b[0].c' -

3) Update a YAML file in place:

yq e -i '.a.b[0].c = "cool"' file.yaml

4) Update using environment variables:

NAME=mike yq e -i '.a.b[0].c = strenv(NAME)' file.yaml

5) Merge multiple files:

yq ea '. as $item ireduce ({}; . * $item )' path/to/*.yml

6) Multiple updates to a YAML file:

yq e -i '
  .a.b[0].c = "cool" |
  .x.y.z = "foobar" |
  .person.name = strenv(NAME)
' file.yaml
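For comparison with the Python answers in this thread, the read in example 1 and the update in example 3 can be sketched with PyYAML. This is only a rough equivalent under stated assumptions: the inline document is made up to match the .a.b[0].c path, and, unlike yq, a plain load and dump does not preserve comments or formatting:

```python
import yaml

# Hypothetical document shaped to match the '.a.b[0].c' path used above.
doc = yaml.safe_load("a:\n  b:\n    - c: old\n")

# Equivalent of: yq e '.a.b[0].c' file.yaml
value = doc["a"]["b"][0]["c"]

# Equivalent of: yq e -i '.a.b[0].c = "cool"' file.yaml (minus writing the file)
doc["a"]["b"][0]["c"] = "cool"
updated = yaml.safe_dump(doc, default_flow_style=False)

print(value)
print(updated)
```

To update a real file, the dumped text would be written back to the same path after loading it.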

(*) Read more on how to parse fields from YAML based on jq filters.

Additional references:

https://github.com/mikefarah/yq/#install

https://github.com/kislyuk/yq
