
YAML vs Python configuration/parameter files (but perhaps also vs JSON vs XML)

I see Python used to do a fair amount of code generation for C/C++ header and source files. Usually, the input files which store parameters are in JSON or YAML format, although most of what I see is YAML. However, why not just use Python files directly? Why use YAML at all in this case?

That also got me thinking: since Python is a scripting language, its files, when containing only data and data structures, could be used in exactly the same way as XML, JSON, YAML, etc. Do people do this? Is there a good use case for it?

What if I want to import a configuration file into a C or C++ program? What about into a Python program? In the Python case it seems to me there is no sense in using YAML at all, as you can just store your configuration parameters and variables in pure Python files. In the C or C++ case, it seems to me you could still store your data in Python files and then just have a Python script import that and auto-generate header and source files for you as part of the build process. Again, perhaps there's no need for YAML or JSON in this case at all either.
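To make the build-step idea concrete, here is a minimal sketch of a script that turns a flat Python dictionary into a C header. The function name `render_header`, the include guard, and the parameter names are all made up for illustration; a real generator would also need to handle nested data and types beyond numbers and strings.

```python
def render_header(params: dict, guard: str = "MY_PARAMS_H") -> str:
    """Render a flat dict of parameters as C preprocessor defines.

    Hypothetical helper for illustration only; not taken from any real tool.
    """
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    for name, value in params.items():
        if isinstance(value, bool):
            # Check bool before int, because bool is a subclass of int.
            text = "1" if value else "0"
        elif isinstance(value, (int, float)):
            text = repr(value)
        else:
            # Treat everything else as a C string literal.
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            text = f'"{escaped}"'
        lines.append(f"#define {name.upper()} {text}")
    lines += ["", f"#endif  /* {guard} */", ""]
    return "\n".join(lines)


# Assumed example parameters; in practice these would come from the config file.
params = {"max_retries": 3, "timeout_s": 1.5, "greeting": "my string message"}
header = render_header(params)
print(header)
```

The build system would then write `header` to a `.h` file before compiling the C or C++ sources.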

Thoughts?

Here's an example of storing some nested key/value hash table pairs in a YAML file:

my_params.yml:

---
dict_key1:
    dict_key2:
        dict_key3a: my string message
        dict_key3b: another string message

And the same exact thing in a pure Python file:

my_params.py:

data = {
    "dict_key1": {
        "dict_key2": {
            "dict_key3a": "my string message",
            "dict_key3b": "another string message",
        }
    }
}

And to read in both the YAML and Python data and print it out:

import_config_file.py:

import yaml # Module for reading in YAML files
import json # Module for pretty-printing Python dictionary types
            # See: https://stackoverflow.com/a/34306670/4561887

# 1) import .yml file
with open("my_params.yml", "r") as f:
    data_yml = yaml.safe_load(f)  # safe_load builds only plain Python data types

# 2) import .py file
from my_params import data as data_py
# OR: Alternative method of doing the above:
# import my_params
# data_py = my_params.data

# 3) print them out
print("data_yml = ")
print(json.dumps(data_yml, indent=4))

print("\ndata_py = ")
print(json.dumps(data_py, indent=4))

Reference for using json.dumps: https://stackoverflow.com/a/34306670/4561887

Sample output of running python3 import_config_file.py:

data_yml = 
{
    "dict_key1": {
        "dict_key2": {
            "dict_key3a": "my string message",
            "dict_key3b": "another string message"
        }
    }
}

data_py = 
{
    "dict_key1": {
        "dict_key2": {
            "dict_key3a": "my string message",
            "dict_key3b": "another string message"
        }
    }
}

Yes, people do this, and have been doing this for years.

But many make the same mistake you do and make it unsafe by importing my_params.py, since an import executes whatever code the file contains. That is the same as loading YAML using YAML(typ='unsafe') in ruamel.yaml (or yaml.load() in PyYAML, which is unsafe).

What you should do is use the ast package that comes with Python to parse your "data" structure, to make such an import safe. My package pon has code to update these kinds of structures, and in each of my __init__.py files there is such a piece of data, named _package_data, that is read by some code (function literal_eval) in the setup.py for the package. The ast-based code in setup.py takes around 100 lines.
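As a rough illustration of the idea (not the code from pon or from the setup.py mentioned above), a minimal sketch using only the standard library's `ast.parse` and `ast.literal_eval` could look like this. The helper name `load_literal` and the two source strings are hypothetical; the first string stands in for the contents of my_params.py.

```python
import ast


def load_literal(source: str, name: str = "data"):
    """Return the value assigned to `name` in `source` without executing it.

    Only plain literals (dicts, lists, tuples, strings, numbers, booleans,
    None) are accepted; anything else makes ast.literal_eval raise ValueError.
    """
    tree = ast.parse(source)
    for node in tree.body:
        # Look for a top-level statement of the form `name = <literal>`.
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == name:
                    return ast.literal_eval(node.value)
    raise KeyError(f"no top-level assignment to {name!r}")


# Stand-in for the contents of my_params.py from the question.
SAFE_SOURCE = '''
data = {
    "dict_key1": {
        "dict_key2": {
            "dict_key3a": "my string message",
            "dict_key3b": "another string message",
        }
    }
}
'''

# Importing a file like this would run os.system(); here it is only parsed.
UNSAFE_SOURCE = 'import os\ndata = os.system("echo pwned")\n'

data_py = load_literal(SAFE_SOURCE)
print(data_py["dict_key1"]["dict_key2"]["dict_key3a"])

try:
    load_literal(UNSAFE_SOURCE)
except ValueError as exc:
    print("rejected:", exc)
```

With a real file, the source string would come from something like `pathlib.Path("my_params.py").read_text()`. The point of the design is that the file is parsed but never executed, so a function call hidden in the "data" is rejected rather than run.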

The advantages of doing this in a structured way are the same as with using YAML: you can programmatically update the data structure (version numbers!), although I consider PON (Python Object Notation) less readable than YAML and slightly less easy to update manually.
