
The best way to parse and validate a YAML configuration file

We have a project that stores its settings in YAML (the settings file is generated by Ansible scripts). At the moment we use pyyaml to parse the YAML and marshmallow to validate the settings. I'm pretty happy with storing settings in YAML, but I don't think marshmallow is the tool I need (schemas are hard to read, I don't need serialization for settings, and I want something like XSD). So what are the best practices for validating settings in a project? Maybe there is a language-independent way? (We are using Python 2.7.)

YAML settings:

successive:
  worker:
    cds_process_number: 0 # positive integer or zero
    spider_interval: 10 # positive integer
    run_worker_sh: /home/lmakeev/CDS/releases/master/scripts/run_worker.sh # OS path
    allow:
      - "*" # regular expression
    deny:
      - "^[A-Z]{3}_.+$" # regular expression

A schema description is a language of its own, with its own syntax and idiosyncrasies that you have to learn. And if your requirements change, you have to maintain the "programs" written in it, against which your YAML is verified.

If you are already working with YAML and are familiar with Python, you can use YAML's tag facility to check objects at parse time.

Assuming you have a file input.yaml:

successive:
  worker:
    cds_process_number: !nonneg 0
    spider_interval: !pos 10
    run_worker_sh: !path /home/lmakeev/CDS/releases/master/scripts/run_worker.sh
    allow:
      - !regex "*"
    deny:
      - !regex "^[A-Z]{3}_.+$"

(your example file with the comments removed and tags inserted), you can create and register four classes that check the values using the following program¹:

import sys
import os
import re
import ruamel.yaml
import pathlib

class NonNeg:
    yaml_tag = u"!nonneg"

    @classmethod
    def from_yaml(cls, constructor, node):
        val = int(node.value)   # this creates/returns an int
        assert val >= 0
        return val

class Pos(int):
    yaml_tag = u"!pos"

    @classmethod
    def from_yaml(cls, constructor, node):
        val = cls(node.value)  # this creates/returns a Pos()
        assert val > 0
        return val

class Path:
    yaml_tag = u"!path"

    @classmethod
    def from_yaml(cls, constructor, node):
        val = pathlib.Path(node.value)
        assert os.path.exists(val)
        return val


class Regex:
    yaml_tag = u"!regex"
    def __init__(self, val, comp):
        # store original string and compile() of that string
        self._val = val
        self._compiled = comp

    @classmethod
    def from_yaml(cls, constructor, node):
        val = str(node.value)
        try:
            comp = re.compile(val)
        except Exception as e:
            comp = None
            print("Incorrect regex", node.start_mark)
            print("  ", node.tag, node.value)
        return cls(val, comp)


yaml = ruamel.yaml.YAML(typ="safe")
yaml.register_class(NonNeg)
yaml.register_class(Pos)
yaml.register_class(Path)
yaml.register_class(Regex)

data = yaml.load(pathlib.Path('input.yaml'))

The actual checks in the individual from_yaml classmethods should be adapted to your needs (to run this myself I had to remove the assert for the Path, as I don't have that file).
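
One possible way to adapt the checks: assert statements are skipped when Python runs with -O, so a stricter variant might raise an explicit exception instead. The sketch below uses only the standard library. The helper names parse_nonneg and parse_pos are hypothetical (they are not part of ruamel.yaml); they take the raw scalar string, i.e. what node.value holds, and could be called from the corresponding from_yaml classmethods.

```python
def parse_nonneg(text):
    """Return text as an int >= 0, or raise ValueError (hypothetical helper)."""
    val = int(text)  # int() itself raises ValueError for non-numeric text
    if val < 0:
        raise ValueError("expected zero or a positive integer, got %r" % text)
    return val


def parse_pos(text):
    """Return text as an int > 0, or raise ValueError (hypothetical helper)."""
    val = int(text)
    if val <= 0:
        raise ValueError("expected a positive integer, got %r" % text)
    return val


print(parse_nonneg("0"))
print(parse_pos("10"))
try:
    parse_pos("0")
except ValueError as exc:
    print("rejected:", exc)
```

Raising ValueError keeps the check active regardless of interpreter flags and lets the caller decide whether to abort or collect all problems first.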

If you run the above, you'll note that it prints:

Incorrect regex   in "input.yaml", line 7, column 9
   !regex *

because "*" is not a valid regular expression. Did you mean ".*"?
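
The regex check can be tried in isolation with just the re module. This is a minimal sketch; check_regex is a hypothetical helper name, and the patterns are the ones from the settings file plus the suggested ".*".

```python
import re


def check_regex(pattern):
    """Return None if pattern compiles, otherwise the error message."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


for pattern in ["*", ".*", "^[A-Z]{3}_.+$"]:
    problem = check_regex(pattern)
    print(repr(pattern), "ok" if problem is None else "invalid: " + problem)
```

A bare "*" is a quantifier with nothing in front of it to repeat, which is why it is rejected while ".*" is accepted.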


¹ This was done using ruamel.yaml, a YAML 1.2 parser, of which I am the author. You can achieve the same results with PyYAML, e.g. by subclassing YAMLObject (which is unsafe by default, so make sure you correct that in your code).
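
For readers staying on PyYAML, here is a rough sketch of the same idea. It does not use the subclassing route the footnote mentions; instead it registers plain constructor functions on a SafeLoader subclass, which keeps loading safe. SettingsLoader, construct_pos, construct_regex and the inline YAML strings are assumptions made for illustration, and only two of the four tags are covered.

```python
import re
import yaml  # PyYAML


class SettingsLoader(yaml.SafeLoader):
    """SafeLoader subclass, so the extra tags do not leak into other loads."""


def construct_pos(loader, node):
    # Convert the tagged scalar and reject zero or negative values.
    val = int(loader.construct_scalar(node))
    if val <= 0:
        raise yaml.constructor.ConstructorError(
            None, None, "expected a positive integer", node.start_mark)
    return val


def construct_regex(loader, node):
    # Compile the pattern at parse time; report the position on failure.
    text = loader.construct_scalar(node)
    try:
        return re.compile(text)
    except re.error as exc:
        raise yaml.constructor.ConstructorError(
            None, None, "invalid regex: %s" % exc, node.start_mark)


SettingsLoader.add_constructor("!pos", construct_pos)
SettingsLoader.add_constructor("!regex", construct_regex)

GOOD = """\
spider_interval: !pos 10
deny:
  - !regex "^[A-Z]{3}_.+$"
"""
BAD = 'allow:\n  - !regex "*"\n'

data = yaml.load(GOOD, Loader=SettingsLoader)
print(data["spider_interval"], data["deny"][0].pattern)

try:
    yaml.load(BAD, Loader=SettingsLoader)
except yaml.YAMLError as exc:
    print("rejected:", exc)
```

Unlike the ruamel.yaml program above, this variant stops at the first invalid value instead of printing a message and continuing; which behaviour is preferable depends on how the settings are consumed.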

