How do I avoid global state when using custom constructors in ruamel.yaml?

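I am using ruamel.yaml to parse a complex YAML document in which certain tagged nodes require special treatment. I inject my custom parsing logic using add_multi_constructor, as recommended by the published examples. The problem is that I need to change the injected logic dynamically depending on external state, but methods like add_multi_constructor modify global state, which introduces unacceptable coupling between logically unrelated instances. Here is the MWE: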

import ruamel.yaml

def get_loader(parameter):
    def construct_node(constructor: ruamel.yaml.Constructor, tag: str, node: ruamel.yaml.Node):
        return parameter(tag.lstrip("!"), str(node.value))

    loader = ruamel.yaml.YAML()
    loader.constructor.add_multi_constructor("", construct_node)
    return loader

foo = get_loader(lambda tag, node: f"foo: {tag}, {node}")
bar = get_loader(lambda tag, node: f"bar: {tag}, {node}")
print(foo.load("!abc 123"), bar.load("!xyz 456"), sep="\n")

Output:

bar: abc, 123
bar: xyz, 456

Expected:

foo: abc, 123
bar: xyz, 456

I came up with the following workaround, in which I dynamically create a new class for each loader to break the coupling:

def get_loader(parameter):
    def construct_node(constructor: ruamel.yaml.Constructor, tag: str, node: ruamel.yaml.Node):
        return parameter(tag.lstrip("!"), str(node.value))

    # Create a new class to prevent state sharing through class attributes.
    class ConstructorWrapper(ruamel.yaml.constructor.RoundTripConstructor):
        pass

    loader = ruamel.yaml.YAML()
    loader.Constructor = ConstructorWrapper
    loader.constructor.add_multi_constructor("", construct_node)
    return loader

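My questions are:

  • Am I misusing the library? The global effects are a huge red flag suggesting that I am using the API incorrectly, but the library lacks any API documentation, so I am not sure what the correct way would be.

  • Is it safe in the sense of API breakage? As there is no documented API for this, I am not sure whether this is safe to put into production.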

IMO you are not misusing the library, just working around its current shortcomings/incompleteness.

Before ruamel.yaml got the API with the YAML() instance, it had the function-based API of PyYAML with a few extensions, and other PyYAML problems had to be worked around in a similarly unnatural way. E.g. I resorted to classes whose instances could be called (using __call__()), on which methods could then be changed, just to get access to the YAML version parsed from a document (ruamel.yaml supports YAML 1.2 and 1.1, whereas PyYAML only (partially) supports 1.1).

But underneath ruamel.yaml's YAML() instance, not everything has improved. The code inherited from PyYAML stores the information for the various constructors in class attributes that serve as lookup tables (yaml_constructors and yaml_multi_constructors respectively), and ruamel.yaml still does that (the full old PyYAML-esque API is effectively still there, and has only received a future deprecation warning with version 0.17).
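To make the shared-state mechanism concrete, here is a minimal stand-in in plain Python. SketchConstructor is a hypothetical class, not ruamel.yaml's actual code; it only mimics a lookup table that lives in a class attribute and is filled through a class method.

```python
class SketchConstructor:
    # Lookup table stored on the class, so every instance sees the same dict.
    multi_constructors = {}

    @classmethod
    def add_multi_constructor(cls, tag_prefix, func):
        # Even when called through an instance, a classmethod receives the class.
        cls.multi_constructors[tag_prefix] = func

    def construct(self, tag, value):
        # Dispatch to the first registered prefix that matches the tag.
        for prefix, func in self.multi_constructors.items():
            if tag.startswith(prefix):
                return func(tag[len(prefix):], value)
        raise KeyError(tag)


foo = SketchConstructor()
bar = SketchConstructor()
foo.add_multi_constructor("!", lambda tag, value: f"foo: {tag}, {value}")
bar.add_multi_constructor("!", lambda tag, value: f"bar: {tag}, {value}")

foo_result = foo.construct("!abc", "123")
bar_result = bar.construct("!xyz", "456")
print(foo_result, bar_result, sep="\n")
# bar: abc, 123
# bar: xyz, 456
```

The second registration overwrites the first because both instances write to the same class-level dict, which is the same symptom as in the question's MWE.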

Your approach is interesting in that you do:

loader.constructor.add_multi_constructor("", construct_node)

instead of:

loader.Constructor.add_multi_constructor("", construct_node)

(You probably know that loader.constructor is a property that instantiates loader.Constructor if necessary, but other readers of this answer might not.)

or even:

def get_loader(parameter):
    def construct_node(constructor: ruamel.yaml.Constructor, tag: str, node: ruamel.yaml.Node):
        return parameter(tag.lstrip("!"), str(node.value))

    # Create a new class to prevent state sharing through class attributes.
    class ConstructorWrapper(ruamel.yaml.constructor.RoundTripConstructor):
        pass

    ConstructorWrapper.add_multi_constructor("", construct_node)

    loader = ruamel.yaml.YAML()
    loader.Constructor = ConstructorWrapper
    return loader

Your code works because the constructors are stored in class attributes and .add_multi_constructor() is a class method.
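The following plain-Python sketch models why a fresh subclass per loader isolates the registrations. It assumes the class method copies the inherited table into the subclass before writing, as PyYAML-derived code has done; SketchConstructor and SketchLoader are hypothetical stand-ins, not the library's classes.

```python
class SketchConstructor:
    multi_constructors = {}

    @classmethod
    def add_multi_constructor(cls, tag_prefix, func):
        # Copy the inherited table the first time a subclass registers
        # something, so the registration stays on that subclass.
        if "multi_constructors" not in cls.__dict__:
            cls.multi_constructors = cls.multi_constructors.copy()
        cls.multi_constructors[tag_prefix] = func

    def construct(self, tag, value):
        for prefix, func in self.multi_constructors.items():
            if tag.startswith(prefix):
                return func(tag[len(prefix):], value)
        raise KeyError(tag)


class SketchLoader:
    Constructor = SketchConstructor  # class used to build the constructor

    @property
    def constructor(self):
        # Instantiate lazily, on first access, from whatever Constructor is set.
        if not hasattr(self, "_constructor"):
            self._constructor = self.Constructor()
        return self._constructor


def get_loader(label):
    class ConstructorWrapper(SketchConstructor):
        pass

    loader = SketchLoader()
    loader.Constructor = ConstructorWrapper  # must be set before first access
    loader.constructor.add_multi_constructor(
        "!", lambda tag, value: f"{label}: {tag}, {value}"
    )
    return loader


foo = get_loader("foo")
bar = get_loader("bar")
foo_result = foo.constructor.construct("!abc", "123")
bar_result = bar.constructor.construct("!xyz", "456")
print(foo_result, bar_result, sep="\n")
```

Each call to get_loader() creates a distinct class, so each registration lands in that class's own copy of the table and the base class table stays untouched.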

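So what you are doing is not entirely safe in the sense of API breakage. ruamel.yaml is not at version 1.0 yet, and (API) changes that might break your code can come with any minor version number change. You should set the version dependency for your production code accordingly (e.g. ruamel.yaml<0.18), and raise that minor number only after testing with a ruamel.yaml release that has the new minor version number.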
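Besides the requirement specifier, a start-up guard can catch an environment where the pin was bypassed. The helper below is a hypothetical sketch, not part of ruamel.yaml; it assumes a plain dotted version string, which in real use could come from importlib.metadata.version("ruamel.yaml").

```python
def is_tested_version(version, upper_bound=(0, 18)):
    """Return True if `version` (e.g. "0.17.21") is below the untested minor."""
    # Compare only the (major, minor) part against the exclusive upper bound.
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor < upper_bound


print(is_tested_version("0.17.21"))  # True
print(is_tested_version("0.18.0"))   # False
```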


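The use of the class attributes can be changed transparently by turning the class methods add_constructor() and add_multi_constructor() into "normal" methods and initialising the lookup tables in __init__(). Both of your examples that call the method on the instance: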

loader.constructor.add_multi_constructor("", construct_node)

would then get the desired result, while ruamel.yaml's behaviour would not change when add_multi_constructor is called on the class using:

loader.Constructor.add_multi_constructor("", construct_node)

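However, changing the class methods add_constructor() and add_multi_constructor() in this way would affect all existing code that happens to call them on an instance instead of the class (and that is happy with the current result).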

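It is more likely that two new instance methods will be added to both the Constructor class and the YAML() instance, and that the class methods will either be phased out or be changed to check that a class, and not an instance, is passed in, after a deprecation period with warnings (as will the global functions add_constructor() and add_multi_constructor() inherited from PyYAML).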

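Apart from pinning your production code to the minor version number, the main advice is to make sure your test code displays PendingDeprecationWarning. If you are using pytest, this is the case by default. That should give you ample time to adapt your code to what the warning recommends.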
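Outside pytest, Python's default warning filters ignore PendingDeprecationWarning, so it has to be enabled explicitly. A minimal sketch using only the standard warnings module; old_style_call() is a hypothetical stand-in for a library function that announces a future change:

```python
import warnings


def old_style_call():
    # Hypothetical stand-in for a library call that announces a future removal.
    warnings.warn(
        "old_style_call() will be deprecated",
        PendingDeprecationWarning,
        stacklevel=2,
    )
    return 42


with warnings.catch_warnings(record=True) as caught:
    # "always" overrides the default filter that ignores this category.
    warnings.simplefilter("always")
    result = old_style_call()

categories = [w.category for w in caught]
print(result, categories)
```

In a test run, warnings.simplefilter("error", PendingDeprecationWarning) would turn such warnings into exceptions instead, so they cannot be overlooked.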

If ruamel.yaml's author stops being lazy, he might provide some documentation for such API additions/changes.

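import inspect
import types

import ruamel.yaml
# Node classes used by construct_non_recursive_object() below.
from ruamel.yaml.nodes import MappingNode, ScalarNode, SequenceNode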


class MyConstructor(ruamel.yaml.constructor.RoundTripConstructor):
    _cls_yaml_constructors = {}
    _cls_yaml_multi_constructors = {}

    def __init__(self, *args, **kw):
        self._yaml_constructors = {
            'tag:yaml.org,2002:null': self.__class__.construct_yaml_null,
            'tag:yaml.org,2002:bool': self.__class__.construct_yaml_bool,
            'tag:yaml.org,2002:int': self.__class__.construct_yaml_int,
            'tag:yaml.org,2002:float': self.__class__.construct_yaml_float,
            'tag:yaml.org,2002:binary': self.__class__.construct_yaml_binary,
            'tag:yaml.org,2002:timestamp': self.__class__.construct_yaml_timestamp,
            'tag:yaml.org,2002:omap': self.__class__.construct_yaml_omap,
            'tag:yaml.org,2002:pairs': self.__class__.construct_yaml_pairs,
            'tag:yaml.org,2002:set': self.__class__.construct_yaml_set,
            'tag:yaml.org,2002:str': self.__class__.construct_yaml_str,
            'tag:yaml.org,2002:seq': self.__class__.construct_yaml_seq,
            'tag:yaml.org,2002:map': self.__class__.construct_yaml_map,
            None: self.__class__.construct_undefined
        }
        self._yaml_constructors.update(self._cls_yaml_constructors)
        self._yaml_multi_constructors = self._cls_yaml_multi_constructors.copy()
        super().__init__(*args, **kw)

    def construct_non_recursive_object(self, node, tag=None):
        # type: (Any, Optional[str]) -> Any
        constructor = None  # type: Any
        tag_suffix = None
        if tag is None:
            tag = node.tag
        if tag in self._yaml_constructors:
            constructor = self._yaml_constructors[tag]
        else:
            for tag_prefix in self._yaml_multi_constructors:
                if tag.startswith(tag_prefix):
                    tag_suffix = tag[len(tag_prefix) :]
                    constructor = self._yaml_multi_constructors[tag_prefix]
                    break
            else:
                if None in self._yaml_multi_constructors:
                    tag_suffix = tag
                    constructor = self._yaml_multi_constructors[None]
                elif None in self._yaml_constructors:
                    constructor = self._yaml_constructors[None]
                elif isinstance(node, ScalarNode):
                    constructor = self.__class__.construct_scalar
                elif isinstance(node, SequenceNode):
                    constructor = self.__class__.construct_sequence
                elif isinstance(node, MappingNode):
                    constructor = self.__class__.construct_mapping
        if tag_suffix is None:
            data = constructor(self, node)
        else:
            data = constructor(self, tag_suffix, node)
        if isinstance(data, types.GeneratorType):
            generator = data
            data = next(generator)
            if self.deep_construct:
                for _dummy in generator:
                    pass
            else:
                self.state_generators.append(generator)
        return data

    def get_args(*args, **kw):
        if kw:
            raise NotImplementedError('can currently only handle positional arguments')
        if len(args) == 2:
            return MyConstructor, args[0], args[1]
        else:
            return args[0], args[1], args[2]

    def add_constructor(*args, **kw):  # self, tag, constructor
        self, tag, constructor = MyConstructor.get_args(*args, **kw)
        if inspect.isclass(self):
            self._cls_yaml_constructors[tag] = constructor
            return
        self._yaml_constructors[tag] = constructor

    def add_multi_constructor(*args, **kw): # self, tag_prefix, multi_constructor):
        self, tag_prefix, multi_constructor = MyConstructor.get_args(*args, **kw)
        if inspect.isclass(self):
            self._cls_yaml_multi_constructors[tag_prefix] = multi_constructor
            return
        self._yaml_multi_constructors[tag_prefix] = multi_constructor

def get_loader_org(parameter):
    def construct_node(constructor: ruamel.yaml.Constructor, tag: str, node: ruamel.yaml.Node):
        return parameter(tag.lstrip("!"), str(node.value))

    loader = ruamel.yaml.YAML()
    loader.Constructor = MyConstructor
    loader.constructor.add_multi_constructor("", construct_node)
    return loader

foo = get_loader_org(lambda tag, node: f"foo: {tag}, {node}")
bar = get_loader_org(lambda tag, node: f"bar: {tag}, {node}")
print('>org<', foo.load("!abc 123"), bar.load("!xyz 456"), sep="\n")


def get_loader_instance(parameter):
    def construct_node(constructor: ruamel.yaml.Constructor, tag: str, node: ruamel.yaml.Node):
        return parameter(tag.lstrip("!"), str(node.value))

    # Create a new class to prevent state sharing through class attributes.
    class ConstructorWrapper(MyConstructor):
        pass

    loader = ruamel.yaml.YAML()
    loader.Constructor = ConstructorWrapper
    loader.constructor.add_multi_constructor("", construct_node)
    return loader

foo = get_loader_instance(lambda tag, node: f"foo: {tag}, {node}")
bar = get_loader_instance(lambda tag, node: f"bar: {tag}, {node}")
print('>instance<', foo.load("!abc 123"), bar.load("!xyz 456"), sep="\n")


def get_loader_cls(parameter):
    def construct_node(constructor: ruamel.yaml.Constructor, tag: str, node: ruamel.yaml.Node):
        return parameter(tag.lstrip("!"), str(node.value))

    # Create a new class to prevent state sharing through class attributes.
    class ConstructorWrapper(MyConstructor):
        pass

    loader = ruamel.yaml.YAML()
    loader.Constructor = ConstructorWrapper
    loader.Constructor.add_multi_constructor("", construct_node)
    #      ^ using the virtual class method
    return loader

foo = get_loader_cls(lambda tag, node: f"foo: {tag}, {node}")
bar = get_loader_cls(lambda tag, node: f"bar: {tag}, {node}")
print('>cls<', foo.load("!abc 123"), bar.load("!xyz 456"), sep="\n")

This gives:

>org<
foo: abc, 123
bar: xyz, 456
>instance<
foo: abc, 123
bar: xyz, 456
>cls<
bar: abc, 123
bar: xyz, 456
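The get_args() helper in the code above tells class calls from instance calls by counting arguments. As an alternative sketch (not what ruamel.yaml does, and with hypothetical names throughout), a small descriptor can make the same distinction at attribute-access time:

```python
import functools


class hybridmethod:
    """Descriptor that passes the instance when accessed on one, else the class."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        target = owner if instance is None else instance
        return functools.partial(self.func, target)


class SketchConstructor:
    _cls_table = {}

    def __init__(self):
        # Each instance starts from a private copy of the class-level table.
        self._table = dict(self._cls_table)

    @hybridmethod
    def add_multi_constructor(target, tag_prefix, func):
        if isinstance(target, type):
            target._cls_table[tag_prefix] = func  # class call: shared default
        else:
            target._table[tag_prefix] = func      # instance call: private


a = SketchConstructor()
b = SketchConstructor()
a.add_multi_constructor("!", "handler-a")                         # affects only a
SketchConstructor.add_multi_constructor("!", "handler-shared")    # affects later instances
c = SketchConstructor()
print(a._table, b._table, c._table)
```

As in the code above, a registration made on the class is shared by instances created afterwards, while a registration made on an instance stays with that instance.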
