
Python ruamel.yaml dumps tags with quotes

I'm trying to use ruamel.yaml to modify an AWS CloudFormation template on the fly using Python. I added the following code to make safe_load work with CloudFormation functions such as !Ref. However, when I dump the data, values containing !Ref (or any other function) are wrapped in quotes, and CloudFormation does not recognize them as function calls.

See example below:

import sys, json, io, boto3
import ruamel.yaml

def funcparse(loader, node):
  node.value = {
      ruamel.yaml.ScalarNode:   loader.construct_scalar,
      ruamel.yaml.SequenceNode: loader.construct_sequence,
      ruamel.yaml.MappingNode:  loader.construct_mapping,
  }[type(node)](node)
  node.tag = node.tag.replace(u'!Ref', 'Ref').replace(u'!', u'Fn::')
  return dict([ (node.tag, node.value) ])

funcnames = [ 'Ref', 'Base64', 'FindInMap', 'GetAtt', 'GetAZs', 'ImportValue',
              'Join', 'Select', 'Split', 'Split', 'Sub', 'And', 'Equals', 'If',
              'Not', 'Or' ]

for func in funcnames:
    ruamel.yaml.SafeLoader.add_constructor(u'!' + func, funcparse)

txt = open("/space/tmp/a.template","r")
base = ruamel.yaml.safe_load(txt)
base["foo"] = {
    "name": "abc",
    "Resources": {
        "RouteTableId" : "!Ref aaa",
        "VpcPeeringConnectionId" : "!Ref bbb",
        "yourname": "dfw"
    }
}

ruamel.yaml.safe_dump(
    base,
    sys.stdout,
    default_flow_style=False
)

The input file is like this:

foo:
  bar: !Ref barr
  aa: !Ref bb

The output is like this:

foo:
  Resources:
    RouteTableId: '!Ref aaa'
    VpcPeeringConnectionId: '!Ref bbb'
    yourname: dfw
  name: abc

Notice that the values '!Ref aaa' and '!Ref bbb' have been wrapped in single quotes. CloudFormation won't recognize them in that form. Is there a way to configure the dumper so that the output looks like this:

foo:
  Resources:
    RouteTableId: !Ref aaa
    VpcPeeringConnectionId: !Ref bbb
    yourname: dfw
  name: abc

Other things I have tried:

  • the pyyaml library: same result
  • using Ref:: instead of !Ref: same result

Essentially you tweak the loader to load tagged (scalar) objects as if they were mappings, with the tag as the key and the scalar as the value. But you don't do anything to distinguish the dict loaded from such a mapping from other dicts loaded from normal mappings, nor do you have any specific code to represent such a mapping to "get the tag back".

When you try to "create" a scalar with a tag, you just make a string starting with an exclamation mark, and that needs to get dumped quoted to distinguish it from real tagged nodes.

What obscures all this is that your example overwrites the loaded data by assigning to base["foo"], so the only thing you can derive from the safe_load, and all your code before that, is that it doesn't throw an exception. I.e. if you leave out the lines starting with base["foo"] = { your output will look like:

foo:
  aa:
    Ref: bb
  bar:
    Ref: barr

And in that output, Ref: bb is not distinguishable from a normally dumped dict. If you want to explore this route, then you should make a subclass TagDict(dict), have funcparse return that subclass, and add a representer for that subclass that re-creates the tag from the key and then dumps the value. Once that works (round-trip equals input), you can do:

     "RouteTableId" : TagDict('Ref', 'aaa')

If you do that, you should, apart from removing unused libraries, also change your code to close the file pointer txt, as leaving it open can lead to problems. You can do this elegantly by using the with statement:

with open("/space/tmp/a.template","r") as txt:
    base = ruamel.yaml.safe_load(txt)

(I also would leave out the "r" (or put a space before it); and replace txt with a more appropriate variable name that indicates this is an (input) file pointer).

You also have the entry 'Split' twice in your funcnames, which is superfluous.


A more generic solution can be achieved by using a multi-constructor that matches any tag and having three basic types to cover scalars, mappings and sequences.

import sys
import ruamel.yaml

yaml_str = """\
foo:
  scalar: !Ref barr
  mapping: !Select
    a: !Ref 1
    b: !Base64 A413
  sequence: !Split
  - !Ref baz
  - !Split Multi word scalar
"""

class Generic:
    # Common base: remembers the tag, the constructed value and the node style.
    def __init__(self, tag, value, style=None):
        self._value = value
        self._tag = tag
        self._style = style


class GenericScalar(Generic):
    @classmethod
    def to_yaml(cls, representer, node):
        # Here `node` is the GenericScalar instance that is being dumped.
        return representer.represent_scalar(node._tag, node._value)

    @staticmethod
    def construct(constructor, node):
        return constructor.construct_scalar(node)


class GenericMapping(Generic):
    @classmethod
    def to_yaml(cls, representer, node):
        return representer.represent_mapping(node._tag, node._value)

    @staticmethod
    def construct(constructor, node):
        # deep=True constructs nested (possibly tagged) children right away.
        return constructor.construct_mapping(node, deep=True)


class GenericSequence(Generic):
    @classmethod
    def to_yaml(cls, representer, node):
        return representer.represent_sequence(node._tag, node._value)

    @staticmethod
    def construct(constructor, node):
        return constructor.construct_sequence(node, deep=True)


def default_constructor(constructor, tag_suffix, node):
    # Pick the wrapper class that matches the kind of node carrying the tag.
    generic = {
        ruamel.yaml.ScalarNode: GenericScalar,
        ruamel.yaml.MappingNode: GenericMapping,
        ruamel.yaml.SequenceNode: GenericSequence,
    }.get(type(node))
    if generic is None:
        raise NotImplementedError('Node: ' + str(type(node)))
    style = getattr(node, 'style', None)
    # Two-step construction: yield the empty instance first and fill it in
    # afterwards, so that recursive structures can be resolved.
    instance = generic.__new__(generic)
    yield instance
    state = generic.construct(constructor, node)
    instance.__init__(tag_suffix, state, style=style)


ruamel.yaml.add_multi_constructor('', default_constructor, Loader=ruamel.yaml.SafeLoader)


yaml = ruamel.yaml.YAML(typ='safe', pure=True)
yaml.default_flow_style = False
yaml.register_class(GenericScalar)
yaml.register_class(GenericMapping)
yaml.register_class(GenericSequence)

base = yaml.load(yaml_str)
base['bar'] = {
    'name': 'abc',
    'Resources': {
        'RouteTableId' : GenericScalar('!Ref', 'aaa'),
        'VpcPeeringConnectionId' : GenericScalar('!Ref', 'bbb'),
        'yourname': 'dfw',
        's' : GenericSequence('!Split', ['a', GenericScalar('!Not', 'b'), 'c']),
    }
}
yaml.dump(base, sys.stdout)

which outputs:

bar:
  Resources:
    RouteTableId: !Ref aaa
    VpcPeeringConnectionId: !Ref bbb
    s: !Split
    - a
    - !Not b
    - c
    yourname: dfw
  name: abc
foo:
  mapping: !Select
    a: !Ref 1
    b: !Base64 A413
  scalar: !Ref barr
  sequence: !Split
  - !Ref baz
  - !Split Multi word scalar

Please note that sequences and mappings are handled correctly and that they can be created as well. There is however no check that:

  • the tag you provide is actually valid
  • the value associated with the tag is of the proper type for that tag name (scalar, mapping, sequence)

If you want GenericMapping to behave more like dict, then you probably want to make it a subclass of dict (and not of Generic) and provide the appropriate __init__ (likewise for GenericSequence / list).
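
As a rough sketch of the dict and list variants, the classes below keep the (tag, value, style) signature of Generic.__init__ so that the two-step construction in default_constructor would still apply. The names TaggedMapping and TaggedSequence are invented for this illustration; the to_yaml and construct methods mirror the ones above and are not exercised here:

```python
class TaggedMapping(dict):
    """dict subclass that also remembers its YAML tag."""

    def __init__(self, tag, value=(), style=None):
        super().__init__(value)  # fill the dict part
        self._tag = tag
        self._style = style

    @classmethod
    def to_yaml(cls, representer, node):
        # The instance itself is the mapping to dump.
        return representer.represent_mapping(node._tag, node)

    @staticmethod
    def construct(constructor, node):
        return constructor.construct_mapping(node, deep=True)


class TaggedSequence(list):
    """list subclass that also remembers its YAML tag."""

    def __init__(self, tag, value=(), style=None):
        super().__init__(value)  # fill the list part
        self._tag = tag
        self._style = style

    @classmethod
    def to_yaml(cls, representer, node):
        return representer.represent_sequence(node._tag, node)

    @staticmethod
    def construct(constructor, node):
        return constructor.construct_sequence(node, deep=True)


# The objects can now be used like the built-in containers.
select = TaggedMapping('!Select', {'a': 1})
select['b'] = 2
split = TaggedSequence('!Split', ['x', 'y'])
split.append('z')

# Two-step creation, as done in default_constructor.
late = TaggedMapping.__new__(TaggedMapping)
late.__init__('!Select', {'k': 'v'})
```

Note that equality between such objects is still plain dict or list equality, so two values with different tags but the same content compare equal.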

When the assignment is changed to something closer to yours:

base["foo"] = {
    "name": "abc",
    "Resources": {
        "RouteTableId" : GenericScalar('!Ref', 'aaa'),
        "VpcPeeringConnectionId" : GenericScalar('!Ref', 'bbb'),
        "yourname": "dfw"
    }
}

the output is:

foo:
  Resources:
    RouteTableId: !Ref aaa
    VpcPeeringConnectionId: !Ref bbb
    yourname: dfw
  name: abc

which is exactly the output you want.

In addition to Anthon's detailed answer above, I found a quick workaround for the specific case of CloudFormation templates.

The YAML is still loaded with the constructor snippet from the question:

def funcparse(loader, node):
  node.value = {
      ruamel.yaml.ScalarNode:   loader.construct_scalar,
      ruamel.yaml.SequenceNode: loader.construct_sequence,
      ruamel.yaml.MappingNode:  loader.construct_mapping,
  }[type(node)](node)
  node.tag = node.tag.replace(u'!Ref', 'Ref').replace(u'!', u'Fn::')
  return dict([ (node.tag, node.value) ])

funcnames = [ 'Ref', 'Base64', 'FindInMap', 'GetAtt', 'GetAZs', 'ImportValue',
              'Join', 'Select', 'Split', 'Sub', 'And', 'Equals', 'If',
              'Not', 'Or' ]

for func in funcnames:
    ruamel.yaml.SafeLoader.add_constructor(u'!' + func, funcparse)

When we manipulate the data, instead of doing

base["foo"] = {
    "name": "abc",
    "Resources": {
        "RouteTableId" : "!Ref aaa",
        "VpcPeeringConnectionId" : "!Ref bbb",
        "yourname": "dfw"
    }
}

which will wrap the value !Ref aaa with quotes, we can simply do:

base["foo"] = {
    "name": "abc",
    "Resources": {
        "RouteTableId" : {
            "Ref" : "aaa"
        },
        "VpcPeeringConnectionId" : {
            "Ref" : "bbb"
        },
        "yourname": "dfw"
    }
}

Similarly, for other CloudFormation functions such as !GetAtt, use the long form (Fn::GetAtt) as the key of a mapping. The dumped template then contains Ref: aaa rather than !Ref aaa, which CloudFormation accepts as well.
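
To avoid typing the long-form keys by hand, a small helper can apply the same renaming as funcparse. This is only a sketch; the helper name fn and the resource name MyBucket are made up for the example:

```python
def fn(name, value):
    """Return the long-form mapping for a CloudFormation function."""
    # Same rule as in funcparse: 'Ref' stays as is, the rest get 'Fn::'.
    key = 'Ref' if name == 'Ref' else 'Fn::' + name
    return {key: value}


resources = {
    'RouteTableId': fn('Ref', 'aaa'),
    'VpcPeeringConnectionId': fn('Ref', 'bbb'),
    'BucketArn': fn('GetAtt', ['MyBucket', 'Arn']),
    'yourname': 'dfw',
}
```

The result consists of plain dicts, lists and strings, so safe_dump writes it without any custom representer.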
