
Dumping data from YAML file using Python

I have a YAML file like the following:

- workload:
    name: cloud1
    param:
      p1: v1
      p2: v2

- workload:
    name: cloud2
    param:
      p1: v1
      p2: v2

I can parse the file using the following Python script:

#!/usr/bin/env python

import yaml   

try:
 for key, value in yaml.load(open('workload.yaml'))['workload'].iteritems():
   print key, value
except yaml.YAMLError as out:
  print(out)

output:

name cloud1
param {'p1': 'v1'}

But what I'm looking for is something like:

workload1 = cloud1
workload1_param_p1 = v1
workload1_param_p2 = v2

workload2 = cloud2
workload2_param_p1 = v1
workload2_param_p2 = v2

Your output doesn't match your input, as the top level of your YAML file is a sequence, which maps to a Python list.
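To make that concrete, here is a minimal sketch that uses a hand-written Python literal as a stand-in for what a YAML loader returns for the file above (the literal and the variable names are assumptions for illustration, not output captured from a real run). Indexing the list with a string key fails, whereas iterating over it reaches each workload mapping:

```python
# Hand-written stand-in for the loaded YAML: a list of single-key mappings.
data = [
    {"workload": {"name": "cloud1", "param": {"p1": "v1", "p2": "v2"}}},
    {"workload": {"name": "cloud2", "param": {"p1": "v1", "p2": "v2"}}},
]

error = None
try:
    # This mirrors yaml.load(...)['workload'] from the question.
    data["workload"]
except TypeError as exc:
    # Lists can only be indexed by integers or slices, not by a string key.
    error = exc
    print("TypeError:", exc)

# Iterating over the list gives access to each mapping in turn.
names = [item["workload"]["name"] for item in data]
print(names)
```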
The other thing that is not entirely clear is where the workload, and especially the 1 in workload1, come from. In the following I have assumed that they come from the key of the mapping that constitutes each sequence element and from the position of that element in the sequence, respectively (starting at 1, hence the idx+1). The name is popped from a copy of the values, so that the rest can be dumped recursively:

import sys
import ruamel.yaml

yaml_str = """\
- workload:
    name: cloud1
    param:
      p1: v1
      p2: v2

- workload:
    name: cloud2
    param:
      p1: v1
      p2: v2
"""

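# YAML() defaults to the round-trip loader, which preserves key order;
# the older round_trip_load() function is no longer available in recent releases
yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)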

def dump(prefix, d, out):
    if isinstance(d, dict):
        for k in d:
            dump(prefix[:] + [k], d[k], out)
    else:
        print('_'.join(prefix), '=', d, file=out)

for idx, workload in enumerate(data):
    for workload_key in workload:
        values = workload[workload_key].copy()
        # alternatively extract from values['name']
        workload_name = '{}{}'.format(workload_key, idx+1)
        print(workload_name, '=', values.pop('name'))
        dump([workload_name], values, sys.stdout)
    print()

gives:

workload1 = cloud1
workload1_param_p1 = v1
workload1_param_p2 = v2

workload2 = cloud2
workload2_param_p1 = v1
workload2_param_p2 = v2

This was done using ruamel.yaml, a YAML 1.2 parser, of which I am the author. If you only have YAML 1.1 documents (as supported by PyYAML), you should still use ruamel.yaml, as its round-trip loader guarantees that workload1_param_p1 is printed before workload1_param_p2 (with PyYAML that is not guaranteed).
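If the flattened pairs are needed as data rather than as printed lines, the same recursion can be sketched as a generator that feeds a dict. This is only an illustration: the helper names flatten and flatten_workloads are hypothetical, and the data literal is a hand-written stand-in for the loaded YAML so that the snippet runs without any YAML library. It relies on plain dicts keeping insertion order, which holds for Python 3.7 and later.

```python
def flatten(prefix, node):
    """Yield (key, value) pairs, joining nested mapping keys with '_'."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(prefix + [key], value)
    else:
        yield "_".join(prefix), node


def flatten_workloads(data):
    """Build a flat dict from a list of single-key workload mappings."""
    result = {}
    for idx, item in enumerate(data, start=1):
        for key, values in item.items():
            values = dict(values)  # shallow copy so the input is not modified
            label = "{}{}".format(key, idx)
            # The name becomes the value of the bare label, as in the answer.
            result[label] = values.pop("name")
            result.update(flatten([label], values))
    return result


# Hand-written stand-in for the data a YAML loader would return.
data = [
    {"workload": {"name": "cloud1", "param": {"p1": "v1", "p2": "v2"}}},
    {"workload": {"name": "cloud2", "param": {"p1": "v1", "p2": "v2"}}},
]

flat = flatten_workloads(data)
for key, value in flat.items():
    print(key, "=", value)
```

Returning a dict keeps the formatting decision (here key = value) separate from the traversal, so the same result could be written to a file or exported as environment variables.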

