
Reading a YAML file in Python and accessing the data by matching key value pair

I am developing software in Python, in which I need to read a multi-level YAML file like the one shown below:

#Filename: SampleCase.yml
%YAML 1.1
VesselTypes:
  - Name: Escort Tug
    Length: 32
    Breadth: 12.8
    Depth: 9
    Draughts:
    - Name: Draught1
      Mass: 500
      CentreOfGravity: [16.497, 0, 4.32]
    TowingStaples:
    - Name: Staple1
      Position: [0, 0, 0]
    Thrusters:
    - Name: Port Propeller
      Position: [0, -1, 0]
      MaxRPM: 1800
      MaxPower: 2525
    - Name: Stbd Propeller
      Position: [0, 1, 0]
      MaxRPM: 1800
      MaxPower: 2525
  - Name: Ship    
Vessels:
  - Name: Tug
    VesselType: Escort Tug
    Draught: Draught1
    InitialPosition: [0, 0, 0]
    Orientation: [0, 0, 0]
  - Name: Tanker
    VesselType: Ship
    Draught: Draught1
    InitialPosition: [0, 0, 0]
    Orientation: [0, 0, 0]
    Speed: 8  

Here, there are two vessels named Tug and Tanker. They are of two vessel types, "Escort Tug" and "Ship".

#Filename: main.py
import yaml
# Reading YAML data
file_name = 'SampleCase.yml'
with open(file_name, 'r') as f:
    data = yaml.load(f)

print(data["Vessels"][0]["Name"])

I am able to access the stored data using index numbers (e.g. data["Vessels"][0]["Name"]), but I would like to access them by matching keys. For example, I want to print the MaxRPM value of the Port Propeller of the vessel named "Tug". What is the standard way of doing this in Python?

There is no standard way of doing this, in large part because keys in YAML can be complex (a key can itself be a sequence or a mapping, not just a string). This makes the path-matching methods that work for much simpler formats like JSON unusable.

If your YAML is "tag-less", like yours, it still allows much more complex structures than JSON, but you can fairly easily implement a recursive walk over the collection types of a YAML file (sequences and mappings), explicitly matching indices or keys, and/or elements or values, as you go:

import ruamel.yaml as yaml

def _do_not_care():
    pass

def find_collection(d, key=_do_not_care, value=_do_not_care, results=None):

    def check_key_value(d, k, v, results):
        # print('checking', key, value, k, d[k], results)
        if k == key:
            if value in [_do_not_care, v]:
                results.append(d)
                return
        elif key == _do_not_care and v == value:
            results.append(d)
            return
        if isinstance(v, (dict, list)):
            find_collection(v, key, value, results)

    if results is None:
        results = []
    if isinstance(d, dict):
        for k in d:
            check_key_value(d, k, d[k], results)
    if isinstance(d, list):
        for k, v in enumerate(d):
            check_key_value(d, k, v, results)
    return results

def find_first(d, key=_do_not_care, value=_do_not_care):
    ret_val = find_collection(d, key, value)
    return ret_val[0] if ret_val else {}

def find_value_for_key(d, key):
    return find_first(d, key)[key]

With the above in place you can do:

file_name = 'SampleCase.yml'
with open(file_name, 'r') as f:  
    data = yaml.safe_load(f)
for d in find_collection(data, value='Tug'):
    vessel_type = find_first(data, key='Name', value=d['VesselType'])
    port_propeller = find_first(vessel_type, key='Name', value='Port Propeller')
    print('Tug -> MaxRPM', find_value_for_key(port_propeller, key='MaxRPM'))

This prints (assuming the input is corrected, see point 1):

Tug -> MaxRPM 1800
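
For a file with a known layout like this one, the same lookup can also be written without a generic search. The following is only a minimal sketch and not part of the tested code above: `by_name` is a hypothetical helper, and the `data` literal is a hand-copied excerpt of what loading the corrected SampleCase.yml is assumed to produce.

```python
# Hand-written excerpt of the structure that loading SampleCase.yml
# is assumed to produce (only the fields needed here are kept).
data = {
    "VesselTypes": [
        {
            "Name": "Escort Tug",
            "Thrusters": [
                {"Name": "Port Propeller", "Position": [0, -1, 0],
                 "MaxRPM": 1800, "MaxPower": 2525},
                {"Name": "Stbd Propeller", "Position": [0, 1, 0],
                 "MaxRPM": 1800, "MaxPower": 2525},
            ],
        },
        {"Name": "Ship"},
    ],
    "Vessels": [
        {"Name": "Tug", "VesselType": "Escort Tug", "Draught": "Draught1"},
        {"Name": "Tanker", "VesselType": "Ship", "Draught": "Draught1", "Speed": 8},
    ],
}


def by_name(items, name):
    """Return the first mapping in `items` whose 'Name' equals `name`.

    Hypothetical helper; raises KeyError if no such mapping exists.
    """
    for item in items:
        if isinstance(item, dict) and item.get("Name") == name:
            return item
    raise KeyError(name)


# Follow the references by name: vessel -> vessel type -> thruster.
vessel = by_name(data["Vessels"], "Tug")
vessel_type = by_name(data["VesselTypes"], vessel["VesselType"])
port_propeller = by_name(vessel_type["Thrusters"], "Port Propeller")
print("Tug -> MaxRPM", port_propeller["MaxRPM"])
```

This trades generality for readability: it only works when you know which list to look in at each level, whereas find_collection searches the whole tree.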

There are a few things to keep in mind:

  1. Your YAML is invalid, as there is no --- separation between the directive and the document. Its first three lines should look like:

     %YAML 1.1
     ---
     VesselTypes:

    However, it is probably not necessary to specify the directive at all: PyYAML still doesn't support YAML 1.2 after seven years, and your YAML doesn't seem to have anything YAML 1.1 specific.

  2. You are using PyYAML's load() without a Loader argument, which can be unsafe if you have no control over the input. You should always use safe_load() if you can (as you can with your source).

The above was tested using ruamel.yaml (a superset of PyYAML supporting YAML 1.2 as well as 1.1. Disclaimer: I am the author of that package). It should work as is with PyYAML if you have to stick with that.

Turn your list into a dict in which the keys are the names:

result = {}
for elem in data['Vessels']:
    name = elem.pop('Name')
    result[name] = elem

data['Vessels'] = result

print(data['Vessels']['Tug'])
>> {'VesselType': 'Escort Tug', ...}

You can pass the YAML output to a function that constructs a dictionary based on your specific search requirements. The behaviour you describe sounds ad hoc; I don't think there is anything built in that you can use.
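
One possible shape for such a function is sketched below. It is an illustration only: `index_by_name` is a hypothetical helper, the `data` literal is a hand-copied excerpt standing in for the loaded file, and the sketch assumes that names are unique within each list.

```python
def index_by_name(node, key="Name"):
    """Return a rebuilt copy of `node` in which every non-empty list whose
    items are all mappings containing `key` becomes a dict indexed by
    that key's value. Other lists and scalar values are kept as they are.
    """
    if isinstance(node, dict):
        return {k: index_by_name(v, key) for k, v in node.items()}
    if isinstance(node, list):
        if node and all(isinstance(item, dict) and key in item for item in node):
            # Assumption: names are unique; a later duplicate would
            # silently overwrite an earlier entry.
            return {item[key]: index_by_name(item, key) for item in node}
        return [index_by_name(item, key) for item in node]
    return node


# Hand-written excerpt standing in for the loaded SampleCase.yml.
data = {
    "VesselTypes": [
        {
            "Name": "Escort Tug",
            "Thrusters": [
                {"Name": "Port Propeller", "Position": [0, -1, 0], "MaxRPM": 1800},
                {"Name": "Stbd Propeller", "Position": [0, 1, 0], "MaxRPM": 1800},
            ],
        },
        {"Name": "Ship"},
    ],
    "Vessels": [
        {"Name": "Tug", "VesselType": "Escort Tug"},
        {"Name": "Tanker", "VesselType": "Ship"},
    ],
}

indexed = index_by_name(data)
tug = indexed["Vessels"]["Tug"]
thrusters = indexed["VesselTypes"][tug["VesselType"]]["Thrusters"]
max_rpm = thrusters["Port Propeller"]["MaxRPM"]
print("Tug -> MaxRPM", max_rpm)
```

Because the function builds new containers instead of popping keys, the original `data` is left untouched and each entry keeps its own Name field.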
