如何在 Python 中解析 YAML 文件?
The easiest and purest method without relying on C headers is PyYaml ( documentation ), which can be installed via pip install pyyaml
:
#!/usr/bin/env python
import yaml
with open("example.yaml", 'r') as stream:
try:
print(yaml.safe_load(stream))
except yaml.YAMLError as exc:
print(exc)
And that's it. A plain yaml.load()
function also exists, but yaml.safe_load()
should always be preferred unless you explicitly need the arbitrary object serialization/deserialization provided in order to avoid introducing the possibility for arbitrary code execution.
Note the PyYaml project supports versions up through the YAML 1.1 specification . If YAML 1.2 specification support is needed, see ruamel.yaml as noted in this answer .
# -*- coding: utf-8 -*-
import yaml
import io
# Define data
data = {
'a list': [
1,
42,
3.141,
1337,
'help',
u'€'
],
'a string': 'bla',
'another dict': {
'foo': 'bar',
'key': 'value',
'the answer': 42
}
}
# Write YAML file
with io.open('data.yaml', 'w', encoding='utf8') as outfile:
yaml.dump(data, outfile, default_flow_style=False, allow_unicode=True)
# Read YAML file
with open("data.yaml", 'r') as stream:
data_loaded = yaml.safe_load(stream)
print(data == data_loaded)
a list:
- 1
- 42
- 3.141
- 1337
- help
- €
a string: bla
another dict:
foo: bar
key: value
the answer: 42
.yml
and .yaml
For your application, the following might be important:
See also: Comparison of data serialization formats
In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python
If you have YAML that conforms to the YAML 1.2 specification (released 2009) then you should use ruamel.yaml (disclaimer: I am the author of that package). It is essentially a superset of PyYAML, which supports most of YAML 1.1 (from 2005).
If you want to be able to preserve your comments when round-tripping, you certainly should use ruamel.yaml.
Upgrading @Jon's example is easy:
import ruamel.yaml as yaml
with open("example.yaml") as stream:
try:
print(yaml.safe_load(stream))
except yaml.YAMLError as exc:
print(exc)
Use safe_load()
unless you really have full control over the input, need it (seldom the case) and know what you are doing.
If you are using pathlib Path
for manipulating files, you are better of using the new API ruamel.yaml provides:
from ruamel.yaml import YAML
from pathlib import Path
path = Path('example.yaml')
yaml = YAML(typ='safe')
data = yaml.load(path)
First install pyyaml using pip3.
Then import yaml module and load the file into a dictionary called 'my_dict':
import yaml
with open('filename.yaml') as f:
my_dict = yaml.safe_load(f)
That's all you need. Now the entire yaml file is in 'my_dict' dictionary.
Example:
defaults.yaml
url: https://www.google.com
environment.py
from ruamel import yaml
data = yaml.safe_load(open('defaults.yaml'))
data['url']
To access any element of a list in a YAML file like this:
global:
registry:
url: dtr-:5000/
repoPath:
dbConnectionString: jdbc:oracle:thin:@x.x.x.x:1521:abcd
You can use following python script:
import yaml
with open("/some/path/to/yaml.file", 'r') as f:
valuesYaml = yaml.load(f, Loader=yaml.FullLoader)
print(valuesYaml['global']['dbConnectionString'])
I use ruamel.yaml . Details & debatehere .
from ruamel import yaml
with open(filename, 'r') as fp:
read_data = yaml.load(fp)
Usage of ruamel.yaml is compatible (with some simple solvable problems) with old usages of PyYAML and as it is stated in link I provided, use
from ruamel import yaml
instead of
import yaml
and it will fix most of your problems.
EDIT : PyYAML is not dead as it turns out, it's just maintained in a different place.
#!/usr/bin/env python
import sys
import yaml
def main(argv):
with open(argv[0]) as stream:
try:
#print(yaml.load(stream))
return 0
except yaml.YAMLError as exc:
print(exc)
return 1
if __name__ == "__main__":
sys.exit(main(sys.argv[1:]))
read_yaml_file function returning all data into dictionary.
def read_yaml_file(full_path=None, relative_path=None):
if relative_path is not None:
resource_file_location_local = ProjectPaths.get_project_root_path() + relative_path
else:
resource_file_location_local = full_path
with open(resource_file_location_local, 'r') as stream:
try:
file_artifacts = yaml.safe_load(stream)
except yaml.YAMLError as exc:
print(exc)
return dict(file_artifacts.items())
Considering the above mentioned answers, all of which are good, there is a Python package available to smartly construct objects from YAML/JSON/dicts, and is actively being developed and expanded. ( full disclosure, I am a co-author of this package , see here )
Install:
pip install pickle-rick
Use:
Define a YAML or JSON string (or file).
BASIC:
text: test
dictionary:
one: 1
two: 2
number: 2
list:
- one
- two
- four
- name: John
age: 20
USERNAME:
type: env
load: USERNAME
callable_lambda:
type: lambda
load: "lambda: print('hell world!')"
datenow:
type: lambda
import:
- "from datetime import datetime as dd"
load: "lambda: print(dd.utcnow().strftime('%Y-%m-%d'))"
test_function:
type: function
name: test_function
args:
x: 7
y: null
s: hello world
any:
- 1
- hello
import:
- "math"
load: >
def test(x, y, s, any):
print(math.e)
iii = 111
print(iii)
print(x,s)
if y:
print(type(y))
else:
print(y)
for i in any:
print(i)
Then use it as an object.
>> from pickle_rick import PickleRick
>> config = PickleRick('./config.yaml', deep=True, load_lambda=True)
>> config.BASIC.dictionary
{'one' : 1, 'two' : 2}
>> config.BASIC.callable_lambda()
hell world!
You can define Python functions, load additional data from other files or REST APIs, environmental variables, and then write everything out to YAML or JSON again.
This works especially well when building systems that require structured configuration files, or in notebooks as interactive structures.
There is a security note to using this. Only load files that are trusted, as any code can be executed, thus stay clear of just loading anything without knowing what the complete contents are.
The package is called PickleRick and is available here:
I'm Not sure how it wasn't suggested before, but I would highly recommend using yq which is a jq wrapper for YAML.
yq uses jq like syntax but works with yaml files as well as json.
1 ) Read a value:
yq e '.a.b[0].c' file.yaml
2 ) Pipe from STDIN:
cat file.yaml | yq e '.a.b[0].c' -
3 ) Update a yaml file, inplace
yq e -i '.a.b[0].c = "cool"' file.yaml
4 ) Update using environment variables:
NAME=mike yq e -i '.a.b[0].c = strenv(NAME)' file.yaml
5 ) Merge multiple files:
yq ea '. as $item ireduce ({}; . * $item )' path/to/*.yml
6 ) Multiple updates to a yaml file:
yq e -i '
.a.b[0].c = "cool" |
.x.y.z = "foobar" |
.person.name = strenv(NAME)
' file.yaml
(*) Read more on how to parse fields from yaml with based on jq filters .
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.