YAML stands for "YAML Ain’t Markup Language". It is widely used to write and store configuration files for many DevOps tools and applications. A YAML file is plain text that is easy for humans to read and understand, and it uses the .yaml or .yml extension. It is similar to other data serialization languages such as JSON and XML.
Data serialization converts structured data into a standard format that can be stored in a file or transferred over a network and later reconstructed.
For example, data serialized to YAML by a Python program can be sent over the network and then deserialized by a program written in another language. Libraries that parse and produce YAML exist for many languages, including Python, JavaScript, and Java. YAML also supports data structures such as lists (arrays) and dictionaries. In this article, we will use Python to review YAML with some examples.
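As a minimal sketch of the serialize and deserialize round trip described above, the snippet below uses the PyYAML library (installation is covered later in this article). The settings dictionary is a made-up example, not part of any real application.

```python
import yaml  # provided by the PyYAML package

# A hypothetical settings dictionary, used only for illustration.
settings = {"name": "my-api-config", "stack": ["python", "django"], "debug": False}

# Serialize: Python objects -> YAML text that can be stored or sent elsewhere.
yaml_text = yaml.safe_dump(settings)
print(yaml_text)

# Deserialize: YAML text -> Python objects again.
restored = yaml.safe_load(yaml_text)
print(restored == settings)
```

The YAML text in the middle is what would travel over the network; a YAML library in another language could read the same text.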
YAML vs XML vs JSON
Let’s look at the same config file written in all three formats to get an overview of their syntax.
YAML – YAML Ain’t Markup Language
configuration:
  name: my-api-config
  version: 1.0.0
  about: "some description"
  description: |
    Lorem ipsum dolor sit amet,
    consectetur adipiscing elit
  scripts:
    "start:dev": main.py
  keywords: ["hello", "there"]
XML – Extensible Markup Language
<configuration>
  <name>my-api-config</name>
  <version>1.0.0</version>
  <about>some description</about>
  <description>Lorem ipsum dolor sit amet, consectetur adipiscing elit</description>
  <scripts>
    <start:dev>main.py</start:dev>
  </scripts>
  <keywords>hello</keywords>
  <keywords>there</keywords>
</configuration>
JSON – JavaScript Object Notation
{
  "configuration": {
    "name": "my-api-config",
    "version": "1.0.0",
    "about": "some description",
    "description": "Lorem ipsum dolor sit amet, \nconsectetur adipiscing elit\n",
    "scripts": {
      "start:dev": "main.py"
    },
    "keywords": [
      "hello",
      "there"
    ]
  }
}
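To check that the YAML and JSON versions describe the same data, both can be parsed in Python and compared. This is a sketch using a shortened copy of the configuration above; it assumes the PyYAML library is installed (see the installation step later in this article) and uses the json module from the standard library.

```python
import json
import yaml  # provided by the PyYAML package

# Shortened copies of the YAML and JSON examples, for illustration only.
yaml_text = """
configuration:
  name: my-api-config
  version: 1.0.0
  scripts:
    "start:dev": main.py
  keywords: ["hello", "there"]
"""

json_text = """
{
  "configuration": {
    "name": "my-api-config",
    "version": "1.0.0",
    "scripts": {"start:dev": "main.py"},
    "keywords": ["hello", "there"]
  }
}
"""

from_yaml = yaml.safe_load(yaml_text)
from_json = json.loads(json_text)

# Both parsers produce the same nested dictionary.
print(from_yaml == from_json)
```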
Breaking down a YAML File
The YAML file in our example shows some configuration settings for an application. There are some clear differences from the other two formats, which rely on many more symbols such as tags, braces, and quotation marks. At its core, YAML stores data as key: value pairs. Keys are usually strings and can be written with or without quotes. Values can be of several data types, such as integers, strings, lists, and booleans.
YAML uses indentation to express nesting, so correct indentation is required. Tab characters are not allowed for indentation; only spaces may be used, otherwise the file will fail to parse. In return, the format stays simple and human-readable: unlike XML or JSON, we don’t have to read past multiple symbols to follow it.
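The effect of indentation can be seen by parsing a few small strings. The sketch below assumes the PyYAML library (installed later in this article) and uses made-up snippets: one indented with spaces, one with the indentation removed, and one indented with a tab.

```python
import yaml  # provided by the PyYAML package

spaces = "configuration:\n  name: my-api-config\n"
flat = "configuration:\nname: my-api-config\n"
tabs = "configuration:\n\tname: my-api-config\n"

# Indented with spaces: "name" is nested under "configuration".
nested = yaml.safe_load(spaces)
print(nested)

# Without indentation: "name" becomes a separate top-level key.
unnested = yaml.safe_load(flat)
print(unnested)

# Indented with a tab: the parser rejects the document.
try:
    yaml.safe_load(tabs)
    tab_error = None
except yaml.YAMLError as error:
    tab_error = error
    print("Rejected:", type(error).__name__)
```

Catching yaml.YAMLError, the base class of PyYAML's parsing errors, keeps the example independent of the specific error subclass raised.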
configuration:
  name: my-api-config
  version: 1.0.0
  about: "some description"
  # This is a comment
  description: |
    Lorem ipsum dolor sit amet,
    consectetur adipiscing elit
  scripts:
    "start:dev": main.py
  keywords: ["hello", "there"]
We can also include comments in our file, as shown above, and write multi-line strings using the | (pipe) character.
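The following sketch shows how comments, value types, and a multi-line string behave once parsed. It assumes the PyYAML library (installed later in this article); the port and debug keys are hypothetical additions used only to show an integer and a boolean.

```python
import yaml  # provided by the PyYAML package

document = """
# This comment is ignored by the parser
name: my-api-config      # string
port: 8080               # integer (hypothetical key)
debug: true              # boolean (hypothetical key)
keywords: [hello, there] # list
description: |
  Lorem ipsum dolor sit amet,
  consectetur adipiscing elit
"""

data = yaml.safe_load(document)

# Print each key with the Python type its value was converted to.
for key, value in data.items():
    print(key, type(value).__name__, repr(value))
```

The | block keeps the line break between the two lines of the description, and the comments do not appear anywhere in the parsed data.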
YAML File Processing with PyYAML
In this section, we will perform some basic operations on a YAML file, such as reading, writing, and modifying data, using the PyYAML module for Python.
Installing PyYAML
pip install pyyaml
Reading a YAML file
Let’s say we have a yaml file with some configuration and we want to read the contents using Python.
Filename: config_one.yml
configuration:
  name: my-api-config
  version: 1.0.0
  about: some description
  stack:
    - python
    - django
Next, we will create a new Python file and read the .yml file.
Filename: PyYaml.py
import yaml

# Open the file in read mode and parse its contents
with open("config_one.yml", "r") as first_file:
    data = yaml.safe_load(first_file)

print(type(data))
print(data)
"""
Output:
<class 'dict'>
{'configuration': {'name': 'my-api-config', 'version': '1.0.0', 'about': 'some description', 'stack': ['python', 'django']}}
"""
Explanation:
We import the PyYAML module with import yaml. To read a YAML file, we first open the file in read mode and then load its contents using safe_load(). PyYAML offers several loaders, which differ in the kinds of objects they are allowed to construct. The general load() function is not secure when used with an unsafe loader on untrusted input, because it can construct arbitrary Python objects and thereby run malicious code. safe_load() is therefore the recommended way to read a file: it builds only standard types such as dictionaries, lists, and strings, and will not create arbitrary objects.
We print the type of the loaded data and then the data itself. The console shows the type as <class 'dict'>, and the data is a dictionary that stores key: value pairs.
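To illustrate why safe_load() is recommended, the sketch below passes it a document that uses a Python-specific tag. The document is a harmless stand-in written for this example: it asks for os.getcwd() to be called, where a real attack could name any callable.

```python
import yaml

# A document using a Python-specific tag. An unsafe loader would call
# os.getcwd() while loading it; safe_load() does not recognize the tag.
untrusted = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError as error:
    rejected = True
    print("safe_load refused the document:", type(error).__name__)
```

Because safe_load() raises an error instead of constructing the object, nothing from the untrusted document is executed.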
Modifying our YAML file
To modify the data that we have loaded, we must first identify the data type of the value we want to change. If a key currently holds a single string and we want it to hold several values, we must put those values in a list and assign that list to the key.
import yaml

with open("config_one.yml", "r") as first_file:
    data = yaml.safe_load(first_file)

print(type(data))
# Accessing our <class 'dict'> and modifying value data using a key
data["configuration"]["stack"] = ["flask", "sql"]
# Appending data to the list
data["configuration"]["stack"].append("pillow")
print(data)
"""
Output:
<class 'dict'>
{'configuration': {'name': 'my-api-config', 'version': '1.0.0', 'about': 'some description', 'stack': ['flask', 'sql', 'pillow']}}
"""
Explanation:
We have a nested dictionary here, and we access the data using the key whose value we want to modify. The append() method then adds another item to the list of values. Note that these modifications exist only in memory at runtime; the original file is unchanged. Next, we will write these values to a new YAML file.
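The case where a key holds a single string rather than a list can be handled with a small helper. The function below is a hypothetical sketch, not part of PyYAML: it converts a single value into a list before adding to it, and appends directly when the value is already a list.

```python
def add_item(config, key, item):
    """Add item to config[key], converting a single value to a list first."""
    current = config.get(key)
    if current is None:
        # Key is missing or empty: start a new list.
        config[key] = [item]
    elif isinstance(current, list):
        # Already a list: append in place.
        current.append(item)
    else:
        # A single value such as a string: wrap it in a list with the new item.
        config[key] = [current, item]
    return config


# Hypothetical data in which "stack" starts out as a single string.
settings = {"stack": "python"}
add_item(settings, "stack", "django")
add_item(settings, "stack", "flask")
print(settings)  # {'stack': ['python', 'django', 'flask']}
```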
Writing a YAML file with the modified data
The above data along with the modified values can be written in a new file with just a few lines of code.
import yaml

with open("config_one.yml", "r") as first_file:
    data = yaml.safe_load(first_file)

print(type(data))
# Accessing our <class 'dict'> and modifying value data using a key
data["configuration"]["stack"] = ["flask", "sql"]
# Appending data to the list
data["configuration"]["stack"].append("pillow")
print(data)

# Writing a new yaml file with the modifications
with open("new_config.yaml", "w") as new_file:
    yaml.dump(data, new_file)
Filename: new_config.yaml
configuration:
  about: some description
  name: my-api-config
  stack:
  - flask
  - sql
  - pillow
  version: 1.0.0
Explanation:
We open the new file in write mode and then call yaml.dump() with two parameters: the data variable, which holds the contents of the original YAML file along with the changes made to it, and the new_file file object to write to. The new file retains the data from the original file together with the changes that we applied. The keys appear in alphabetical order because yaml.dump() sorts dictionary keys by default.
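If the original key order matters, the sort_keys parameter can be set to False when dumping. The sketch below uses safe_dump(), the counterpart of safe_load(), and writes to a temporary folder so that it does not touch any real files; the data dictionary is a shortened copy of the example above.

```python
import os
import tempfile
import yaml

# A shortened copy of the modified configuration, for illustration only.
data = {
    "configuration": {
        "name": "my-api-config",
        "version": "1.0.0",
        "stack": ["flask", "sql", "pillow"],
    }
}

# Default behaviour: keys are sorted alphabetically.
default_text = yaml.safe_dump(data)

# With sort_keys=False the insertion order of the dictionary is kept.
ordered_text = yaml.safe_dump(data, sort_keys=False)
print(ordered_text)

# Write the file, then read it back to confirm nothing was lost.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "new_config.yaml")
    with open(path, "w") as new_file:
        yaml.safe_dump(data, new_file, sort_keys=False)
    with open(path, "r") as saved_file:
        reloaded = yaml.safe_load(saved_file)

print(reloaded == data)
```

Comments from the original file are not part of the loaded data, so they will not appear in a file written this way.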
Summary
In this article, we went through the fundamental structure of a YAML file and used PyYAML to read a configuration, modify it, and write it to a new file. We also compared YAML with JSON and XML by writing the same configuration in all three formats. The minimalistic syntax of YAML is simple and human-readable, which makes it one of the most popular text formats for configuration files across a wide variety of technology stacks.