Ansible IV: Managing variables and facts
In the last post we created three machines using Vagrant and VirtualBox. Those machines correspond to the database, backend and frontend in our system. In this fourth entry we will configure the database, taking a look at how variables and facts work.
Note: we are working both in PowerShell and in the WSL. Every time you encounter a bash code box and you are on Windows, you should run it inside the WSL. Check the first post of the series to install the WSL.
Into variables and facts
We already used variables at the top of our playbooks to define the ansible_user and the ansible_private_key_file:
- name: install database
  hosts: database
  vars:
    ansible_user: vagrant
    ansible_private_key_file: .vagrant/machines/database/virtualbox/private_key
Extracted from the documentation… “Ansible uses variables to help deal with differences between systems”.
For example, two hosts could need different versions of a package, so you reflect that in a variable and use the variable to build the proper tasks. Or a configuration that should run on both CentOS and Debian will need to know how its dependencies are named in each OS.
There are many ways to use variables, where "many" means 22. Yeah, it's insane. Bookmark this reference, as it will help you decide where to put a variable. The main places where I use variables are the group_vars folder, the command line, and the inventory file.
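Since the inventory file option does not get its own section below, here is a quick sketch of what group variables look like directly inside an INI inventory (only illustrative, this is not our actual inventory file):

# inventory.ini (illustrative sketch)
[database]
10.0.0.30

# variables applied to every host in the database group
[database:vars]
ansible_user=vagrant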
group_vars directory
The group_vars directory can be placed at two locations: next to the playbook or next to the inventory:
ansible-directory/
├── group_vars # Playbook group_vars
├── inventories
│ ├── group_vars # Inventory group vars
│ └── inventory.ini
└── playbook.yml
I usually create the group_vars folder next to the playbook, but when you have multiple inventories that need different variables, it is a good idea to use the group_vars directory at the inventory level.
Inside the group_vars folder you can create a file with variables for each group of the inventory. There is always a group called all, which contains every host, so the variables defined in group_vars/all.yml will be applied to all the hosts. You have a group for the servers in Europe called eu? If you want to give them specific values, for example a mirror for downloading packages, create a group_vars/eu.yml and they will load those variables.
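For example, the two files could look like this (the package_mirror variable and its values are made up, just to illustrate the idea):

# group_vars/all.yml
---
package_mirror: http://archive.ubuntu.com/ubuntu

# group_vars/eu.yml
---
package_mirror: http://de.archive.ubuntu.com/ubuntu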
Be careful with precedence: the variables inside group_vars/eu.yml override the ones in group_vars/all.yml, since more specific groups win over the all group (check the documentation on variable precedence).
--extra-vars|-e command option
The variables passed through here will override any other variable specified anywhere. I use them sparingly; I tend to prefer the other options. Honestly, I do use them to pass connection options like ansible_user and ansible_password, but I would not recommend it, since there are better options like Ansible Vault.
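For reference, this is roughly what it looks like on the command line (both forms are equivalent; anything passed here beats every other definition):

$ ansible-playbook -i inventory database.yml -e "ansible_user=vagrant"
$ ansible-playbook -i inventory database.yml --extra-vars "ansible_user=vagrant"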
Now back to our files.
Where we left off last time…
The last post finished with our playbooks and files this way:
.
├── Vagrantfile
├── backend.yml
├── database.yml
├── frontend.yml
├── inventory
└── main.yml
We will be working only with database.yml this time. So let's start by recreating the environment (which you probably destroyed):
vagrant up database
Then we should clean our ~/.ssh/known_hosts, as SSH will complain that the IPs now belong to different hosts (WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!):
ssh-keygen -f ~/.ssh/known_hosts -R 10.0.0.30
The next step generates the host keys and puts them in our ~/.ssh/known_hosts to avoid the host key confirmation prompt:
$ ssh-keyscan -H 10.0.0.30 > ~/.ssh/known_hosts
# 10.0.0.30:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
# 10.0.0.30:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
# 10.0.0.30:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
Note: sometimes you have to run this command two times, do not ask me why.
And the last but not least step, which corrects the error about the private keys having permissions that are too open:
chmod 400 .vagrant/machines/*/virtualbox/private_key
Now we are ready to launch our playbooks! But only the database one this time:
$ ansible-playbook -i inventory database.yml
PLAY [configure database]
TASK [Gathering Facts]
ok: [10.0.0.30]
TASK [Hello!]
ok: [10.0.0.30] => {
"msg": "Hello from database!"
}
PLAY RECAP
10.0.0.30 : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Installing our database
Getting tasks from MongoDB documentation and translating to Ansible
So now that we have our system running, we can start making changes to our playbook. I decided to go with MongoDB, because yes. The first thing I normally do is explore the installation process; in our case it is described in the MongoDB documentation. This is the list of commands we should run to install MongoDB:
$ sudo apt-get install gnupg
$ wget -qO - https://www.mongodb.org/static/pgp/server-4.2.asc | sudo apt-key add -
$ echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
$ sudo apt-get update
$ sudo apt-get install -y mongodb-org
Now we have to "translate" those commands to Ansible tasks. The Ansible Module Index contains all the modules, so you can search them quickly in your browser. Until you know them, it is probably easier to use your search engine with "Ansible" plus the command you want to run, for example "Ansible apt-get", and it will point you to the module that does that operation. The first one, apt-get, corresponds to the Ansible apt module.
Once we have done that process of all the commands, the correspondence between commands and modules is the following:
- install gnupg: apt module
- add the MongoDB asc key: apt_key module
- add the deb repository to the apt sources: apt_repository module
- update the package lists: apt module
- install mongodb-org: apt module
And then, those converted to tasks inside a playbook will look like this:
---
- name: install MongoDB
  hosts: database
  vars:
    ansible_user: vagrant
    ansible_private_key_file: .vagrant/machines/database/virtualbox/private_key
  tasks:
    - name: install gnupg
      become: yes
      apt:
        name: gnupg
        update_cache: yes
    - name: add apt key
      become: yes
      apt_key:
        url: https://www.mongodb.org/static/pgp/server-4.2.asc
    - name: add apt repository
      become: yes
      apt_repository:
        repo: deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse
        filename: mongodb-org-4.2.list
    - name: install
      become: yes
      apt:
        name: mongodb-org
Starting and configuring the database
Those steps do not start the database process, so we have to do it ourselves. In the MongoDB documentation they use the systemctl command, which corresponds to the service module. Add the following task at the bottom of the playbook:
    - name: start service
      become: yes # managing a systemd service needs root
      service:
        name: mongod
        state: started
Run the playbook again and check that the database is working. To do so, you can check whether the MongoDB port is open with Netcat via the nc command. It is not proof that the database is running fine, but at least we know the port is listening for requests:
Sorry, this should fail with ... failed: Connection refused. Do not worry: we still have to configure the database to accept incoming requests from origins other than localhost. However, you can enter the virtual machine and run the same command again against localhost (which should succeed):
$ vagrant ssh database
$ nc -v -z localhost 27017
Connection to localhost 27017 port [tcp/*] succeeded!
$ exit
By default MongoDB binds to 127.0.0.1, the IP for localhost, so it only accepts requests from localhost. Since our first test came from outside the virtual machine, it failed. This behaviour can be configured in the /etc/mongod.conf configuration file, where you should find the following YAML block:
# [...other content...]
net:
  port: 27017
  bindIp: 127.0.0.1
# [...other content...]
We will replace that 127.0.0.1 with Ansible using the replace module. Add the following task before the start service task:
# [...previous tasks...]
    - name: configure bindIp
      become: yes
      replace:
        path: /etc/mongod.conf
        regexp: 127.0.0.1
        replace: 0.0.0.0
This will replace the string 127.0.0.1, the IP for localhost, with 0.0.0.0, which stands for all interfaces, so the service will accept connections from any origin.
If you run the playbook again and check whether the port is open from outside the virtual machine, it will fail again: the mongod service has to be restarted to pick up the new configuration. For now, do it from inside the virtual machine:
$ vagrant ssh database
$ sudo systemctl restart mongod
$ exit
$ nc -z -v 10.0.0.30 27017
Connection to 10.0.0.30 27017 port [tcp/*] succeeded!
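If you would rather let Ansible take care of this, a minimal sketch is a task that forces the restart with the same service module (a handler notified by the configure bindIp task would be the more elegant solution, but that is out of scope for this post):

    - name: restart service
      become: yes
      service:
        name: mongod
        state: restarted

Keep in mind that state: restarted restarts MongoDB on every run of the playbook, which is exactly why handlers are usually preferred.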
Great! You now have a fully functional database. Now let's extract variables from this playbook into a group_vars/all.yml file (at playbook level).
Extracting variables
The first thing that comes to my mind is the MongoDB version, which will be easy to extract. Create a group_vars/all.yml file at playbook level and add the mongodb_version variable there:
# group_vars/all.yml
---
mongodb_version: 4.2
With that, we can go to our playbook and replace the 4.2 with that variable. Here I only list the tasks that change, not the full playbook:
---
# [...previous content...]
    - name: add apt key
      become: yes
      apt_key:
        url: 'https://www.mongodb.org/static/pgp/server-{{ mongodb_version }}.asc'
    - name: add apt repository
      become: yes
      apt_repository:
        repo: 'deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/{{ mongodb_version }} multiverse'
        filename: 'mongodb-org-{{ mongodb_version }}.list'
# [...following content...]
You should use '{{ mongodb_version }}' to inject the value inside the strings. That syntax comes from the Jinja2 templating engine, which Ansible uses under the hood. Note that whenever a value starts with a variable reference, YAML requires you to quote the whole string; quoting every string that contains a variable, like we did here with single quotes ', is a safe habit.
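To make the quoting rule concrete, here is a tiny hypothetical snippet (the name key is just a placeholder field):

# Ansible rejects this: a value that starts with {{ must be quoted
#   name: {{ mongodb_version }}
# Quoting the whole string works
name: '{{ mongodb_version }}'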
A fun part of Ansible variables is that you can use a variable inside another variable. So we could extract even more variables in our example:
# group_vars/all.yml
---
mongodb_version: 4.2
mongodb_apt_key_url: 'https://www.mongodb.org/static/pgp/server-{{ mongodb_version }}.asc'
mongodb_apt_repository_repo: 'deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/{{ mongodb_version }} multiverse'
mongodb_apt_repository_filename: 'mongodb-org-{{ mongodb_version }}.list'
And the playbook will look like this:
---
# [...previous content...]
    - name: add apt key
      become: yes
      apt_key:
        url: '{{ mongodb_apt_key_url }}'
    - name: add apt repository
      become: yes
      apt_repository:
        repo: '{{ mongodb_apt_repository_repo }}'
        filename: '{{ mongodb_apt_repository_filename }}'
# [...following content...]
Maybe this example is not that great; maybe we are moving too much into variables. However, you are the one who has to judge that, so be careful about extracting too much (or too little) into variables.
Now let's move on to facts!
Facts
Ansible calls facts the pieces of information it gathers about a system. There are multiple modules that gather those facts and populate variables under the hood. For example ec2_metadata_facts, which "gathers facts (instance metadata) about remote hosts within ec2", or listen_ports_facts, which "gather facts on processes listening on TCP and UDP ports". If you search the list of all modules you will find a lot of them.
Here I want to talk about the setup module and the gather_facts playbook "entry". By default Ansible gathers certain information about the hosts it is going to configure. We can make this explicit (even though it is enabled by default) or force it to stop doing it. This is controlled at the playbook level by the gather_facts entry:
# random-playbook.yml
---
- name: my random playbook
  hosts: all
  gather_facts: no # yes by default
  vars:
    ...
  tasks:
    ...
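As a small aside (we do not need it in this series), if you disable fact gathering but a later task turns out to need a fact, you can gather the facts on demand by running the setup module as a regular task:

    - name: gather facts on demand
      setup: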
But what information does Ansible gather? We can check it using the setup module from the command line. For our case:
ansible database -i inventory -m setup -e "ansible_user=vagrant ansible_private_key_file=.vagrant/machines/database/virtualbox/private_key" > setup_information.json
After that, I have a 766-line file full of variables to explore. I selected a few interesting ones:
{
    "ansible_architecture": "x86_64",
    "ansible_distribution": "Ubuntu",
    "ansible_distribution_release": "bionic",
    "ansible_distribution_version": "18.04",
    "ansible_lsb": {
        "codename": "bionic",
        "description": "Ubuntu 18.04.4 LTS",
        "id": "Ubuntu",
        "major_release": "18",
        "release": "18.04"
    },
    "ansible_os_family": "Debian",
    "ansible_processor_cores": 2,
    "ansible_processor_count": 1,
    "ansible_processor_threads_per_core": 1,
    "ansible_processor_vcpus": 2,
    "ansible_python": {
        "executable": "/usr/bin/python3",
        "has_sslcontext": true,
        "type": "cpython",
        "version": {
            "major": 3,
            "micro": 9,
            "minor": 6,
            "releaselevel": "final",
            "serial": 0
        },
        "version_info": [
            3,
            6,
            9,
            "final",
            0
        ]
    }
}
And many many more. I really like the ones that give information about processor architecture and the distribution. I use them a lot. We can use them in our example too:
# group_vars/all.yml
---
mongodb_version: 4.2
mongodb_apt_key_url: 'https://www.mongodb.org/static/pgp/server-{{ mongodb_version }}.asc'
mongodb_apt_repository_repo: 'deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu {{ ansible_distribution_release }}/mongodb-org/{{ mongodb_version }} multiverse'
mongodb_apt_repository_filename: 'mongodb-org-{{ mongodb_version }}.list'
Here we used ansible_distribution_release for the mongodb_apt_repository_repo.
Facts are really helpful: in this case, if you plan to run the playbook on different releases of Ubuntu, you do not have to worry about changing the repository URL every time.
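Facts can also drive conditionals. As a hypothetical illustration, not a task we need here, you could skip the repository setup entirely on hosts that are not Debian-based:

    - name: add apt repository
      become: yes
      apt_repository:
        repo: '{{ mongodb_apt_repository_repo }}'
        filename: '{{ mongodb_apt_repository_filename }}'
      when: ansible_os_family == 'Debian'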
Closing
This post took longer than I expected, and it turned out longer than I expected too. I procrastinated a lot while writing it, so it took months to complete. I hope it will be useful. I tried to balance direct information with explanatory chunks so that it works both for those who already have a little experience with Ansible and for those who do not. All (actually "most") feedback is welcome! Leave a comment if you have any trouble with the content!
Continue reading
- Ansible guide to variables
- Ansible special variables
- Jinja2 Template Designer Documentation
- James Kiarie's guide to Ansible variables and facts
Was it useful? Have you done something similar? Have feedback?