Ansible IV: Managing variables and facts

#it automation #configuration as code

In the last post we created three machines using Vagrant and VirtualBox. Those machines correspond to the database, backend and frontend in our system. In this fourth entry we will configure the database, taking a look at how variables and facts work.

Note: we are working both in PowerShell and in the WSL. Every time you encounter a “bash” code box and you are on Windows, you should run it inside the WSL. Check the first post of the series to install the WSL.

Into variables and facts

We already used variables at the top of our playbooks to define the ansible_user and the ansible_private_key_file:

- name: install database
  hosts: database

  vars:
    ansible_user: vagrant
    ansible_private_key_file: .vagrant/machines/database/virtualbox/private_key

Quoting the documentation: “Ansible uses variables to help deal with differences between systems”.

For example, two configurations could need different versions of the same software, so you store the version in a variable and use that variable to build the proper tasks. Or a configuration may have to run on both CentOS and Debian, so it needs to know what each dependency is called on the two OSes.
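As a minimal sketch of that second case, a playbook could map each OS family to the package name it uses. The variable name web_package_by_family is hypothetical and the Apache packages are only an illustration; ansible_os_family is one of the facts that Ansible gathers by itself (facts are covered later in this post).

```yaml
---
- name: install a package whose name differs between distributions
  hosts: all
  become: yes

  vars:
    # Hypothetical variable: package name keyed by OS family.
    # CentOS reports the "RedHat" family; Debian and Ubuntu report "Debian".
    web_package_by_family:
      Debian: apache2
      RedHat: httpd

  tasks:
    - name: install the web server
      package:
        name: '{{ web_package_by_family[ansible_os_family] }}'
        state: present
```

The task stays the same for both systems; only the lookup in the variable changes.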

There are many places to define variables, where many means 22. Yeah, it’s insane. Bookmark this reference, as it will help you decide where to put a variable. The three main places where I define variables are the group_vars folder, the command line and the inventory file.

group_vars directory

The group_vars directory can be placed at two locations: next to the playbook or next to the inventory:

ansible-directory/
├── group_vars # Playbook group_vars
├── inventories
│   ├── group_vars # Inventory group vars
│   └── inventory.ini
└── playbook.yml

I usually create the group_vars folder next to the playbook, but when you have several inventories, each with different variable values, it is a good idea to keep group_vars at the inventory level.

Inside the group_vars folder you can create one file of variables for each group in the inventory. There is always a group called all, which contains all the hosts, so the variables defined in group_vars/all.yml are added to every host. Do you have a group called eu for the servers in Europe? If you want to give their variables specific values, for example a mirror for downloading packages, create a group_vars/eu.yml and those hosts will load its variables.
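A small sketch of how those two files could look. The variable names timezone and apt_mirror, and the mirror URL, are assumptions made up for illustration; the two files are shown as two YAML documents in a single block.

```yaml
# group_vars/all.yml (loaded for every host in the inventory)
---
timezone: UTC  # hypothetical variable

# group_vars/eu.yml (loaded only for hosts in the "eu" group)
---
apt_mirror: http://de.archive.ubuntu.com/ubuntu  # hypothetical variable
```

A host in the eu group would see both timezone and apt_mirror, while a host outside it would only see timezone.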

Be careful with precedence: when both files define the same variable, the value in group_vars/eu.yml wins over the one in group_vars/all.yml, because all is the least specific group (see the documentation).

--extra-vars|-e command option

The variables passed this way override any other variable defined anywhere else. I use them sparingly and tend to prefer the other options. Honestly, I do use them to pass connection options like ansible_user and ansible_password, but I would not recommend it; there are better options, like Ansible Vault.
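Extra vars can also be loaded from a file by prefixing the file name with @. A sketch, assuming a hypothetical connection.yml that holds the same connection variables we defined at the top of our playbooks:

```yaml
# connection.yml (hypothetical file name)
# Loaded with: ansible-playbook -i inventory database.yml -e "@connection.yml"
---
ansible_user: vagrant
ansible_private_key_file: .vagrant/machines/database/virtualbox/private_key
```

Values loaded this way behave like any other extra var, so they also override whatever the playbook or group_vars define.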

Now back to our files.

Where we left last time…

The last post finished with our playbooks and files this way:

.
├── Vagrantfile 
├── backend.yml 
├── database.yml
├── frontend.yml
├── inventory
└── main.yml

We will be working only with the database.yml this time. So let’s start by recreating the environment (which you probably destroyed):

vagrant up database

Then we should clean our ~/.ssh/known_hosts, because the recreated machine reuses the same IP with a new host key and SSH will complain about it (WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!):

ssh-keygen -f ~/.ssh/known_hosts -R 10.0.0.30

Next, we fetch the new host keys and append them to our ~/.ssh/known_hosts to avoid the host key confirmation prompt:

$ ssh-keyscan -H 10.0.0.30 >> ~/.ssh/known_hosts
# 10.0.0.30:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
# 10.0.0.30:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
# 10.0.0.30:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3

Note: sometimes you have to run this command twice, do not ask me why.

The last step fixes the error about the private keys having permissions that are too open:

chmod 400 .vagrant/machines/*/virtualbox/private_key

Now we are ready to launch our playbooks! But only the database one this time:

$ ansible-playbook -i inventory database.yml
PLAY [configure database]

TASK [Gathering Facts]
ok: [10.0.0.30]

TASK [Hello!]
ok: [10.0.0.30] => {
    "msg": "Hello from database!"
}

PLAY RECAP
10.0.0.30                  : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Installing our database

Getting tasks from MongoDB documentation and translating to Ansible

Now that we have our system running, we can start making changes to our playbook. I decided to go with MongoDB, because yes. The first thing I normally do is explore the installation process, which in our case is described in the MongoDB documentation. This is the list of commands we should run to install MongoDB:

$ sudo apt-get install gnupg
$ wget -qO - https://www.mongodb.org/static/pgp/server-4.2.asc | sudo apt-key add -
$ echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
$ sudo apt-get update
$ sudo apt-get install -y mongodb-org

Now we have to “translate” those commands into Ansible tasks. The Ansible Module Index lists all the modules, so you can search them quickly in your browser. Until you know them, it is probably easier to use your search engine with “Ansible” plus the command you want to run, for example “Ansible apt-get”, and it will point you to the module that does that operation. The first command, apt-get, corresponds to the Ansible apt module.

Once we have done that for all the commands, the correspondence between commands and modules is the following:

install gnupg - apt module
add mongodb asc key - apt_key module
add deb repository to apt sources - apt_repository module
update package lists - apt module
install mongodb-org - apt module

And then, those converted to tasks inside a playbook will look like this:

---
- name: install MongoDB
  hosts: database

  vars:
    ansible_user: vagrant
    ansible_private_key_file: .vagrant/machines/database/virtualbox/private_key

  tasks:
    - name: install gnupg
      become: yes
      apt:
        name: gnupg
        update_cache: yes

    - name: add apt key
      become: yes
      apt_key:
        url: https://www.mongodb.org/static/pgp/server-4.2.asc

    - name: add apt repository
      become: yes
      apt_repository:
        repo: deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse
        filename: mongodb-org-4.2.list
    
    - name: install
      become: yes
      apt:
        name: mongodb-org

Starting and configuring the database

Those steps do not start the database process, so we should start it ourselves. In the MongoDB documentation they use the systemctl command, which corresponds to the service module. Add the following task at the bottom of the playbook:

- name: start service
  become: yes
  service:
    name: mongod
    state: started

Run the playbook again and check that the database is working. To do so, you can check whether the MongoDB port is open with Netcat via the nc command. It does not prove that the database is running OK, but at least it shows that something is listening on that port:

nc -v -z 10.0.0.30 27017

Sorry, this should fail with ... failed: Connection refused. Do not worry, we still have to configure the database to accept incoming requests from origins other than localhost. In the meantime, you can enter the virtual machine and run the same command against localhost (which should succeed):

$ vagrant ssh database
$ nc -v -z localhost 27017
Connection to localhost 27017 port [tcp/*] succeeded!
$ exit
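The same local check can be sketched as an Ansible task with the wait_for module, which runs on the database host itself and so tests the port from inside the virtual machine. This task is not part of the original playbook, and the 10-second timeout is an arbitrary choice.

```yaml
- name: check that MongoDB is listening on localhost
  wait_for:
    host: 127.0.0.1
    port: 27017
    timeout: 10  # seconds to wait before failing; arbitrary value
```

Like nc, this only shows that the port accepts connections, not that the database is healthy.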

By default MongoDB binds to 127.0.0.1, the IP for localhost, so it only accepts requests from localhost. Since our first test came from outside the virtual machine, it failed. This behaviour can be configured in the /etc/mongod.conf configuration file. It should contain the following YAML block:

# [...other content...]

net:
  port: 27017
  bindIp: 127.0.0.1

# [...other content...]

We will replace that 127.0.0.1 with Ansible using the replace module. Add the following task before the start service task:

# [...previous tasks...]

- name: configure bindIp
  become: yes
  replace:
    path: /etc/mongod.conf
    regexp: '127\.0\.0\.1'
    replace: 0.0.0.0

This replaces the string 127.0.0.1, the IP for localhost, with 0.0.0.0, which tells the service to listen on all network interfaces, so it will accept connections from any origin.

If you run the playbook again and check the port from outside the virtual machine, it will fail again: the mongod service must be restarted to pick up the new configuration. For now, do it from inside the virtual machine:

$ vagrant ssh database
$ sudo systemctl restart mongod
$ exit
$ nc -z -v 10.0.0.30 27017
Connection to 10.0.0.30 27017 port [tcp/*] succeeded!
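One way to avoid the manual restart, which this post does not use, is an Ansible handler: the configuration task notifies it, and it runs at the end of the play only if that task actually changed the file. A minimal sketch, where the handler name restart mongod is my own choice:

```yaml
  tasks:
    # [...previous tasks...]

    - name: configure bindIp
      become: yes
      replace:
        path: /etc/mongod.conf
        regexp: '127\.0\.0\.1'
        replace: 0.0.0.0
      notify: restart mongod  # fires only when the file was modified

  handlers:
    - name: restart mongod
      become: yes
      service:
        name: mongod
        state: restarted
```

If the file was already modified by an earlier run, the task reports ok instead of changed and the handler does not fire, so in that case the manual restart is still needed once.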

Great! You now have a fully functional database. Now let’s extract variables from this playbook to a group_vars/all.yml file (at playbook level).

Extracting variables

The first thing that comes to my mind is the MongoDB version, which will be easy to extract. Create a group_vars/all.yml file at playbook level and add the mongodb_version variable to it:

# group_vars/all.yml
---
mongodb_version: 4.2

With that, we can go to our playbook and replace the 4.2 with that variable. Here I only list the tasks that change, not the full playbook:

---
# [...previous content...]
    - name: add apt key
      become: yes
      apt_key:
        url: 'https://www.mongodb.org/static/pgp/server-{{ mongodb_version }}.asc'

    - name: add apt repository
      become: yes
      apt_repository:
        repo: 'deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/{{ mongodb_version }} multiverse'
        filename: 'mongodb-org-{{ mongodb_version }}.list'

# [...following content...]

Use '{{ mongodb_version }}' to insert the value inside the strings. That syntax comes from the Jinja2 templating engine, which Ansible uses under the hood. Note that when we refer to a variable inside a string, as in our example, you should quote the full string (we added single quotes ' to all the strings that use the variable).
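The reason for the quotes is YAML itself, not Jinja2. A short sketch of the three cases; the keys version and filename are only illustrative:

```yaml
# Breaks: a value that starts with "{{" is read by YAML as an inline mapping.
# version: {{ mongodb_version }}

# Works: quoting makes it a plain string that Ansible templates later.
version: '{{ mongodb_version }}'

# Also parses without quotes, because the value does not start with "{{".
filename: mongodb-org-{{ mongodb_version }}.list
```

Quoting every templated string, as we did, is simply the habit that never surprises you.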

A fun part of the Ansible variables is that you can use a variable inside another variable. So we could extract even more variables in our example:

# group_vars/all.yml
---
mongodb_version: 4.2
mongodb_apt_key_url: 'https://www.mongodb.org/static/pgp/server-{{ mongodb_version }}.asc'
mongodb_apt_repository_repo: 'deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/{{ mongodb_version }} multiverse'
mongodb_apt_repository_filename: 'mongodb-org-{{ mongodb_version }}.list'

And the playbook will look like this:

---
# [...previous content...]
    - name: add apt key
      become: yes
      apt_key:
        url: '{{ mongodb_apt_key_url }}'

    - name: add apt repository
      become: yes
      apt_repository:
        repo: '{{ mongodb_apt_repository_repo }}'
        filename: '{{ mongodb_apt_repository_filename }}'

# [...following content...]

Maybe this example is not that great, and maybe we are moving too much into variables. However, you are the one to judge that, so be careful about extracting too much (or too little) into variables.

Now let’s move into Facts!

Facts

Ansible calls “facts” the pieces of information it discovers about a system. There are multiple modules that gather those facts and populate some variables under the hood. For example, ec2_metadata_facts, which “gathers facts (instance metadata) about remote hosts within ec2”, or listen_ports_facts, which “gather facts on processes listening on TCP and UDP ports”. If you search the list of all modules you will find a lot of them.

Here I want to talk about the setup module and the gather_facts playbook “entry”. By default, Ansible gathers certain information about the hosts it is going to configure before running the tasks (this is the Gathering Facts task in the output above). We can make this explicit, even though it is enabled by default, or force Ansible to skip it. This is controlled at the playbook level by the gather_facts entry:

# random-playbook.yml
---
- name: my random playbook
  hosts: all

  gather_facts: no  # yes by default
  
  vars:
    ...

  tasks:
    ...
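When automatic gathering is disabled, the setup module can still be called as a regular task to collect only what you need. A sketch, assuming we only care about the distribution facts; the filter pattern and the debug task are there just for illustration:

```yaml
---
- name: gather only the facts we need
  hosts: database
  gather_facts: no  # skip the automatic "Gathering Facts" task

  tasks:
    - name: collect distribution facts only
      setup:
        filter: 'ansible_distribution*'

    - name: show one of them
      debug:
        var: ansible_distribution_release
```
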

But what information does Ansible gather? We can check it by running the setup module from the command line. For our case:

ansible database -i inventory -m setup -e "ansible_user=vagrant ansible_private_key_file=.vagrant/machines/database/virtualbox/private_key" > setup_information.json

After that, I have a 766-line file full of variables to explore. I selected a few that are interesting:


{
    "ansible_architecture": "x86_64",
    "ansible_distribution": "Ubuntu",
    "ansible_distribution_release": "bionic",
    "ansible_distribution_version": "18.04",
    "ansible_lsb": {
        "codename": "bionic",
        "description": "Ubuntu 18.04.4 LTS",
        "id": "Ubuntu",
        "major_release": "18",
        "release": "18.04"
    },
    "ansible_os_family": "Debian",
    "ansible_processor_cores": 2,
    "ansible_processor_count": 1,
    "ansible_processor_threads_per_core": 1,
    "ansible_processor_vcpus": 2,
    "ansible_python": {
        "executable": "/usr/bin/python3",
        "has_sslcontext": true,
        "type": "cpython",
        "version": {
            "major": 3,
            "micro": 9,
            "minor": 6,
            "releaselevel": "final",
            "serial": 0
        },
        "version_info": [
            3,
            6,
            9,
            "final",
            0
        ]
    }
}

And many, many more. I really like the ones that give information about the processor architecture and the distribution; I use them a lot. We can use them in our example too:

# group_vars/all.yml
---
mongodb_version: 4.2
mongodb_apt_key_url: 'https://www.mongodb.org/static/pgp/server-{{ mongodb_version }}.asc'
mongodb_apt_repository_repo: 'deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu {{ ansible_distribution_release }}/mongodb-org/{{ mongodb_version }} multiverse'
mongodb_apt_repository_filename: 'mongodb-org-{{ mongodb_version }}.list'

Here we used ansible_distribution_release for the mongodb_apt_repository_repo.

Facts are really, really helpful. In this case, if you plan to run the playbooks on different releases of Ubuntu, you do not have to worry about changing the repo URL every time.
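The architecture could be handled in a similar way. The sketch below is not part of the original playbook: the variables mongodb_apt_arch_by_machine and mongodb_apt_arch are hypothetical helpers that translate the ansible_architecture fact into the name apt uses, and it assumes MongoDB publishes packages for that architecture and version.

```yaml
# group_vars/all.yml
---
mongodb_version: 4.2

# Hypothetical helper: map the machine architecture reported by the
# ansible_architecture fact to the architecture name used by apt.
mongodb_apt_arch_by_machine:
  x86_64: amd64
  aarch64: arm64
mongodb_apt_arch: '{{ mongodb_apt_arch_by_machine[ansible_architecture] }}'

mongodb_apt_repository_repo: 'deb [ arch={{ mongodb_apt_arch }} ] https://repo.mongodb.org/apt/ubuntu {{ ansible_distribution_release }}/mongodb-org/{{ mongodb_version }} multiverse'
```

Because Ansible resolves variables only when they are used, the facts are already available by the time the repository task reads mongodb_apt_repository_repo.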

Closing

This post took longer than I expected and it ended up longer than I expected too. I procrastinated a lot while writing it, so it took months to complete. I hope it will be useful. I tried to balance direct information with chunks of explanation, so that it is useful both for those who have a little experience with Ansible and for those who have none. All (actually “most”) feedback is welcome! Leave a comment if you have any trouble with the content!
