Introduction

Changelog

  • 2024-08-24: Init for Ubuntu 24.04
  • 2024-09-10: Add writeup on the general structure and concepts in Ansible

Getting started

Ansible is an orchestration tool for performing remote code execution on a networked client, by relying on the remote Python interpreter. By default it works by:

  1. Opening an SSH connection to the remote
  2. Transporting execution code over the connection (i.e. agent-less)
  3. Executing said code using the remote Python
  4. Collecting the execution results as JSON output and returning them

Installation

The executable programs that Ansible sends (modules) are run as individual tasks. A usual first test is to send localhost a module that simply returns an output string: the built-in ping module does exactly this 1).

user:~$ ansible localhost -m ping
localhost | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Explanation

We can create a YAML declarative script (playbook) to repeat this command, with abbreviations and defaults explicit for clarity:

user:~$ cat mytasks.yml
- hosts: localhost
  tasks:
    - ansible.builtin.ping:
        data: pong
user:~$ ansible-playbook mytasks.yml 

This is pretty much the core idea of Ansible: running modules as tasks, and grouping those tasks into playbooks. There is also a set of best practices that is likely a good read.

Playbooks and modules

Ansible has a set of reserved keywords for use in playbooks, listed in the playbook keywords page of the Ansible documentation. Learn the common ones to avoid mistaking modules for keywords 3). After hosts and tasks, the next most common keyword is arguably name, which labels plays and tasks (and coexists with playbook comments):

mytasks.yml
# For hello-world equivalent tutorial
- name: Connectivity check with localhost
  hosts: localhost
  tasks:
    - name: Run ansible ping
      ansible.builtin.ping:
        data: pong
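
To make the keyword/module distinction concrete, here is a minimal sketch that extends the playbook above. The variable names greeting and ping_result are made up for illustration; gather_facts, vars, register and when are keywords, while ansible.builtin.ping and ansible.builtin.debug are modules.

```yaml
# Keywords sit beside the module call; module arguments sit inside it
- name: Keyword demonstration
  hosts: localhost
  gather_facts: false            # play keyword: skip fact collection
  vars:                          # play keyword: define variables
    greeting: pong               # hypothetical variable name
  tasks:
    - name: Ping with a custom reply
      ansible.builtin.ping:      # module
        data: "{{ greeting }}"   # module argument
      register: ping_result      # task keyword: keep the module's JSON result

    - name: Show the reply only when it matches
      ansible.builtin.debug:     # module
        msg: "Reply was {{ ping_result.ping }}"
      when: ping_result.ping == greeting   # task keyword: conditional
```

A quick way to tell them apart: a module normally carries a collection prefix such as ansible.builtin, whereas a keyword never does.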

Most modules run through the Python interpreter (Ansible is designed to work with Python for easier scripting). To run shell commands, use one of the command 5), shell 6), or raw modules 4), listed here in order of decreasing preference.

Note however that shell scripting brings back the need for manual error handling, i.e. it takes away the declarative nature that makes Ansible easy to use. Use the right tool for the right job.
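
As a rough illustration of that order of preference, the sketch below runs one task with command and one with shell. The commands themselves (uname, ps) are arbitrary examples, and the registered variable names are made up. The raw module is left out: it bypasses the module system entirely and is mainly useful on hosts that have no Python yet.

```yaml
- name: Shell command modules, in order of preference
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Run a binary directly, without a shell (preferred)
      ansible.builtin.command:
        cmd: uname -r
      register: kernel           # hypothetical variable name
      changed_when: false        # read-only command, never report "changed"

    - name: Use a shell only when pipes or redirection are needed
      ansible.builtin.shell:
        # Without pipefail, a failing ps would be masked by wc succeeding
        cmd: set -o pipefail && ps aux | wc -l
        executable: /bin/bash    # pipefail is a bash feature
      register: proc_count       # hypothetical variable name
      changed_when: false

    - name: Print the collected output
      ansible.builtin.debug:
        msg: "Kernel {{ kernel.stdout }}, {{ proc_count.stdout }} lines from ps"
```

The changed_when and pipefail lines are the manual error handling mentioned above: Ansible cannot know whether an arbitrary command changed anything or partly failed, so the playbook has to say so.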

Common tasks are likely to have already been implemented by the community as a module. To search for modules, consider one of these methods:

Organization concepts

Ansible defines a few more terms for organizing tasks more effectively, namely "inventories", "plays", "collections" and "roles".

Inventories collect hosts in a file and can divide them into named groups. A group name can then be given as the target in a playbook, e.g. "hosts: local", or "hosts: all" to run against every host in the inventory. The default inventory is /etc/ansible/hosts.

user:~$ cat myinventory
[local]
localhost
 
[webserver]
192.168.1.12
webserver.internal
 
user:~$ ansible-playbook -i myinventory mytasks.yml  # 'hosts:' needs to be changed to a group above, e.g. local
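
Inventories can also be written in YAML instead of the INI style above. A minimal sketch of the same two groups, assuming the file is saved as myinventory.yml (a made-up name):

```yaml
# myinventory.yml: YAML equivalent of the INI inventory above
all:
  children:
    local:                     # group name, usable as "hosts: local"
      hosts:
        localhost:
    webserver:                 # group name, usable as "hosts: webserver"
      hosts:
        192.168.1.12:
        webserver.internal:
```

It is passed the same way, with -i myinventory.yml. The trailing colons after each host are required because hosts are dictionary keys, which leaves room for per-host variables underneath.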

Playbooks can contain many independent groups of tasks, each called a play. Plays make it possible to run different tasks against different hosts, e.g. a single playbook for updating all machines can update web servers and database servers in separate plays. In short, a play ties tasks to a set of hosts.

- name: Update web servers
  hosts: webservers
  remote_user: root
  tasks:
    - name: Ensure apache is at the latest version
      ansible.builtin.yum:
        name: httpd
        state: latest

- name: Update db servers
  hosts: databases
  remote_user: root
  tasks:
    - name: Ensure postgresql is at the latest version
      ansible.builtin.yum:
        name: postgresql
        state: latest

Modules are typically not standalone, e.g. managing a webserver may involve modules for updating, starting, stopping, etc. These are typically grouped and distributed as a collection of modules. These collections reside in namespaces, with some special ones being "ansible", "community", "local".

# List available collections
user:~$ ansible-galaxy collection list
 
# Show documentation for a module
user:~$ ansible-doc community.grafana.grafana_dashboard
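
Collections that are not yet installed are commonly declared in a requirements file. A minimal sketch, assuming the conventional file name requirements.yml; the second entry and its version constraint are purely illustrative:

```yaml
# requirements.yml
collections:
  - name: community.grafana    # provides the grafana_dashboard module
  - name: community.general
    version: ">=8.0.0"         # hypothetical version constraint
```

The file is then installed with ansible-galaxy collection install -r requirements.yml, after which the collections appear in the list command above.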

Only a subset of the modules in a collection needs to be run in order to, say, set up an NTP service. This is where Ansible roles 7) come in: a role is essentially a self-contained group of tasks, which additionally enforces a directory structure for storing variables and template files for reusability.
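
A rough sketch of how a role is laid out and used. The role name ntp and the variable ntp_server are made up for illustration; the subdirectory names are the ones Ansible expects inside any role.

```yaml
# Conventional layout of a role:
#   roles/ntp/tasks/main.yml      tasks run by the role
#   roles/ntp/defaults/main.yml   default values for role variables
#   roles/ntp/templates/          Jinja2 template files
#   roles/ntp/handlers/main.yml   handlers, e.g. restarting the service
- name: Set up NTP on all hosts
  hosts: all
  become: true                   # run with elevated privileges
  roles:
    - role: ntp                  # hypothetical role name
      vars:
        ntp_server: pool.ntp.org # hypothetical variable consumed by the role
```

The play itself stays short: it only names the role and overrides variables, while the tasks live in the role directory and can be reused across playbooks.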

Summary of concepts

  • Playbooks contain a set of plays that run tasks using modules.
  • Playbooks reference hosts collected in an inventory.
  • Groups of tasks are consolidated into roles.
  • Roles and modules are distributed in a collection, under a predefined namespace.

Other resources

Consider Geerling's book: Ansible for DevOps, which seems to be popular in this circle. Also a small commentary on why Ansible, reproduced below:

As someone who has used all of them (yes all of them, puppet, chef, ansible, salt, cfengine). They each have pros and cons. The best one is the one that fits your organization business requirements.

* If you are a windows shop and you can pay for enterprise, then chef is the most mature.
* If you are a linux shop with money, then go for RedHat Ansible Automation Platform.
* If you are a home user or don't have money, then open source Ansible is the easiest to setup.

However the industry is moving to immutable infrastructure where you don't even need to worry about VMs at all. In that case Terraform, Pulumi, Helm are the go-to tools.

- u/dev_all_the_ops (Dec 2023)

Lastly, a quick example of using Ansible for VM deployment on Proxmox.

1)
Note some tutorials specify ansible -m ping localhost, but the convention is to have the host pattern be the first argument
2)
This is partly why I took so long to pick up the motivation to learn Ansible: many tutorials jump straight into playbooks and roles without actually dissecting where this module comes from, or why the return value is a pong. It took me trial and error to even figure out that -a was the required parameter, and its "key=value" format. The main confusion behind plain "ansible" is arguably the loose syntax rules, e.g. ansible localhost -m "ping data=pong" is also valid, without the use of the module arguments parameter.
3)
Note that special variables that are non-user-assignable also exist
4)
belonging to the ansible.builtin collection
5)
similar to Python subprocess.run()
6)
similar to Python subprocess.run(shell=True)
7)
some sources online hint at roles being deprecated, but this is really more of standalone roles being deprecated in favor of packaging them as part of collections - not the same thing!