Why We Chose Ansible for Infrastructure as Code

We’ve always believed in building infrastructure like we write software: version-controlled, repeatable, and disposable when needed.

For two years, we had credits on AWS, and during that time, we spun up servers by hand and ran a self-hosted stack:

  • Nomad + Consul + Vault for orchestration and secrets
  • NGINX for reverse proxying
  • MongoDB, PostgreSQL, and Redis for various services
  • GitLab for source control and CI/CD
  • Ghost for blogging

But we weren’t tied to AWS services like EKS, RDS, or ElastiCache.

Everything ran on plain EC2 machines, because deep down we knew this day would come — the day our credits would vanish.

The Migration: AWS to GCP

When our credits ran out, we reached out to GCP and got accepted into their credit program.

We now had new ground to build on, and it was time to shift our infra from AWS to GCP.

Some things had to move manually (like GitLab, because it’s picky and stateful), but for everything else — Nomad, Consul, NGINX, and our databases — we decided to finally codify the process.

Why Ansible?

We didn’t want to learn a whole new infra language or deal with state files and cloud-specific syntax. We wanted something that just works — fast. Ansible checked every box:

  • No agents, no daemons, no drama: SSH is all you need.
  • YAML all the way: Human-readable, version-controllable playbooks that even our backend devs could tweak without fear.
  • Roles make sense: Modular, reusable, and easy to share across projects.
  • Massive community: Ansible Galaxy has a role for nearly everything you’d Google.
  • You stay in control: No “magic” — every task runs exactly how and when you want. Perfect for small teams that need transparency.

For us, Terraform felt a little heavyweight when we just wanted to spin up VMs and get our apps running. Ansible let us move faster without sacrificing control.

Getting Started with Ansible

Here’s how we structured our Ansible project:

ansible
├─ README.md
├─ ansible.cfg
├─ hosts.ini
├─ install_ansible.sh
├─ requirements.yml
├─ roles/
│  └─ nomad-client-and-server/

Configuring the Rules – ansible.cfg

Our config file makes sure Ansible runs the way we like:

[defaults]
stdout_callback = yaml
stderr_callback = yaml
inventory = hosts.ini
host_key_checking = False
timeout = 30
forks = 10
pipelining = True

[ssh_connection]
pipelining = True

Key settings explained:

  • stdout_callback = yaml: Cleaner logs.
  • inventory = hosts.ini: Where Ansible should look for our server IPs.
  • pipelining = True: Speeds things up by executing modules over SSH without writing temporary files to the remote host first.
  • host_key_checking = False: Handy for automation where hosts get rebuilt and their keys change, at the cost of skipping host-key verification.

Defining Our Infrastructure - hosts.ini

This file tells Ansible which servers to target and how:

[nomadclientandservers]
master-new ansible_port=22 ansible_host=222.222.222.222 server_name=master-new internal_ip=111.111.111.111

[nomadclients]
nats03 ansible_port=22 ansible_host=333.333.333.333 server_name=nats03 internal_ip=222.222.222.222

[nomadclientandservers:vars]
ansible_user=root
ansible_private_key_file=/home/lovestaco/.ssh/abc.txt
ansible_python_interpreter=/usr/bin/python3

[nomadclients:vars]
ansible_user=root
ansible_private_key_file=/home/lovestaco/.ssh/abc.txt

[local]
localhost ansible_connection=local

What this does:

  • Groups servers (nomadclientandservers, nomadclients)
  • Assigns specific user and SSH key per group
  • Allows custom per-host vars like internal_ip
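
A quick way to sanity-check the inventory is a throwaway playbook that pings each host and prints its per-host vars. A minimal sketch (the file name ping-check.yml is ours, not something from the repo):

# ping-check.yml, verifies connectivity and the vars defined in hosts.ini
- hosts: nomadclientandservers:nomadclients
  gather_facts: false
  tasks:
    - name: Confirm SSH connectivity
      ansible.builtin.ping:

    - name: Show the internal IP from the inventory
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} -> {{ internal_ip }}"

Because ansible.cfg points inventory at hosts.ini, a plain ansible-playbook ping-check.yml picks up the right hosts with no extra flags.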

Installing Dependencies - requirements.yml

We grabbed a few roles from Galaxy to make life easier:

roles:
  - src: geerlingguy.nginx
  - src: geerlingguy.certbot
  - src: googlecloudplatform.google_cloud_ops_agents

collections:
  - name: community.general

Install them via:

ansible-galaxy install -r requirements.yml
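
Once installed, a Galaxy role is used in a playbook exactly like a local one. A minimal sketch (our real NGINX setup lives in nginx-build-playbook.yml and looks different, since it builds NGINX with the Consul module):

- hosts: nomadclientandservers
  become: true
  roles:
    - role: geerlingguy.nginx
    - role: geerlingguy.certbot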

Creating Our Own Roles

We started small. Here's how we created a role to install and configure Nomad:

ansible-galaxy init roles/nomad-client-and-server --offline

That gave us this skeleton:

roles/
└─ nomad-client-and-server/
   ├─ tasks/
   │  ├─ install_nomad.yml
   │  ├─ configure.yml
   │  └─ main.yml
   ├─ templates/
   │  ├─ nomad_client.hcl.j2
   │  ├─ nomad_server.hcl.j2
   │  └─ nomad_nomad.hcl.j2
   ├─ handlers/
   ├─ defaults/
   ├─ vars/
   ├─ meta/
   ├─ tests/
   └─ README.md

We wrote all the steps we used to do manually — install Nomad binary, set up the config files, enable the service — inside tasks/. Config files go into templates/ with Jinja2 templating.

In our case, the tasks/main.yml acts as an entry point that pulls in two key task files:

# tasks file for roles/nomad-client-and-server
- import_tasks: install_nomad.yml
- import_tasks: configure.yml

This keeps the logic clean and separated — install_nomad.yml handles the installation, and configure.yml takes care of templating configs and restarting Nomad.
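
The role is applied from a top-level playbook; roughly, nomad-client-and-server-playbook.yml boils down to something like this (a simplified sketch, not the exact file):

- hosts: nomadclientandservers
  become: true
  roles:
    - nomad-client-and-server

A single ansible-playbook nomad-client-and-server-playbook.yml run then takes a fresh VM to a configured Nomad node.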

install_nomad.yml

This task file does everything needed to install Nomad from scratch:

  • Installs prerequisites like aptitude, curl, and Python packages.

  • Adds HashiCorp's official GPG key and apt repo securely:

    - name: Add HashiCorp's official GPG key
      ansible.builtin.get_url:
        url: https://apt.releases.hashicorp.com/gpg
        dest: /etc/apt/keyrings/hashicorp-archive-keyring.asc
    
  • Adds the Nomad repo using apt_repository.

  • Updates apt cache and installs the nomad binary.
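
Put together, the repo and install steps look roughly like this (a sketch: the task names and the signed-by repo line are ours, the keyring path matches the get_url above):

- name: Add the HashiCorp apt repository
  ansible.builtin.apt_repository:
    repo: "deb [signed-by=/etc/apt/keyrings/hashicorp-archive-keyring.asc] https://apt.releases.hashicorp.com {{ ansible_distribution_release }} main"
    state: present

- name: Install Nomad
  ansible.builtin.apt:
    name: nomad
    state: present
    update_cache: true
  register: install_nomad_result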

Every important action is registered and debugged using Ansible’s debug module, so we can inspect what’s happening per task:

- name: Debug nomad install result
  debug:
    var: install_nomad_result

This is extremely helpful during first-time runs and troubleshooting.

configure.yml

This one’s all about configuration. We use Ansible’s template module to render .j2 (Jinja2) files with variables (if any) into final .hcl configs.

It writes:

  • nomad.hcl – the core config from nomad_nomad.hcl.j2
  • server.hcl – the server-specific config
  • client.hcl – the client-specific config

Each of these templates lives in the templates/ folder and uses Jinja2 syntax, even though in our case, they’re mostly static for now.
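
Rendering boils down to a few template tasks along these lines (the /etc/nomad.d destination is our assumption based on Nomad's default layout; the template names match our templates/ folder):

- name: Render Nomad config files
  ansible.builtin.template:
    src: "{{ item.src }}"
    dest: "/etc/nomad.d/{{ item.dest }}"
  loop:
    - { src: nomad_nomad.hcl.j2, dest: nomad.hcl }
    - { src: nomad_server.hcl.j2, dest: server.hcl }
    - { src: nomad_client.hcl.j2, dest: client.hcl }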

After the configs are written, we restart the Nomad service with:

- name: Restart Nomad
  ansible.builtin.service:
    name: nomad
    state: restarted

This ensures that any config changes take immediate effect.

All steps are wrapped with register, failed_when, and ignore_errors: false to fail loud and early if something breaks.
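
The pattern looks something like this (an illustrative task, not lifted verbatim from the role):

- name: Check that Nomad is active after the restart
  ansible.builtin.command: systemctl is-active nomad
  register: nomad_status
  failed_when: nomad_status.stdout != "active"
  ignore_errors: false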

Jinja2 Templates in Action

We use .j2 template files to store our Nomad HCL config files inside the templates/ directory. This is where Jinja2 shines.

Example: nomad_nomad.hcl.j2

# Increase log verbosity
log_level = "INFO"

data_dir = "/home/nomad"
datacenter = "dc1"

vault {
  enabled = true
  address = "https://abc.com"
  ca_cert = "/etc/vault.d/tls/tls.crt"
  create_from_role = "nomad-cluster"
  token = "hvs.askld-..."
}

This config makes Nomad talk to Vault securely and defines basic operational defaults.

Example: nomad_server.hcl.j2

server {
  enabled = true
  bootstrap_expect = 1
}

data_dir = "/home/nomad/data"

plugin "docker" {
  config {
    allow_privileged = true
  }
}

This template enables Nomad's server mode and configures the Docker plugin to allow privileged containers — important for running certain jobs.

Even though these files are mostly static right now, having them in Jinja2 format makes it trivial to later inject variables per environment or host using {{ variable_name }} syntax.
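
For instance, the datacenter or Vault address could be pulled from role variables instead of being hard-coded. A hypothetical tweak (the variable names here are ours, not from the repo):

# templates/nomad_nomad.hcl.j2 (hypothetical parameterized version)
datacenter = "{{ nomad_datacenter | default('dc1') }}"
data_dir   = "{{ nomad_data_dir | default('/home/nomad') }}"

vault {
  enabled = true
  address = "{{ nomad_vault_address }}"
}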

What We Gained

By using Ansible:

  • 💣 No more “wait, which commands did I run on that server again?”
  • ✅ Every server can be set up the same way every time.
  • 🧠 Reuse knowledge and roles across environments.
  • 💾 Code our infrastructure just like we code our features.

What started as a couple of playbooks to avoid repetitive setup quickly grew into a full-blown automation stack. We kept building more roles, fine-tuning tasks, and wiring things together — now we can bootstrap nearly any piece of our infra with a single command.

Here’s what our Ansible repo looks like:

hex-ansible/
├── ansible.cfg                       # Global Ansible settings
├── hosts.ini                         # Inventory of our servers
├── requirements.yml                  # External roles and collections
├── README.md
├── install_ansible.sh                # Bootstrap script to install Ansible
├── route.py                          # Handy utility script
│
├── *.yml                             # Playbooks for individual components:
│   ├── ghost.yml
│   ├── consul.yml
│   ├── nomad-client-playbook.yml
│   ├── nomad-client-and-server-playbook.yml
│   ├── nginx-build-playbook.yml
│   ├── nginx-conf-sync-playbook.yml
│   ├── server-tools-playbook.yml
│   ├── cron-master.yml
│   └── dba.yml
└── roles/                            # Modular roles we’ve built:
    ├── consul
    ├── cron-master
    ├── db
    ├── gcloud
    ├── gcloud-nomad-security-group
    ├── ghost
    ├── nginx-conf-sync
    ├── nginx-with-consul-module
    ├── nomad-client
    ├── nomad-client-and-server
    ├── server-tools
    └── ssh-config

What’s Next

We’re actively expanding our setup — with roles for Redis, Vault, GitLab backup, and more. The goal is to bring up the entire environment — from base OS setup to services — with a single orchestrated run.
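
The likely shape of that is a single top-level playbook that imports the per-component ones. A sketch of a hypothetical site.yml (the imported playbooks are the real ones from our repo; the file itself doesn't exist yet):

# site.yml, a hypothetical orchestrated run
- import_playbook: server-tools-playbook.yml
- import_playbook: consul.yml
- import_playbook: nomad-client-and-server-playbook.yml
- import_playbook: nomad-client-playbook.yml
- import_playbook: nginx-build-playbook.yml
- import_playbook: nginx-conf-sync-playbook.yml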

If you’re still copy-pasting commands from a Notion doc, give Ansible a spin.
You might just fall in love with infra again.

Let us know about the cool stuff you’re doing with Ansible in the comments!

Hi there! I'm Maneshwar. Right now, I’m building LiveAPI, a first-of-its-kind tool that helps you automatically index API endpoints across all your repositories. LiveAPI makes it easier to discover, understand, and interact with APIs in large infrastructures.

LiveAPI helps you get all your backend APIs documented in a few minutes.

With LiveAPI, you can generate interactive API docs that allow users to search and execute endpoints directly from the browser.

LiveAPI Demo

If you're tired of updating Swagger manually or syncing Postman collections, give it a shot.