Ansible & Docker - The Path to Continuous Delivery I

If I had a Rails application requiring MySQL and Redis that I wanted to host myself, this would be the quickest and simplest approach. There are just 2 dependencies: Ansible & Docker. To make the introductions:

Meet Ansible, a system orchestration tool. It has no dependencies other than Python and SSH. It doesn’t require any agents to be set up on the remote hosts and it doesn’t leave any traces after it runs either. What’s more, it comes with an extensive, built-in library of modules for controlling everything from package managers to cloud providers, to databases and everything else in between. If you’ve spent more time writing cookbooks than using them, Ansible will be your cure.

Meet Docker, a utility for creating lightweight Linux containers for shipping self-contained applications. As opposed to a traditional VM, which runs a full-blown operating system on top of the host, Docker leverages LinuX Containers (LXC), which run on the host’s kernel with no hypervisor overhead. This results in more efficient use of system resources, traded against some of the isolation that hypervisors provide.

 PRE-REQUISITES

The quickest way to set up Ansible on OS X:

sudo easy_install pip
sudo pip install setuptools ansible

Ansible will set up Docker on the remote system, so there’s only 1 thing left to do on my local system: set up the DigitalOcean integration. Ansible can control EC2, Linode & Rackspace out of the box (I am running v1.4.4). I could have chosen any of the above, but I find that DigitalOcean strikes the right balance.

I have logged into my DigitalOcean account, generated a new API key and set it up in my shell environment. Follow this short tutorial to integrate Ansible with DigitalOcean’s API.
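
For the v1 API that Ansible’s digital_ocean module used at the time, this amounts to exporting a client ID and an API key. The exact variable names below are my assumption, so check the digital_ocean module documentation for your Ansible version:

# DigitalOcean credentials picked up by Ansible's digital_ocean module (v1 API era);
# the variable names may differ between Ansible versions
export DO_CLIENT_ID='your-client-id'
export DO_API_KEY='your-api-key'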

If you want to follow along, the Ansible configuration required to run all following examples is available at gerhard/ansible-docker. Let’s get started!

 CREATE INSTANCE

I have created tens of droplets, at different times of the day, over a period of a few days. 1 minute 25 seconds is the best time that I have clocked. This is the command that will have Ansible provision a new 2GB DigitalOcean droplet in Amsterdam 2:

bin/ap create_droplet.yml hosts -c local

bin/ap is a bash script from gerhard/ansible-docker; it reads:

#!/usr/bin/env bash

# Usage: bin/ap [playbook] [inventory] [extra ansible-playbook arguments]
# Defaults to base.yml and the hosts inventory when not given.

role="${1:-base.yml}"
shift
inventory="${1:-hosts}"
shift
# Everything left over is passed straight through to ansible-playbook
args=$@

ansible-playbook "$role" --inventory-file "$inventory" --module-path ./modules/sshknownhosts $args

 SETUP INSTANCE

Now that I have a fresh instance, I will set it up with some useful system packages, including dockerd & nginx, and all the system defaults which I think are sensible.

bin/ap dod.yml dohosts

For those wondering why I’m setting up nginx outside a Docker container: I want to manage it directly with Ansible, without the Docker layer in between. I might spend more time optimising this in the future, but for now it’s the straightforward approach.

This completes in 1 minute 50 seconds flat. If I really wanted to optimise the time taken to get to this state, I would create an image from this droplet and use it when creating new ones.

 FIRST RAILS APPLICATION DEPLOY

Now that I have my Docker host all set up, I will deploy a Rails application as a Docker container. This is a first time deploy.

bin/ap terrabox.yml dohosts

I cannot share the application code itself, but any Rails application requiring just MySQL or Redis will work with minor changes.

First of all, Ansible will clone the application’s git repository on the remote host. My application declares 2 service dependencies, MySQL & Redis, by having 2 empty files in the repository: .docker.mysql & .docker.redis. It’s a naive and limiting approach, but it’s simple and works well in practice.
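
A minimal sketch of how those marker files could be detected on the remote host; the real detection lives in the Ansible roles, this is plain shell for illustration only:

# From the directory where Ansible cloned the application repository
cd /path/to/terrabox

# Each empty .docker.<service> file declares a dependency on that service
for marker in .docker.*; do
  [ -e "$marker" ] || continue   # no marker files at all
  service="${marker#.docker.}"
  echo "application depends on: ${service}"
done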

Once the MySQL dependency has been identified, Ansible will set up a MySQL data-only container. The MySQL data-only container doesn’t need to be running; it just needs to define a volume that will store /var/lib/mysql for the MySQL process container. When a new MySQL process container replaces the old one, all previous MySQL data remains accessible to it, because the data lives on a volume exposed via the data-only container. More on data-only Docker containers.
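
In Docker CLI terms, the data-only pattern looks roughly like this; the container and image names are placeholders of mine, not the ones used by the Ansible role:

# Data-only container: defines the volume, never needs to keep running
docker run --name terrabox_master_mysql_data -v /var/lib/mysql busybox true

# MySQL process container: mounts the volume from the data-only container
docker run -d --name terrabox_master_mysql --volumes-from terrabox_master_mysql_data example/mysql

# Replacing the process container keeps the data, because the volume outlives it
docker rm -f terrabox_master_mysql
docker run -d --name terrabox_master_mysql --volumes-from terrabox_master_mysql_data example/mysql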

The MySQL process is not important, the data is. The same goes for any other system service that an application depends on (e.g. RabbitMQ, Elasticsearch). Using Docker, my application can declare dependencies on other services and it can trust Docker to make them available in a reliable and timely fashion. I do not need the production data when I run my tests. I should not use production data in staging either. Data security, out-of-sync issues and sheer size are a few good reasons. In all environments but production the focus should be on the correctness of the functionality, not the data.

To start a MySQL process container, first the MySQL Docker image needs to be downloaded. Because the MySQL image that I’m using depends on the ubuntu:12.04 image, this needs to be downloaded as well. Once Docker images are downloaded, they will be cached locally, so starting other containers that depend on the same images will be almost instant.
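
The caching behaviour is easy to see from the CLI; the image name below is a placeholder:

# The first pull downloads ubuntu:12.04 plus the MySQL layers built on top of it (slow)
docker pull example/mysql

# Anything else based on those layers reuses the local cache, so starting
# a second container from the same image is almost instant
docker run -d example/mysql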

Once the MySQL process container is up and running, the same needs to happen for Redis, the second dependency for this Rails app.

Now that both container dependencies have been met, the correct Ruby Docker image needs to be downloaded before the application can build its own Docker image. From the master branch, I am building a Docker app:master image. This will be re-used when building images for other branches (e.g. FROM app:master). Let’s take a look at my application’s Dockerfile:

FROM howareyou/ruby:2.0.0-p353

RUN  rm -fr /terrabox

ADD ./ /terrabox

RUN \
  . /.profile ;\
  rm -fr /terrabox/.git ;\
  cd /terrabox ;\
  bundle install --local ;\
  echo '. /.profile && cd /terrabox && RAILS_ENV=test bundle exec rake db:create db:migrate && bundle exec rspec' > /test ;\
# END RUN

CMD . /.profile && cd /terrabox && export RAILS_ENV=production && rake db:create db:migrate && foreman start

EXPOSE 3000

If there is a /terrabox directory in the container, I want all of its files to be deleted. The ADD command only adds new files and overwrites existing ones; it does not delete files from the image which no longer exist in the source. This will make more sense when I build an application Docker image from a different git branch.

I am using a multi-line RUN command as I do not want any of those commands to run individually. You will also notice that I have created a /test shell script which will be used for the test-only container. After the gems get installed and the Docker image gets built, all the tests need to run. This is important. I want a deploy to fail if any of the tests fail. Because I’m running my application tests in a container based on the same image as the final production container, I am confident that if my tests pass, that version of my application will work correctly in production. If tests pass, I start an application container straight after and add it to the load balancer. If tests fail, the test container is left behind for troubleshooting purposes.
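
The test gate can be expressed roughly like this; the container and image names are placeholders of mine:

# Run the tests in a throwaway container built from the same image as production
if docker run --name terrabox_master_test terrabox:master bash /test; then
  # Tests passed: the test container is no longer needed;
  # start an application container and add it to the load balancer
  docker rm terrabox_master_test
  docker run -d -p 3000 terrabox:master
else
  # Tests failed: leave the test container behind for troubleshooting
  echo "tests failed; inspect with: docker logs terrabox_master_test" >&2
  exit 1
fi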

The quicker I can get my code in front of real users, the quicker I can learn about what makes them happy and what makes them sad.

Setting up a Rails application as a Docker container with dependencies on MySQL and Redis, without any local Docker images, takes a whole 8 minutes 16 seconds. Regardless of how fast Docker is to spin up containers once the images exist, there will always be the application packaging step which needs to run. Docker makes the whole process more efficient, as I’ve described above, but it’s no pixie dust. Once the application Docker image is built and I am confident that the application behaves correctly (i.e. all tests pass), distributing the resulting image to all other hosts and spinning up containers should be blisteringly quick. I still have some work to do before I can talk about multi-host Docker setups confidently, with hard numbers to back me up, but that’s what future posts are for.

Before I move on, there are 2 things which I want to point out:

  1. Dependent containers such as MySQL and Redis are branch-specific. This means that there will only be 1 MySQL container for an application’s master branch. Any application containers built from the master branch will always use this MySQL container. Same goes for Redis and all future dependent containers.

  2. When an application container gets started from the master image, that container is production. The most recent master image would be the current production. I will come back to this in a second, but first, there are gems which need updating…

 RAILS APPLICATION BRANCH DEPLOY

There are gem updates which I want to try out before updating production. As MySQL, Redis & Ruby images are already there, these steps take seconds vs minutes. There’s also the application’s master image on top of which I am basing my branch image.

bin/ap terrabox.yml dohosts -e 'app_branch=gem-updates'

To build an application branch image, using the same process, it takes 2 minutes 23 seconds. That is a 3.46x improvement over the initial 8 minutes 16 seconds.

Let’s talk about environments a little bit. Everyone has production, right? Same for development… How about integration? Staging? QA? Beta? Mooo? Seriously, what the fuck? What really matters is if my users can use it, and if not, how quickly can I get it in front of them. The quicker I can get code out there, show it, use it, measure it, the quicker I can validate my assumptions about “good ideas”. While almost everything starts as a good idea, very few stand the test of time.

Git branches are brilliant. Github pull requests add the element of discussion on top of git branches and make them even more powerful.

I want to stop talking about which branch gets deployed to which environment. I care about seeing the latest version somewhere publicly accessible. Why would I care about how many integration environments there are? Why would I care about staging? Why would I care that an integration environment is locked? What does it even mean for it to be locked?

I find that using the branch name as the sub-domain is very practical. I like that because all the code from that branch is publicly available. Because the dependent services (MySQL and Redis) are branch-specific, I don’t have to worry about breaking production or messing up production data. We, the ones working on an application, already think in terms of branches, so why not apply the same model, which comes so naturally, when it comes to deploys? This approach might not work for everyone, but it’s my favourite. As always, the YMMV rule applies.
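
One way to wire this up is a per-branch nginx vhost on the host, proxying to that branch’s application container; the domain and port below are made up:

# Hypothetical vhost for the gem-updates branch
cat > /etc/nginx/sites-enabled/terrabox_gem-updates <<'EOF'
server {
  server_name gem-updates.example.com;
  location / {
    proxy_pass http://127.0.0.1:49154;  # host port Docker mapped to the container's port 3000
  }
}
EOF
nginx -s reload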

 NEW MASTER DEPLOY

OK. Now that I’m happy with my branch, I’ve merged it into master and I’ve restarted the master image build.

bin/ap terrabox.yml dohosts

This now takes 4 minutes 45 seconds vs the initial 8 minutes 16 seconds. In real terms, this is a 1.74x improvement over the no-Docker-images situation. The reason it takes this long is that I always start from a clean slate when building master. I want a single gem version and I don’t want gems left behind that my application isn’t using anymore. As you can tell, I like determinism.

During some discussions on this subject, it came up that a more sensible approach would be to re-build the master image from scratch at regular intervals rather than on every deploy. This would make deploys 2x as fast (2 minutes 23 seconds vs 4 minutes 45 seconds). I am switching to this model now and will follow up on the idea.
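
As a sketch, the periodic from-scratch rebuild could be as simple as a cron entry re-running the same play; the schedule, user and path are made up, and how the play is told to skip the cached image is not shown:

# /etc/cron.d/terrabox-master-rebuild (hypothetical)
# Rebuild the master image every night at 03:00, so daytime deploys stay fast
0 3 * * * deploy cd /home/deploy/ansible-docker && bin/ap terrabox.yml dohosts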

 WHAT’S NEXT

Before I can orchestrate a multi-host Docker setup, I need to have a private registry for my images. Whilst quay looks like a good solution, I think I would prefer at this stage to try running my own private registry, via the existing docker-registry image.
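
Running the docker-registry image and pushing to it looks roughly like this; terrabox:master is my placeholder for the application image, and port 5000 is the registry default:

# Start a private registry container on the Docker host
docker run -d -p 5000:5000 --name registry registry

# Tag the application image against the private registry and push it
docker tag terrabox:master localhost:5000/terrabox:master
docker push localhost:5000/terrabox:master

# Other Docker hosts can then pull the image from the registry
docker pull registry-host:5000/terrabox:master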

I need to stop and clean up containers and images which are no longer required. The space adds up really quickly when I ramp up branches & deploys. I am using an ad-hoc Ansible command for this, but should really package it in a role task.
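
With a reasonably recent Docker CLI, the clean-up boils down to two commands; the filters used are standard docker flags:

# Remove all stopped containers
docker ps -aq --filter status=exited | xargs -r docker rm

# Remove dangling images left behind by superseded builds
docker images -q --filter dangling=true | xargs -r docker rmi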

Backing up & restoring data-only containers. A real production environment is unthinkable without proper backups.
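
The usual way to back up a volume held by a data-only container is to mount it into a throwaway container alongside a host directory; the names and paths here are illustrative:

# Back up the MySQL volume into a dated tarball under /backups on the host
docker run --rm --volumes-from terrabox_master_mysql_data -v /backups:/backups busybox \
  tar czf "/backups/mysql-$(date +%F).tar.gz" /var/lib/mysql

# Restoring is the same idea in reverse: untar the chosen backup over the volume
docker run --rm --volumes-from terrabox_master_mysql_data -v /backups:/backups busybox \
  tar xzf /backups/mysql-2014-01-29.tar.gz -C /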

When a new application Docker container starts, the previous one, which is still running, should be left as a backup upstream entry in the load balancer. Until the new container can start serving requests, it’s critical for a 0-downtime deployment to have the previous one still serving traffic.
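
In nginx terms this maps onto a backup server in the upstream block; written from a shell heredoc to match the rest of the examples, with made-up host ports:

cat > /etc/nginx/conf.d/terrabox_upstream.conf <<'EOF'
upstream terrabox {
  server 127.0.0.1:49155;         # new application container
  server 127.0.0.1:49154 backup;  # previous container, only used while the new one is unavailable
}
EOF
nginx -s reload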

Deploys should be triggered by GitHub hooks. I’m running Ansible plays manually now, and that’s so not Continuous Delivery-like.

Containers should be supervised by either systemd or supervisord. As it turns out, I haven’t been using the exec command when starting processes in containers; no wonder I had issues when trying to integrate the two.
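
The fix is to exec the final process so it replaces the wrapper shell and becomes PID 1 inside the container; against the Dockerfile shown earlier, the CMD would effectively run this:

# Note the exec on the last step: foreman replaces the shell as PID 1, so a
# supervisor (or docker stop's SIGTERM) talks to the real process, not a wrapper shell
. /.profile && cd /terrabox && export RAILS_ENV=production && \
  rake db:create db:migrate && exec foreman start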

 THE ANSIBLE SETUP

If you have suggestions on how to improve the Ansible roles, I am very keen to hear them. This will become a lot more involved when I start leveraging Ansible’s true power: orchestration. The next step is a multi-host Docker setup; no production environment is complete if it runs on a single physical host.

 IN CONCLUSION

Just as git snapshots code and transports it in the most efficient manner possible, Docker does the same with system processes. It guarantees that everything required to run a process will be available regardless of the host that it runs on. The process now has a social side: it comes with friends which guarantee that the party is always on : ). It also means that you can have variations of those processes, running side-by-side, on the same physical host, with no hypervisor penalty.

Ansible completes Docker in ways which make the two a killer combination. Without leaving the local shell environment, I can orchestrate what needs to happen remotely, from starting new instances to setting up Docker, to cloning a git repository, building Docker images, starting Docker containers in a controlled fashion, all with minimal effort. Both make me excited beyond words. I’m a child again, discovering the shell, writing my first for loop… thank you for making infrastructure fun again : ).

This blog post formed the basis of my talk given at the London Docker meetup on 29th of January 2014.

 