Traefik and Let's Encrypt with Consul and ConsulCatalog

In this post we will explore deploying Traefik in a multi-node Docker Swarm environment, using Consul for service configuration and Let's Encrypt to provision HTTPS certificates.

The gotcha here is getting a multi-node Traefik service to share a single Let's Encrypt certificate store, so you don't keep hammering the Let's Encrypt servers (and running into their rate limits) each time a different node is hit.

The issue in a Docker Swarm is that storage is not shared across nodes. The standard local volume driver keeps its data on the node where the volume was created, so you either have to resort to one of the more complex volume drivers (which invariably require a backend like AWS, Azure or something equivalent locally), or fall back on something like NFS, which is hard to secure properly in this day and age.

Here, I am going to show you how to set up Traefik so that it uses the Consul service catalog for service discovery and configuration, and the Consul key-value store for the Let's Encrypt certificates, so that all the Swarm nodes have access to the same information.

Prerequisites

  • You need a working Docker Swarm, installed and configured to run services across multiple nodes

Consul

First, let's set up Consul. If you already have Consul running, congratulations, you can skip this step :)

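We will set up Consul directly on a Docker host, not as part of the Swarm (you could run it on the Swarm, but I don't for various reasons). Ideally you would set up several Consul instances on several hosts and have them talk to each other. Bear in mind that a server started entirely on its own will not elect a leader unless it is also given -bootstrap-expect=1 (or is joined to other servers).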

Run the command:

docker run \
    -d \
    --restart=always \
    --name consul \
    --net=host \
    -e 'CONSUL_LOCAL_CONFIG={"skip_leave_on_interrupt": true}' \
    -v /usr/data/consul:/consul/data \
    consul agent -server \
    -client=<IP OF HOST> \
    -bind=<IP OF HOST> \
    -ui

Let's break that down (I shall skip -d, --restart and --name, which are standard Docker fare):

  • --net=host - don't use a private Docker network, bind this container directly to the host's network
  • -e 'CONSUL_LOCAL_CONFIG={"skip_leave_on_interrupt": true}' - pass some configuration to Consul via an environment variable (here, don't leave the cluster when the container is interrupted)
  • -v /usr/data/consul:/consul/data - persist the data that Consul stores, in /usr/data/consul on the host
  • consul agent -server - run Consul in production mode as a server
  • -client=<IP OF HOST> - the IP address that the client interfaces (HTTP API, UI, DNS) bind to
  • -bind=<IP OF HOST> - the IP address used for communication between Consul servers and agents
  • -ui - run the Consul UI, so you can inspect the data

Once the container is running, you can browse to http://<IP_OF_CONSUL_HOST>:8500/ to access the UI.
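If you would rather check from a terminal than a browser, Consul's HTTP API has a status endpoint that reports the current leader. The following is a minimal sketch, assuming curl is installed; the function names and the CONSUL_ADDR variable are my own inventions for illustration, not part of Consul.

```shell
#!/bin/sh
# CONSUL_ADDR is a hypothetical variable: host:port of the Consul agent.
CONSUL_ADDR="${CONSUL_ADDR:-127.0.0.1:8500}"

# Build the URL of Consul's leader status endpoint for a given host:port.
consul_leader_url() {
    printf 'http://%s/v1/status/leader\n' "$1"
}

# Ask Consul who the leader is; exits non-zero if the agent is unreachable.
consul_check() {
    curl -fsS "$(consul_leader_url "$1")"
}
```

Calling consul_check "$CONSUL_ADDR" should print a quoted host:port once a leader has been elected. An empty pair of quotes most likely means there is no leader yet.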

There is no security on Consul as set up here; you should look into setting up its ACL security if you are going to use this in production or on an insecure network.

Traefik

There are three steps to setting up Traefik so that it uses the Consul service catalog for service discovery and the key-value store for storing the Let's Encrypt certificates.

Firstly, you need to pre-populate the key-value store with Traefik's configuration. Here we are going to populate it with an exact copy of the configuration we are going to use, but all we really care about is creating the right keys for use later (the actual Traefik configuration will still be supplied on the command line).

So, run the command:

docker run traefik:1.1.2 \
        storeconfig \
        --consul \
        --consul.prefix="traefik" \
        --consul.watch \
        --consul.endpoint="<IP_OF_CONSUL_HOST>:8500" \
        --consulcatalog=true \
        --consulcatalog.endpoint="<IP_OF_CONSUL_HOST>:8500" \
        --consulcatalog.constraints="tag==public" \
        --entryPoints='Name:https Address::443 TLS' \
        --entryPoints='Name:http Address::80' \
        --acme.entrypoint=https \
        --acme=true \
        --acme.ondemand=true \
        --acme.onhostrule=true \
        --acme.email="youremailaddress@example.com" \
        --acme.storage="traefik/acme/account" \
        --web \
        --web.address=":8089"

As ever, let's break that down:

  • traefik:1.1.2 - let's use a specific version of the official Traefik image
  • storeconfig - this tells Traefik not to actually run, but just to populate the chosen backing store with the configuration
  • --consul - enable use of the Consul key-value store
  • --consul.prefix="traefik" - ensure we have a unique prefix so things don't get muddled
  • --consul.watch - watch Consul for changes
  • --consul.endpoint="<IP_OF_CONSUL_HOST>:8500" - point Traefik at the Consul service, fill in the IP address here
  • --consulcatalog=true - enable use of the Consul catalog as a service configuration store
  • --consulcatalog.endpoint="<IP_OF_CONSUL_HOST>:8500" - point Traefik at the Consul service, fill in the IP address here
  • --consulcatalog.constraints="tag==public" - only use services which carry this tag (allows you to make public only what you want to be public, but still use Consul for lots of other things)
  • --entryPoints='Name:https Address::443 TLS' - tell Traefik to use HTTPS as an entry point on port 443
  • --entryPoints='Name:http Address::80' - tell Traefik to use standard HTTP as an entry point on port 80 (not strictly needed for our example, but included for completeness)
  • --acme.entrypoint=https - tell Traefik to handle Let's Encrypt certificate requests on the HTTPS entry point
  • --acme=true - turn Let's Encrypt support on
  • --acme.ondemand=true - tell Traefik to request a certificate the first time a TLS request arrives for a hostname it has no certificate for
  • --acme.onhostrule=true - tell Traefik to request certificates for the hostnames given in each frontend's Host rule
  • --acme.email="youremailaddress@example.com" - your email address to submit to Let's Encrypt
  • --acme.storage="traefik/acme/account" - where to store the Let's Encrypt certificates within the Consul key-value store
  • --web - starts Traefik's web UI so you can see what's going on
  • --web.address=":8089" - tells Traefik what port to put its UI on

Once this command has run, the container will exit and you can delete it - we don't need it any more.
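Before moving on, it can be worth confirming that storeconfig actually wrote something. Consul's KV endpoint lists the keys under a prefix when given the keys query parameter. A small sketch, assuming curl is available; the helper names are hypothetical:

```shell
#!/bin/sh
# Build the URL that lists every key under a prefix in Consul's KV store.
# $1 = host:port of Consul, $2 = key prefix without a leading slash
consul_keys_url() {
    printf 'http://%s/v1/kv/%s/?keys\n' "$1" "$2"
}

# Fetch the key list as a JSON array; exits non-zero on HTTP errors,
# including the 404 Consul returns when nothing exists under the prefix.
consul_list_keys() {
    curl -fsS "$(consul_keys_url "$1" "$2")"
}
```

For example, consul_list_keys "<IP_OF_CONSUL_HOST>:8500" traefik should return a JSON array of keys beginning with traefik/ if the previous step worked.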

Important step

Don't skip this step; stuff won't work otherwise.

Browse to the web UI of your Consul service, go into the key-value store area and delete the following key:

/traefik/acme/accounts/storageFile

If this key exists when we try to run Traefik, Traefik will exit with the error "Error creating TLS config Empty Store, please provide a key for certs storage" (see https://github.com/containous/traefik/issues/927 for more information).
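The same deletion can be done without the UI, since Consul's KV endpoint accepts an HTTP DELETE on a key. This is a sketch only, assuming curl is available; the function names are made up for illustration, and the key is passed in as an argument rather than hard-coded so you can use whatever key you see in your own store.

```shell
#!/bin/sh
# Build the KV URL for a key, tolerating a leading slash on the key.
# $1 = host:port of Consul, $2 = key
consul_kv_url() {
    key="${2#/}"    # strip one leading slash, if present
    printf 'http://%s/v1/kv/%s\n' "$1" "$key"
}

# Delete a single key from the KV store.
consul_delete_key() {
    curl -fsS -X DELETE "$(consul_kv_url "$1" "$2")"
}
```

With the key named above, that would be consul_delete_key "<IP_OF_CONSUL_HOST>:8500" /traefik/acme/accounts/storageFile.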

Run Traefik

Ok, so now we are ready to run Traefik on the Swarm:

docker service create --name ingress-lb \
        -p 80:80 -p 443:443 -p 8089:8089 \
        --mode global \
        traefik:1.1.2 \
        --consul \
        --consul.prefix="traefik" \
        --consul.watch \
        --consul.endpoint="<IP_OF_CONSUL_SVC>:8500" \
        --consulcatalog=true \
        --consulcatalog.endpoint="<IP_OF_CONSUL_SVC>:8500" \
        --consulcatalog.constraints="tag==public" \
        --logLevel=DEBUG \
        --entryPoints='Name:https Address::443 TLS' \
        --entryPoints='Name:http Address::80' \
        --acme.entrypoint=https \
        --acme=true \
        --acme.ondemand=true \
        --acme.onhostrule=true \
        --acme.email="youremailaddress@example.com" \
        --acme.storage="traefik/acme/account" \
        --web \
        --web.address=":8089"

It's pretty much the same as the last command, except this time we are creating the service. I include all the configuration again for documentation purposes.

The differences are:

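  • docker service create - tell Docker to create the service on the Swarm
  • --name ingress-lb - give the service a sensible name
  • -p 80:80 -p 443:443 -p 8089:8089 - publish the HTTP, HTTPS and web UI ports
  • --mode global - tell it to run on all nodes in the Swarm
  • --logLevel=DEBUG - make Traefik log verbosely, which helps while you are getting things working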

And that's it - Traefik should now be running on all nodes, and it will be checking the Consul key-value store for its ACME certificates.
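To see whether Traefik is answering, you can poke the health endpoint of its web UI on port 8089. The sketch below assumes curl is installed and that your Traefik version serves /health on the web port; the function names are hypothetical. Because published ports go through Swarm's routing mesh, a reply from a node's address only proves that some Traefik task answered, not necessarily the one on that node, so docker service ps ingress-lb remains the better check for per-node state.

```shell
#!/bin/sh
# Print the Traefik health URL for each node address given as an argument.
traefik_health_urls() {
    for node in "$@"; do
        printf 'http://%s:8089/health\n' "$node"
    done
}

# Request each URL in turn and report whether it answered successfully.
check_traefik_nodes() {
    for url in $(traefik_health_urls "$@"); do
        if curl -fsS -o /dev/null "$url"; then
            echo "ok   $url"
        else
            echo "FAIL $url"
        fi
    done
}
```

Usage would be something like check_traefik_nodes 10.0.0.1 10.0.0.2, with your own node addresses substituted.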

There are a few ways to register your backend services with the Consul service catalog, but a quick and dirty way is to create a JSON file looking similar to:

{
  "Name": "yourservicename",
  "Address": "<IP_ADDRESS_OF_SERVICE>",
  "Port": <PORT_OF_SERVICE>,
  "Tags": [
    "traefik.tags=public",
    "traefik.frontend.rule=Host:<PUBLIC_DOMAIN_HOST>",
    "traefik.frontend.entryPoints=http,https"
  ]
}

Save it out as yourservicename.json and then submit it to Consul using curl:

curl --upload-file yourservicename.json http://<IP_OF_CONSUL_SVC>:8500/v1/agent/service/register

Ideally, a service should register itself, but this is handy for quick and dirty standalone services.
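If you find yourself registering more than one or two services this way, the JSON file and the curl call can be wrapped in a couple of shell functions. This is a rough sketch, assuming curl is available; the function names are my own, the tags are the ones used in the example above, and no JSON escaping is done, so keep the arguments to plain names, addresses and numbers.

```shell
#!/bin/sh
# Emit a Consul service definition carrying the Traefik tags used above.
# $1 = service name, $2 = service IP, $3 = port, $4 = public hostname
# Note: arguments are inserted verbatim, with no JSON escaping.
service_definition() {
    printf '{"Name":"%s","Address":"%s","Port":%s,' "$1" "$2" "$3"
    printf '"Tags":["traefik.tags=public",'
    printf '"traefik.frontend.rule=Host:%s",' "$4"
    printf '"traefik.frontend.entryPoints=http,https"]}\n'
}

# PUT the definition to the agent's register endpoint.
# $1 = host:port of Consul, remaining arguments as for service_definition
register_service() {
    consul="$1"
    shift
    service_definition "$@" |
        curl -fsS -X PUT --data-binary @- \
            "http://$consul/v1/agent/service/register"
}

# Look the service up in the catalog to confirm the registration took.
# $1 = host:port of Consul, $2 = service name
show_service() {
    curl -fsS "http://$1/v1/catalog/service/$2"
}
```

For example, register_service "<IP_OF_CONSUL_SVC>:8500" yourservicename 10.0.0.7 8080 www.example.com followed by show_service "<IP_OF_CONSUL_SVC>:8500" yourservicename, where the address, port and hostname are placeholders for your own values.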

There you have it, a quick guide to setting up Traefik and Let's Encrypt using Consul and ConsulCatalog.