Use and automate Let’s Encrypt certificates (ACME) in a high-availability environment
The Internet Security Research Group (ISRG), backed by sponsors such as Mozilla, launched a “free, automated and open” certificate authority called Let’s Encrypt. As the name suggests, it provides free certificates trusted by all major browsers and operating systems. I’m using it heavily (on this blog, for example).
This blog post shows how Syncthing can be used to deploy Let’s Encrypt certificates in an environment with multiple servers (e.g. in a round-robin scenario) without adding a single point of failure.
ACME
Let’s Encrypt automated the process of requesting and authenticating a certificate using a protocol called ACME. The client requesting a new certificate places a challenge file under the .well-known path on its webserver, and Let’s Encrypt retrieves this file over HTTP to verify that the client controls the domain.
The actual process is a little more complicated, though. If you want to know how it works in detail, I recommend Let’s Encrypt’s excellent ACME documentation.
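To make the challenge step a bit more tangible, here is a minimal Ruby sketch of the file a client has to publish. It assumes the usual HTTP challenge layout, where the file is named after the challenge token and contains the key authorization (the token, a dot, and the thumbprint of the account key). The helper name `publish_challenge` and the sample values are made up for illustration; a real ACME client does this for you.

```ruby
require 'fileutils'

# Hypothetical helper: write an HTTP challenge response below a webroot.
# File name = challenge token, body = "<token>.<account key thumbprint>".
def publish_challenge(webroot, token, thumbprint)
  dir = File.join(webroot, '.well-known', 'acme-challenge')
  FileUtils.mkdir_p(dir)
  path = File.join(dir, token)
  File.write(path, "#{token}.#{thumbprint}")
  path
end

# Example call (placeholder values):
# publish_challenge('/etc/nginx/certs/acme', 'some-token', 'some-thumbprint')
```

The CA then requests `/.well-known/acme-challenge/<token>` on port 80 of the domain and compares the body with the value it expects.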
The problem in high availability setups
When using multiple servers for SSL termination (e.g. in the load-balancing scenario described in the picture below, where SSL termination is handled by the nginx instances), each one requires a certificate for the domain(s) it serves.
In a setup that uses round-robin balancing, for example, we can’t guarantee that the incoming request for the ACME challenge ends up on the server that actually requested the certificate. Furthermore, each server would need to request (and renew) its own certificates.
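The effect is easy to see in a toy model. The following Ruby sketch is not part of our setup; it simply assumes a balancer that hands request number i to server i modulo n, and models each server as the list of challenge tokens it can serve. All names are illustrative.

```ruby
# Toy model: each server is represented by the tokens it has on disk.
# A round-robin balancer hands request number i to server i % n.
def round_robin_hits(servers, token, requests)
  (0...requests).map do |i|
    servers[i % servers.size].include?(token)
  end
end

# Only the first server requested the certificate and holds the token.
isolated = [['tok'], [], []]
# With a shared challenge directory, every server sees the same token.
shared = [['tok'], ['tok'], ['tok']]

p round_robin_hits(isolated, 'tok', 6)    # => [true, false, false, true, false, false]
p round_robin_hits(shared, 'tok', 6).all? # => true
```

With three isolated servers, only every third validation request would succeed, which is why the challenge directory has to be visible on all of them.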
The cleanest solution I found for this problem is to share the .well-known challenge directory (and maybe even the certificates) between multiple servers.
Syncthing to the rescue!
The tool I found best to synchronize those directories was Syncthing. It is one of the most exciting tools for file sharing: it is completely decentralized and works without any central server (but can be configured to use one, if required), is fully peer-to-peer, open source, written in Go and cross-platform.
Syncthing fulfills all items on my wishlist:
- Traffic between the instances is encrypted
- The setup is automatically deployable
- Instances can be easily added or removed
- No single point of failure (all nodes connect to each other, synchronizing the same directory between all machines)
- No additional services required
I chose it to synchronize the /etc/nginx/certs directory. It shares the dhparams, SSL certificates and the ACME challenges between all nginx instances. Here’s what the shared directory looks like:
$ tree -a
.
├── .stfolder
├── acme
│   └── .well-known
│       └── acme-challenge
│           ├── 8xdoeH5OLPUij4xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
│           ├── cWaLNpzt_8v--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
│           └── _wsvWIOyvP-45vt-xxxxxxxxxxxxxxxxxxxxxxxxxxx
├── dhparam.pem
├── www.example.com.crt
└── www.example.com.key

3 directories, 7 files
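A quick way to check that the directory really is identical on every node is to compare file digests. The following Ruby sketch is my own illustration, not something Syncthing provides: the hypothetical helper `directory_manifest` maps every file below a root directory to its SHA-256 digest, so the output of two nodes can be compared directly.

```ruby
require 'digest'
require 'find'

# Map every regular file below root (as a relative path) to its SHA-256 digest.
def directory_manifest(root)
  manifest = {}
  Find.find(root) do |path|
    next unless File.file?(path)
    relative = path.delete_prefix(root).delete_prefix('/')
    manifest[relative] = Digest::SHA256.file(path).hexdigest
  end
  manifest
end

# Example: print a sorted manifest that can be diffed between nodes.
# directory_manifest('/etc/nginx/certs').sort.each { |file, sum| puts "#{sum}  #{file}" }
```

If the manifests of two nodes differ for longer than a few rescan intervals, something is wrong with the synchronization.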
Implementation
We’re using Chef to automate our infrastructure at flinc, but the process should be easily adaptable to an automation tool of your choice.
Syncthing is easily deployed, as there’s an official repository available:
Install Syncthing
# Use the official Syncthing apt repositories
apt_repository 'syncthing-release' do
  uri 'http://apt.syncthing.net'
  distribution 'syncthing'
  components %w(release)
  key 'https://syncthing.net/release-key.txt'
end

package 'syncthing'
Set up systemd service
A systemd.service unit is quickly crafted from the provided example:
[Unit]
Description=Syncthing - Open Source Continuous File Synchronization for %I
Documentation=man:syncthing(1)
After=network.target
[Service]
User=%i
ExecStart=/usr/bin/syncthing -no-browser -no-restart -logflags=0
Restart=on-failure
SuccessExitStatus=3 4
RestartForceExitStatus=3 4
[Install]
WantedBy=multi-user.target
Generate Syncthing configuration and keys
We want to centrally manage our instances, so Syncthing certificates are stored centrally in Chef’s encrypted data bags, alongside their device IDs and API keys. Here’s how to generate and extract everything that’s required:
First, generate a new key-pair and save the device ID and API key for each node:
NODE=1.nginx.example.com
syncthing --generate=$NODE |grep ID |awk '{ print $5 }' > $NODE/device_id
grep apikey $NODE/config.xml |cut -d\> -f2 |cut -d\< -f1 > $NODE/apikey
rm $NODE/config.xml
The resulting key.pem and cert.pem will then be deployed into the .config/syncthing directory on the target machine.
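For reference, the values collected for each node could be assembled into a data bag item roughly like this. This is a hedged Ruby sketch: the field names `device_id`, `apikey`, `key` and `cert` match what the recipes in this post read, but the helper names, the use of the node name as item `id` and the sample values are assumptions for illustration. Encrypting and uploading the item is left to your usual Chef tooling.

```ruby
require 'json'

# Pull the API key out of a generated config.xml (same idea as grep/cut).
def extract_apikey(config_xml)
  config_xml[%r{<apikey>([^<]+)</apikey>}, 1]
end

# Assemble the plain (not yet encrypted) data bag item for one node.
def data_bag_item(node_name, device_id:, apikey:, key_pem:, cert_pem:)
  {
    'id' => node_name,
    'device_id' => device_id,
    'apikey' => apikey,
    'key' => key_pem,
    'cert' => cert_pem
  }
end

# Example with placeholder values:
sample_config = '<gui enabled="true"><apikey>abc123</apikey></gui>'
item = data_bag_item('1.nginx',
                     device_id: 'DEVICE-ID-PLACEHOLDER',
                     apikey: extract_apikey(sample_config),
                     key_pem: 'KEY PEM PLACEHOLDER',
                     cert_pem: 'CERT PEM PLACEHOLDER')
puts JSON.pretty_generate(item)
```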
After using Syncthing’s web interface to configure the share, I used the resulting config.xml to craft the following ERB template:
<configuration version="16">
  <!-- This is our shared folder. Scan it every 5s, so updates are synchronized quickly -->
  <folder id="vm623-hxlsp" label="letsencrypt" path="/etc/nginx/certs/" type="readwrite" rescanIntervalS="5" ignorePerms="false" autoNormalize="true">
    <!-- Share the folder between all nodes -->
    <% @nodes.each do |name, config| %>
    <device id="<%= config['id'] %>"></device>
    <% end %>
    <!-- Share settings. Default settings, with simple versioning -->
    <minDiskFreePct>1</minDiskFreePct>
    <versioning type="simple">
      <param key="keep" val="5"></param>
    </versioning>
    <copiers>0</copiers>
    <pullers>0</pullers>
    <hashers>0</hashers>
    <order>random</order>
    <ignoreDelete>false</ignoreDelete>
    <scanProgressIntervalS>0</scanProgressIntervalS>
    <pullerSleepS>0</pullerSleepS>
    <pullerPauseS>0</pullerPauseS>
    <maxConflicts>10</maxConflicts>
    <disableSparseFiles>false</disableSparseFiles>
    <disableTempIndexes>false</disableTempIndexes>
  </folder>
  <!-- Make sure all nodes are connected to one another -->
  <% @nodes.each do |name, config| %>
  <device id="<%= config['id'] %>" name="<%= name %>" compression="metadata" introducer="false">
    <address><%= config['address'] %></address>
  </device>
  <% end %>
  <gui enabled="true" tls="false" debugging="false">
    <address>127.0.0.1:8384</address>
    <apikey><%= @apikey %></apikey>
    <theme>default</theme>
  </gui>
  <options>
    <listenAddress>default</listenAddress>
    <!-- Disable announcement, as we're automatically adding all servers above -->
    <globalAnnounceServer>default</globalAnnounceServer>
    <globalAnnounceEnabled>false</globalAnnounceEnabled>
    <localAnnounceEnabled>false</localAnnounceEnabled>
    <localAnnouncePort>21027</localAnnouncePort>
    <localAnnounceMCAddr>[ff12::8384]:21027</localAnnounceMCAddr>
    <maxSendKbps>0</maxSendKbps>
    <maxRecvKbps>0</maxRecvKbps>
    <reconnectionIntervalS>60</reconnectionIntervalS>
    <relaysEnabled>false</relaysEnabled>
    <relayReconnectIntervalM>10</relayReconnectIntervalM>
    <startBrowser>false</startBrowser>
    <natEnabled>false</natEnabled>
    <natLeaseMinutes>60</natLeaseMinutes>
    <natRenewalMinutes>30</natRenewalMinutes>
    <natTimeoutSeconds>10</natTimeoutSeconds>
    <urAccepted>1</urAccepted>
    <urUniqueID></urUniqueID>
    <urURL>https://data.syncthing.net/newdata</urURL>
    <urPostInsecurely>false</urPostInsecurely>
    <urInitialDelayS>1800</urInitialDelayS>
    <restartOnWakeup>true</restartOnWakeup>
    <autoUpgradeIntervalH>12</autoUpgradeIntervalH>
    <keepTemporariesH>24</keepTemporariesH>
    <cacheIgnoredFiles>false</cacheIgnoredFiles>
    <progressUpdateIntervalS>5</progressUpdateIntervalS>
    <symlinksEnabled>true</symlinksEnabled>
    <limitBandwidthInLan>false</limitBandwidthInLan>
    <minHomeDiskFreePct>1</minHomeDiskFreePct>
    <releasesURL>https://upgrades.syncthing.net/meta.json</releasesURL>
    <overwriteRemoteDeviceNamesOnConnect>false</overwriteRemoteDeviceNamesOnConnect>
    <tempIndexMinBlocks>10</tempIndexMinBlocks>
  </options>
</configuration>
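If you want to preview what such a template produces without running Chef, Ruby’s standard `erb` library can render it directly. The sketch below uses a heavily trimmed stand-in for the template above and a small helper class (`TemplateContext`, a made-up name) that exposes `@nodes` the way Chef’s `variables` would; the device IDs are placeholders.

```ruby
require 'erb'

# Trimmed stand-in for the real template: one folder, one device list.
TEMPLATE = <<~'XML'
  <folder id="letsencrypt" path="/etc/nginx/certs/">
  <% @nodes.each do |name, config| %>
    <device id="<%= config['id'] %>"></device>
  <% end %>
  </folder>
  <% @nodes.each do |name, config| %>
  <device id="<%= config['id'] %>" name="<%= name %>">
    <address><%= config['address'] %></address>
  </device>
  <% end %>
XML

# Holds @nodes so the template can reference it as an instance variable.
class TemplateContext
  def initialize(nodes)
    @nodes = nodes
  end

  def render(template)
    ERB.new(template).result(binding)
  end
end

def render_syncthing_config(nodes)
  TemplateContext.new(nodes).render(TEMPLATE)
end

nodes = {
  '1.nginx' => { 'id' => 'AAAA-PLACEHOLDER', 'address' => 'dynamic' },
  '2.nginx' => { 'id' => 'BBBB-PLACEHOLDER', 'address' => 'tcp://2.nginx.example.com:22000' }
}
puts render_syncthing_config(nodes)
```

Every node appears twice in the output: once inside the folder (the share) and once as a top-level device (the connection).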
Deploy Syncthing configuration
Here’s how we deploy Syncthing keys and configuration from encrypted data bags to the nginx nodes. (Note: it probably makes sense to run Syncthing as the same user as nginx, as Syncthing needs to deploy a key that should only be readable by nginx and no one else.)
# Set this to the user running Syncthing (probably the same user running nginx)
user = 'nginx'

# Populate node information from data bag
# (node_list and data_bag_secret are assumed to be defined elsewhere in the recipe)
node_config = {}
node_list.each do |node_name|
  config = Chef::EncryptedDataBagItem.load('syncthing', node_name, data_bag_secret)
  node_config[node_name] = {}
  node_config[node_name]['id'] = config['device_id']
  # Set address to "dynamic" if it's ourselves
  node_config[node_name]['address'] = if node.name == node_name
                                        'dynamic'
                                      else
                                        "tcp://#{node_name}.#{node['domain']}:22000"
                                      end
end

# Deploy Syncthing certificate (from data bag)
local_config = Chef::EncryptedDataBagItem.load('syncthing', node.name, data_bag_secret)
%w(key cert).each do |k|
  # Abort the Chef run if the key couldn't be retrieved
  raise "#{k}.pem is empty!" unless local_config[k]

  file "/home/#{user}/.config/syncthing/#{k}.pem" do
    mode 0o600
    owner user
    group user
    content local_config[k]
  end
end

# Deploy Syncthing configuration
template "/home/#{user}/.config/syncthing/config.xml" do
  mode 0o600
  owner user
  group user
  source 'syncthing.config.xml.erb'
  variables nodes: node_config, apikey: local_config['apikey']
end

# Restart Syncthing upon configuration/key changes
service "syncthing@#{user}" do
  subscribes :restart, "template[/home/#{user}/.config/syncthing/config.xml]"
  subscribes :restart, "file[/home/#{user}/.config/syncthing/key.pem]"
  subscribes :restart, "file[/home/#{user}/.config/syncthing/cert.pem]"
  action [:enable, :start]
end
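The address logic in the recipe is the part that makes every node connect to every other node, so it is worth looking at in isolation. Here is the same rule as plain Ruby, outside of Chef; the function names and the sample node names are made up for this sketch.

```ruby
# 'dynamic' for the local node, a fixed TCP address for every other peer.
def syncthing_address(local_name, node_name, domain)
  return 'dynamic' if local_name == node_name

  "tcp://#{node_name}.#{domain}:22000"
end

# Build the name => address map as seen from one particular node.
def peer_addresses(local_name, node_names, domain)
  node_names.each_with_object({}) do |name, addresses|
    addresses[name] = syncthing_address(local_name, name, domain)
  end
end

p peer_addresses('1.nginx', %w(1.nginx 2.nginx), 'example.com')
# => {"1.nginx"=>"dynamic", "2.nginx"=>"tcp://2.nginx.example.com:22000"}
```

Because each node renders its own view of the list, adding a node to the data bag changes the configuration of all existing nodes on their next Chef run.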
Restrict Syncthing to private backnet
We have a dedicated backnet for all environments. Syncthing should only be allowed on this specific backnet (in our case eth1).
I’m using the iptables-ng cookbook to manage iptables.
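# Allow Syncthing in backnet only
# 22000/tcp is the sync protocol, 21027/udp is local discovery
# (matching localAnnouncePort in the configuration above)
iptables_ng_rule '50-syncthing' do
  rule ['-i eth1 --protocol tcp --dport 22000 --match state --state NEW --jump ACCEPT',
        '-i eth1 --protocol udp --dport 21027 --match state --state NEW --jump ACCEPT']
end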
Get the certificates and automate renewal
To actually request the certificate, the acme cookbook has you covered. It uses the Ruby ACME library acme-client under the hood.
# Get some bonus points for generating your own Diffie-Hellman parameters:
execute 'openssl dhparam -out /etc/nginx/certs/dhparam.pem 2048' do
  creates '/etc/nginx/certs/dhparam.pem'
  notifies :restart, 'service[nginx]'
end

# Make sure acme-client gem is installed
include_recipe 'letsencrypt::default'

# Create a webroot for acme challenges
directory '/etc/nginx/certs/acme' do
  owner user
  group user
end

# Deploy nginx site to answer ACME challenges
template '/etc/nginx/conf.d/letsencrypt.example.com.conf' do
  mode 0o644
  source 'letsencrypt.nginx.erb'
  notifies :restart, 'service[nginx]', :immediately
  not_if 'test -f /etc/nginx/certs/www.example.com.crt'
end

letsencrypt_certificate 'www.example.com' do
  alt_names %w(example.com)
  owner user
  group user
  fullchain '/etc/nginx/certs/www.example.com.crt'
  key '/etc/nginx/certs/www.example.com.key'
  method 'http'
  wwwroot '/etc/nginx/certs/acme'
  notifies :restart, 'service[nginx]'
end

# Remove temporary letsencrypt site
file '/etc/nginx/conf.d/letsencrypt.example.com.conf' do
  notifies :restart, 'service[nginx]', :immediately
  action :delete
end
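The cookbook takes care of renewals, but it can be handy to check by hand whether a certificate is about to expire. The following Ruby sketch uses the standard `openssl` library for that. The helper name `renewal_due?` and the 30-day threshold are assumptions for illustration and are not taken from the cookbook.

```ruby
require 'openssl'

SECONDS_PER_DAY = 86_400

# True when the certificate expires within threshold_days of now.
def renewal_due?(cert_pem, now: Time.now, threshold_days: 30)
  cert = OpenSSL::X509::Certificate.new(cert_pem)
  cert.not_after - now < threshold_days * SECONDS_PER_DAY
end

# Example call against the synchronized certificate:
# renewal_due?(File.read('/etc/nginx/certs/www.example.com.crt'))
```

Since the certificate lives in the shared directory, the result is the same on whichever node you run the check.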
The temporary letsencrypt.nginx.erb
server {
  # This is for HAproxy with proxy_protocol, adapt if necessary
  listen [::]:80 ipv6only=off proxy_protocol;

  # Serve well-known path for letsencrypt
  location /.well-known/acme-challenge {
    root /etc/nginx/certs/acme;
    default_type text/plain;
  }
}
Also make sure to include something like this in your actual nginx site configuration, so that the challenges of automatic renewals can be answered:
server {
  # This is for HAproxy with proxy_protocol, adapt if necessary
  listen [::]:80 ipv6only=off proxy_protocol;

  # Use remote_addr from proxy_protocol
  real_ip_header proxy_protocol;
  set_real_ip_from 10.13.37.0/24;

  # Serve well-known path for letsencrypt
  location /.well-known/acme-challenge {
    root /etc/nginx/certs/acme;
    default_type text/plain;
  }

  location / {
    return 301 https://<%= @domain %>$request_uri;
  }
}

server {
  # This is for HAproxy with proxy_protocol, adapt if necessary
  listen [::]:443 ssl http2 ipv6only=off proxy_protocol;
  [...]
}
Wrap up
That’s it! We can now automatically request and renew free Let’s Encrypt SSL certificates in our high-availability setup! Syncthing will happily keep the certificates and challenges in sync, even if some nodes go down. More nodes can be added by simply adding their credentials to the syncthing data bag, and the configuration of all nodes will adapt automatically.
If you have some feedback, feel free to contact me. I’m also available for hire as a freelancer.