
The Day an SSL Certificate Expired in Production
Every DevOps engineer eventually has that night. It usually starts quietly. A monitoring alert pops up. Maybe it’s a Slack notification. Maybe an uptime check failed.
At first it doesn’t look serious. Then another alert fires. And another. Someone opens the production website in a browser and sees the dreaded warning:
Your connection is not private.
The SSL certificate has expired. Suddenly things move fast. Slack channels fill with messages. Engineers jump into emergency calls. Someone logs into the load balancer. Someone else checks the certificate store.
The fix itself takes about five minutes:
- Renew the certificate
- Restart the web server
- Verify HTTPS works again
But the stress and disruption can last an hour or more.
The real lesson from incidents like this isn’t about SSL itself. It’s about automation. Manual certificate management might work for a few domains. But once you operate production infrastructure, it becomes fragile. That’s where the strategy to automate SSL renewal with Route53, Ansible, and Certbot comes in.
Why Manual SSL Management Breaks at Scale
Years ago, certificates lasted two or three years. Renewals were rare and easy to track. Today, the landscape has changed. Public TLS certificates now have much shorter lifetimes, often under 200 days; Let’s Encrypt certificates, for example, are valid for 90 days. This change improves security, but it increases operational overhead.
In modern infrastructure you may have:
- Dozens of services
- Multiple environments (dev, staging, production)
- Load balancers
- Container platforms
- Internal APIs
Suddenly you’re not managing one certificate.
You’re managing dozens or hundreds. Without automation, sooner or later one will expire.
The Solution: Automate SSL Renewal with Route53, Ansible, and Certbot
A reliable certificate automation system needs to handle three things:
- Issuing certificates
- Renewing them automatically
- Distributing them safely across servers
A practical and proven stack looks like this:
- Certbot – handles certificate issuance and renewal
- AWS Route53 – performs DNS validation (the same approach can be applied to other DNS providers; see the Certbot plugin page)
- Ansible – distributes certificates to production servers
This setup is simple, reliable, and widely used in real DevOps environments.
Architecture Overview
The workflow is straightforward once you see the pieces.
Crontab
│
│ Crontab nightly execution
▼
Certbot
│
│ DNS Validation
▼
Route53
│
▼
Let's Encrypt
│
▼
Certificate Issued
│
▼
Certificate Server
│
│ Deploy Hook
▼
Ansible Playbook
│
▼
Production Servers
The important idea here is centralized certificate management.
Instead of generating certificates on every server, one machine handles:
- issuing certificates
- renewing them
- distributing them via Ansible
This drastically reduces operational complexity.
1 – Preparing Route53 for DNS Validation
For DNS validation, Certbot needs permission to modify DNS records in Route53.
DNS validation works by creating a temporary TXT record that proves domain ownership.
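To make the TXT record concrete, here is a minimal Python sketch of how the ACME dns-01 challenge (RFC 8555) derives the record name and value. Certbot and the Route53 plugin do this for you; the function names and the sample key authorization string are illustrative assumptions, not part of Certbot's API.

```python
import base64
import hashlib


def challenge_record_name(domain: str) -> str:
    """Return the DNS name that holds the dns-01 TXT record for a domain."""
    # A wildcard such as *.example.com is validated at the base domain.
    base = domain[2:] if domain.startswith("*.") else domain
    return f"_acme-challenge.{base}"


def challenge_record_value(key_authorization: str) -> str:
    """Return the TXT value: unpadded base64url of SHA-256(key authorization)."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


# Hypothetical key authorization: "<token>.<account key thumbprint>".
print(challenge_record_name("*.example.com"))
print(challenge_record_value("sample-token.sample-thumbprint"))
```

Both example.com and *.example.com resolve to the same record name, which is why a wildcard certificate requires DNS validation rather than an HTTP check.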
First, create an AWS IAM policy allowing Route53 updates.
Simple example policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "route53:ListHostedZones",
        "route53:GetChange",
        "route53:ChangeResourceRecordSets"
      ],
      "Resource": "*"
    }
  ]
}
Advanced example policy (restricts record changes to a single hosted zone):
{
  "Version": "2012-10-17",
  "Id": "certbot-dns-route53 sample policy",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "route53:ListHostedZones",
        "route53:GetChange"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "route53:ChangeResourceRecordSets"
      ],
      "Resource": [
        "arn:aws:route53:::hostedzone/YOURHOSTEDZONEID"
      ]
    }
  ]
}
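If you manage several hosted zones, writing the scoped policy by hand gets repetitive. The sketch below generates the same structure as the advanced policy for any list of zone IDs. The function name and the zone ID shown are placeholders, and the output still has to be attached to an IAM user or role yourself.

```python
import json


def route53_policy(hosted_zone_ids):
    """Build a least-privilege policy document for the given hosted zones."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read-only calls that cannot be scoped to one zone.
                "Effect": "Allow",
                "Action": ["route53:ListHostedZones", "route53:GetChange"],
                "Resource": ["*"],
            },
            {
                # Record changes are limited to the listed zones.
                "Effect": "Allow",
                "Action": ["route53:ChangeResourceRecordSets"],
                "Resource": [
                    f"arn:aws:route53:::hostedzone/{zone_id}"
                    for zone_id in hosted_zone_ids
                ],
            },
        ],
    }


# YOURHOSTEDZONEID is the same placeholder used in the policy above.
print(json.dumps(route53_policy(["YOURHOSTEDZONEID"]), indent=2))
```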
Next, configure AWS credentials on the certificate server:
~/.aws/credentials
Example:
[default]
aws_access_key_id=ACCESS_KEY
aws_secret_access_key=SECRET_KEY
Always secure this file:
chmod 600 ~/.aws/credentials
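A quick way to confirm the permissions stuck is to check that neither group nor others have any access bits set. This is a small sketch for POSIX systems; the helper name is an assumption, and the same check works for private key files.

```python
import os
import stat


def is_owner_only(path):
    """Return True when group and others have no access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # 0o077 masks the group and other permission bits.
    return mode & 0o077 == 0
```

For example, is_owner_only(os.path.expanduser("~/.aws/credentials")) should return True after the chmod above.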
2 – Installing Certbot with Route53 Support
Certbot supports DNS validation through plugins.
Install the Route53 plugin:
pip3 install certbot-dns-route53
Verify the plugin:
certbot plugins
You should see:
dns-route53
This plugin allows Certbot to automatically create and remove DNS validation records.
3 – Issuing Certificates Automatically
Now issue your first certificate.
Example command:
certbot certonly \
--dns-route53 \
-d example.com \
-d "*.example.com"
Here’s what happens behind the scenes:
- Certbot requests a certificate from Let’s Encrypt.
- A DNS TXT record is created in Route53.
- Let’s Encrypt verifies the record.
- The certificate is issued.
Certificates are stored here:
/etc/letsencrypt/live/example.com/
Key files include:
fullchain.pem
privkey.pem
These are the files web servers use for TLS.
4 – Setting Up Automated Renewals
Certbot automatically checks certificate expiration.
Add a cron job:
0 3 * * * certbot renew --quiet
This runs every night at 3 AM.
Certbot renews a certificate only when it is close to expiry, usually within 30 days of expiration, so running the check every night is safe.
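The renewal rule is easy to reason about with a small simulation. The sketch below is not Certbot's code; it assumes a 90-day certificate and the 30-day window mentioned above, and shows on which nightly run the renewal would first trigger.

```python
from datetime import datetime, timedelta, timezone


def needs_renewal(not_after, now, window=timedelta(days=30)):
    """Renew once less than `window` remains before expiry."""
    return not_after - now < window


def first_renewal_day(issued, lifetime_days=90):
    """Return the first nightly check (days after issuance) that renews."""
    not_after = issued + timedelta(days=lifetime_days)
    for day in range(lifetime_days + 1):
        if needs_renewal(not_after, issued + timedelta(days=day)):
            return day
    return None


issued = datetime(2025, 1, 1, 3, 0, tzinfo=timezone.utc)
print(first_renewal_day(issued))  # 61: the first night with under 30 days left
```

Under these assumptions a failed renewal still leaves about a month of nightly retries before the certificate actually expires.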
But renewal alone isn’t enough. Servers still need the updated certificate.
5 – Distributing Certificates Using Ansible
This is where Ansible becomes extremely useful.
When a certificate renews, we trigger an Ansible playbook.
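Example deploy hook, which Certbot runs once for each certificate it successfully renews (add the option to the renew command in the cron job so scheduled renewals trigger it):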
certbot renew --deploy-hook "/usr/local/bin/deploy-certs.sh"
Example script:
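#!/bin/bash
set -euo pipefail  # stop on the first error so a failed deployment is not silent
# Assumed playbook location; hooks do not run from your project directory, so use an absolute path.
ansible-playbook /etc/ansible/deploy-certs.yml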
Example Ansible playbook:
- hosts: webservers
  become: yes
  tasks:
    - name: Copy certificate
      copy:
        src: /etc/letsencrypt/live/example.com/fullchain.pem
        dest: /etc/nginx/ssl/fullchain.pem
        mode: "0644"
    - name: Copy private key
      copy:
        src: /etc/letsencrypt/live/example.com/privkey.pem
        dest: /etc/nginx/ssl/privkey.pem
        mode: "0600"  # private key readable by its owner only
    - name: Reload nginx
      service:
        name: nginx
        state: reloaded
Now every server receives the new certificate automatically.
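The playbook above hardcodes example.com. If the certificate server manages several certificates, the hook can pass along which one was renewed: Certbot exports RENEWED_LINEAGE (the live directory of the renewed certificate) and RENEWED_DOMAINS (its domains, space-separated) to deploy hooks. The sketch below builds the ansible-playbook command from those variables. The variable names cert_dir and cert_domains are hypothetical and would need matching references in your playbook.

```python
import json


def build_deploy_command(environ, playbook="deploy-certs.yml"):
    """Build an ansible-playbook command from Certbot's deploy-hook variables."""
    lineage = environ["RENEWED_LINEAGE"]
    domains = environ.get("RENEWED_DOMAINS", "").split()
    # JSON extra-vars keep paths and lists intact without shell quoting issues.
    extra_vars = json.dumps({"cert_dir": lineage, "cert_domains": domains})
    return ["ansible-playbook", playbook, "--extra-vars", extra_vars]


# Inside a real hook you would run:
#   subprocess.run(build_deploy_command(os.environ), check=True)
sample_env = {
    "RENEWED_LINEAGE": "/etc/letsencrypt/live/example.com",
    "RENEWED_DOMAINS": "example.com *.example.com",
}
print(build_deploy_command(sample_env))
```

Returning the command as a list rather than a string avoids shell interpolation of the wildcard domain.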
6 – Building a Fully Automated Renewal Pipeline
Once everything is connected, the entire process becomes automatic.
Cron Job
│
▼
Certbot Renewal Check
│
▼
Certificate Renewed
│
▼
Deploy Hook Triggered
│
▼
Ansible Distribution
│
▼
Servers Reload TLS
From that point on, certificates renew quietly in the background.
- No alerts.
- No midnight incidents.
- No manual work.
Lessons Learned from Production
After implementing certificate automation across several environments, a few lessons stand out.
- Centralize certificate issuance: one certificate management host simplifies everything.
- Always test with Let’s Encrypt staging: avoid rate limits while testing automation.
- Monitor expiration anyway: automation reduces risk, but monitoring provides safety.
- Protect private keys carefully: permissions should always be restricted.
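For the monitoring point, an independent check that looks at the certificate a server actually presents will catch cases where renewal worked but distribution did not. Here is a minimal sketch using only the Python standard library; the function names and the 14-day threshold are assumptions to adapt to your own alerting.

```python
import socket
import ssl
import time


def days_left(not_after, now):
    """Whole days between `now` (epoch seconds) and a certificate notAfter string."""
    return int((ssl.cert_time_to_seconds(not_after) - now) // 86400)


def fetch_not_after(host, port=443, timeout=5.0):
    """Return the notAfter field of the certificate served at host:port."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def expiry_warning(host, warn_days=14):
    """Return a warning string when the certificate is close to expiry, else None."""
    remaining = days_left(fetch_not_after(host), time.time())
    if remaining < warn_days:
        return f"{host}: certificate expires in {remaining} days"
    return None
```

Calling expiry_warning("example.com") from a scheduled job and forwarding any non-empty result to your alerting channel is enough for a basic safety net. A certificate that has already expired fails the TLS handshake with an ssl.SSLCertVerificationError, which should be treated as an alert too.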
Summary
SSL outages are rarely caused by complex failures. Most of the time, they happen because a certificate simply expired.
Automation removes that risk.
By implementing a system to automate SSL renewal with Route53, Ansible, and Certbot, teams gain:
- Reliable certificate renewals
- Consistent deployment across servers
- Fewer production incidents
Most importantly, engineers no longer need to wake up in the middle of the night to fix something that should have been automated.
Set it up once, test it properly, and let the system handle the rest.