Every team has a certificate renewal story that ends with a 2am page and a scramble through a wiki page last updated in 2019. The process sounds simple until you're managing certificates across three cloud providers, two CAs, and a Kubernetes cluster that somebody set up before they left the company. Certificate renewal at scale isn't a single operation. It's a category of operations, each with its own failure modes, and the industry is about to make all of them more frequent.
This guide covers what actually happens during renewal, how to automate it, and what breaks when you're responsible for more than a handful of certs. If you manage fewer than ten certificates, the vendor docs will serve you fine. If you manage fifty or more, keep reading.
What certificate renewal actually involves
Certificate renewal replaces an expiring TLS certificate with a new one for the same identity, but the mechanics vary significantly depending on the CA, cert type, and whether you reuse keys. Industry data indicates that roughly 60% of certificate-related outages trace back to renewal process failures, not initial provisioning. Understanding the distinction between renewal, reissuance, and rekeying prevents the confusion that leads to those outages.
TLS/SSL certificate renewal vs. reissuance
Renewal, reissuance, and rekeying are three distinct operations, though most CAs use the terms loosely:
- SSL certificate renewal extends coverage with a new certificate and new validity period. The CA may or may not require a new CSR.
- Certificate reissuance generates a new certificate mid-term, typically because you need to change SANs or your key was compromised.
- Certificate rekeying specifically means generating a new key pair and getting a cert issued against it.
The practical difference matters when you're automating. If your pipeline assumes renewal never changes the key, you'll break certificate pinning configurations. If it assumes the SANs stay identical, you'll miss cases where a reissuance added a subdomain that your monitoring doesn't cover.
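One way to catch the SAN assumption before deployment is to compare the name lists of the outgoing and incoming certificates. The sketch below assumes OpenSSL 1.1.1 or later (for the -ext option) and only looks at DNS names; the function names and the old.crt/new.crt file names are placeholders, not part of any tool.

```shell
# Extract DNS SANs, one per line and sorted, from the text that
# "openssl x509 -noout -ext subjectAltName" prints (read from stdin).
parse_sans() {
  tr ',' '\n' | sed -n 's/^[[:space:]]*DNS://p' | sort
}

# Print the sorted DNS SANs of a PEM certificate file.
san_list() {
  openssl x509 -in "$1" -noout -ext subjectAltName | parse_sans
}

# Succeeds when both certificates carry the same DNS SANs;
# otherwise prints both lists and returns 1.
san_diff() {
  local old new
  old="$(san_list "$1")"
  new="$(san_list "$2")"
  [ "$old" = "$new" ] && return 0
  printf 'old:\n%s\nnew:\n%s\n' "$old" "$new"
  return 1
}

# Usage (placeholder file names):
#   san_diff old.crt new.crt || echo "SAN list changed, update monitoring"
```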
Certificate types and their renewal workflows
DV, OV, and EV certificates each follow different renewal workflows due to their validation requirements:
| Certificate type | Automation level | Validation required | Typical renewal time |
|---|---|---|---|
| DV certs | Fully automatable via ACME | Domain control only (HTTP-01, DNS-01, or email) | Minutes |
| OV/EV certs | Partially automatable | Organization validation with human review (typically annual) | Hours to days |
| Internal/mTLS certs | Fully automatable with your own CA | Controlled by your step-ca or Active Directory CA policy | Minutes |
DV certificates renew with domain validation only, which is why ACME automates them end to end. OV and EV certificates require organization validation steps involving human review, making full automation impossible. Internal PKI and mTLS certificates follow whatever policy your CA enforces — cert-manager or step-ca can automate these, but you own the root of trust and the rotation logic.
Why certificates expire (and why 90-day lifetimes are winning)
Certificates expire because revocation doesn't work reliably enough to be the only safety net. CRL distribution is slow, OCSP has availability problems, and according to Netcraft's measurements, OCSP stapling fails silently in roughly 8% of configurations. Short certificate lifetimes reduce the window during which a compromised key remains trusted. This isn't theoretical — it's the actual security model the industry has converged on.
The security case for short-lived certificates
Let's Encrypt set the standard at 90 days in 2015 and proved that short-lived certificates work at internet scale, now protecting over 360 million domains. The security logic is straightforward:
- A 90-day certificate compromised on day one gives an attacker at most 90 days of exposure
- A one-year certificate compromised on day one gives an attacker up to 365 days of exposure
- Revocation mechanisms (CRL, OCSP) frequently fail to close that window in practice
If a key is compromised and revocation fails — which it often does — the exposure window is bounded only by the certificate's remaining validity period.
CA/Browser Forum changes and what's coming
The CA/Browser Forum passed ballot SC-081 in 2025, setting a phased reduction in maximum TLS certificate validity:
| Effective date | Maximum certificate validity |
|---|---|
| Before March 2026 | 398 days |
| March 2026 | 200 days |
| March 2027 | 100 days |
| March 2029 | 47 days |
The operational impact is significant. If you're renewing certificates manually today, you're doing it once a year per cert. By 2029, you'll be doing it roughly eight times per year per cert. For a fleet of 200 certificates, that's 1,600 renewal events annually. The math makes the case for automated certificate renewal better than any blog post can.
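The arithmetic behind those numbers can be sketched in a few lines of shell. It assumes each certificate is replaced only at expiry and rounds up to whole renewals; the function names are illustrative.

```shell
# Renewals per certificate per year if each cert is replaced only at expiry
# (integer ceiling of 365 / validity_days).
renewals_per_year() {
  echo $(( (365 + $1 - 1) / $1 ))
}

# Renewal events per year across a fleet: fleet_size * renewals per cert.
fleet_renewals_per_year() {
  echo $(( $1 * $(renewals_per_year "$2") ))
}

# Walk through the SC-081 schedule for a 200-certificate fleet.
for days in 398 200 100 47; do
  printf '%d-day maximum: %d per cert, %d across 200 certs\n' \
    "$days" "$(renewals_per_year "$days")" "$(fleet_renewals_per_year 200 "$days")"
done
```

Renewing ahead of expiry, as most automation does, pushes these counts higher still.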
Manual certificate renewal: step by step
Manual TLS certificate renewal follows four steps: generate a CSR, submit it to your CA with validation, install the new cert, and verify the chain. The entire process takes 15–60 minutes per certificate depending on the validation type. Multiply that by your cert count to understand why this section exists mainly so you know what to automate.
Step 1: Generate a CSR
openssl req -new -newkey rsa:2048 -nodes \
-keyout example.com.key \
-out example.com.csr \
-subj "/CN=example.com/O=Your Org/L=City/ST=State/C=US"
If you're renewing with the same key (not recommended, but sometimes required by policy), drop -newkey rsa:2048 and use -key existing.key instead. Key reuse saves you from updating pinning configs but extends the exposure window if that key was ever compromised.
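Whichever way you go, it helps to confirm which key a certificate was actually issued for. A minimal sketch, assuming PEM files and using a common technique (hash the public key from each side and compare); the function names are made up for illustration.

```shell
# Hash of the public key embedded in a PEM certificate.
cert_pubkey_hash() {
  openssl x509 -in "$1" -noout -pubkey | openssl sha256
}

# Hash of the public key derived from a PEM private key.
key_pubkey_hash() {
  openssl pkey -in "$1" -pubout | openssl sha256
}

# Succeeds when the certificate was issued for this private key.
# Empty or missing files are rejected so two failed reads never "match".
cert_matches_key() {
  [ -s "$1" ] && [ -s "$2" ] || return 2
  [ "$(cert_pubkey_hash "$1")" = "$(key_pubkey_hash "$2")" ]
}

# Usage (file names follow the example above; example.com.crt is the issued cert):
#   cert_matches_key example.com.crt example.com.key && echo "cert and key match"
```

Running it with the old key against the new certificate tells you whether the renewal reused the key.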
Step 2: Submit to your CA and validate
Upload the CSR to your CA's portal or API. Validation methods differ by cert type:
- HTTP-01: Place a file on your webserver at a CA-specified path
- DNS-01: Create a TXT record in your domain's DNS
- Email: Respond to a verification email sent to a domain admin address
- OV/EV: One of the domain validation methods above, plus organization checks such as phone verification and document review
Step 3: Install the renewed certificate
The installation step is where most manual renewals fail. The cert file alone isn't enough — you need the full chain in the correct order.
# Combine cert and chain for Nginx
cat example.com.crt intermediate.crt > fullchain.pem
# Reload Nginx without downtime
nginx -t && systemctl reload nginx
For AWS ALB, upload via the CLI: aws acm import-certificate. For Kubernetes Ingress, update the TLS secret. The critical gotcha: forgetting to restart or reload the service after installing the new cert. After monitoring thousands of renewal events, I've seen teams update the file on disk and close the ticket, only to get paged when the old cert still in memory expires.
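A quick guard against the forgotten reload is to compare the certificate on disk with the one the server is actually presenting. This is a sketch under the assumption that the file's first certificate is the leaf (true for the fullchain.pem built above); the function names are illustrative.

```shell
# SHA-256 fingerprint of the first certificate in a PEM file.
file_fingerprint() {
  openssl x509 -in "$1" -noout -fingerprint -sha256
}

# SHA-256 fingerprint of the leaf certificate a server presents.
# $1 = hostname, $2 = port (defaults to 443).
served_fingerprint() {
  openssl s_client -connect "$1:${2:-443}" -servername "$1" </dev/null 2>/dev/null |
    openssl x509 -noout -fingerprint -sha256
}

# Succeeds only when the server presents the certificate that is on disk.
# $1 = certificate file, $2 = hostname, $3 = port (defaults to 443).
reload_took_effect() {
  local on_disk served
  on_disk="$(file_fingerprint "$1")" || return 2
  served="$(served_fingerprint "$2" "${3:-443}")" || return 2
  [ -n "$on_disk" ] && [ "$on_disk" = "$served" ]
}

# Usage:
#   reload_took_effect fullchain.pem example.com || echo "server still has the old cert"
```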
Step 4: Verify the chain and test
# Check cert dates and chain
openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null | openssl x509 -noout -dates -issuer
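# Verify the leaf against the CA bundle, passing the intermediate as untrusted
# (openssl verify checks only the first certificate in a file, so fullchain.pem alone skips the intermediate)
openssl verify -CAfile ca-bundle.crt -untrusted intermediate.crt example.com.crt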
Test from outside your network. CDNs and load balancers cache certificates, and a successful local test doesn't mean your edge nodes picked up the change.
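For a scripted pass/fail check on expiry, openssl's -checkend option is a convenient building block. The wrapper below is a sketch; the function name and the 30-day threshold in the usage line are illustrative choices.

```shell
# Succeeds (exit 0) when the PEM certificate in $1 expires within $2 days.
# -checkend takes seconds and exits non-zero if the cert will have expired
# by then. An unreadable certificate is also reported as expiring (fail closed).
expires_within_days() {
  [ -s "$1" ] || { echo "cannot read $1" >&2; return 2; }
  if openssl x509 -in "$1" -noout -checkend $(( $2 * 86400 )) >/dev/null; then
    return 1
  fi
  return 0
}

# Usage:
#   expires_within_days fullchain.pem 30 && echo "renew now"
# The same option works on a served certificate:
#   openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null \
#     | openssl x509 -noout -checkend 2592000
```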
Automated certificate renewal with ACME
The ACME protocol (RFC 8555) automates domain validation and certificate issuance; ACME clients build key generation, installation, and scheduled renewal on top of it, which together cover the entire certificate lifecycle. According to Let's Encrypt's published data, over 300 million certificates are currently managed via ACME through Let's Encrypt alone. If you're still renewing DV certs manually, this section is your exit ramp.
How the ACME protocol works
ACME is a challenge-response protocol, carried as JSON over HTTPS, that automates certificate issuance in four steps:
- Client contacts the CA and requests a certificate for a specific domain
- CA issues a challenge (HTTP-01 or DNS-01) to prove domain control
- Client completes the challenge and notifies the CA
- CA validates the challenge, the client submits a CSR to finalize the order, and the CA issues the signed certificate
The client handles CSR generation internally, removing the manual step entirely.
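To make the challenge step concrete, here is roughly what a client has to publish for each challenge type, following the formats defined in RFC 8555. The function names are illustrative, and the token and account key thumbprint are supplied by the CA and the client respectively, so they appear here only as arguments.

```shell
# Key authorization = token + "." + account key thumbprint.
key_authorization() {
  printf '%s.%s' "$1" "$2"
}

# HTTP-01: URL the CA fetches; the response body must be the key authorization.
http01_url() {
  printf 'http://%s/.well-known/acme-challenge/%s' "$1" "$2"
}

# DNS-01: name of the TXT record the CA looks up
# (for a wildcard, pass the base domain without the "*.").
dns01_record_name() {
  printf '_acme-challenge.%s' "$1"
}

# DNS-01: TXT value = base64url(SHA-256(key authorization)), without padding.
dns01_record_value() {
  printf '%s' "$1" | openssl dgst -sha256 -binary | openssl base64 -A |
    tr '+/' '-_' | tr -d '='
}

# Usage with placeholder values:
#   ka="$(key_authorization "$token" "$thumbprint")"
#   echo "$(dns01_record_name example.com) TXT $(dns01_record_value "$ka")"
```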
Certbot, acme.sh, and alternatives
Choosing an ACME client depends on your environment:
| ACME client | Language | Best for | Key advantage |
|---|---|---|---|
| Certbot | Python | Traditional VM deployments | Reference client with Nginx/Apache plugins |
| acme.sh | Shell | Minimal or containerized environments | Zero dependencies, supports 70+ DNS providers |
| lego | Go | CI/CD pipelines | Single binary, easy to embed |
| step-ca | Go | Internal PKI | A private CA that speaks ACME (a server rather than a client), so the clients above also work for private certificates |
A working Certbot renewal with hooks:
certbot renew --deploy-hook "systemctl reload nginx" \
--pre-hook "echo 'Starting renewal' | logger" \
--post-hook "echo 'Renewal complete' | logger"
The certbot renew command checks all managed certs and renews those within 30 days of expiry. Add it to a daily cron and the process runs unattended.
DNS-01 vs HTTP-01 challenge tradeoffs
HTTP-01 is simpler but requires port 80 access on every server. DNS-01 works for wildcard certs and servers behind firewalls, but introduces DNS API dependencies. At scale, DNS-01 has specific pain points:
- Rate limits: Cloudflare limits API requests to 1,200 per 5 minutes
- Propagation delays: TXT record propagation can cause validation timeouts
- Credential sprawl: Managing DNS API credentials for multiple providers across environments adds complexity
For a deeper look at protocol mechanics, see our ACME protocol guide.
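For the propagation-delay problem specifically, a bounded polling loop before telling the CA to validate avoids most timeouts. This is a sketch: the function names are made up, the default resolver 1.1.1.1 is an arbitrary choice (querying your authoritative nameservers is usually more reliable), and the lookup command is passed in as a parameter so it can be replaced.

```shell
# Query one resolver for TXT values; dig +short prints one quoted value per line.
lookup_txt() {
  dig +short TXT "$1" "@${2:-1.1.1.1}" | tr -d '"'
}

# Poll until the expected TXT value is visible or the attempts run out.
# $1 = record name, $2 = expected value, $3 = attempts (default 10),
# $4 = seconds between attempts (default 5), $5 = lookup command (default lookup_txt).
wait_for_txt() {
  local name="$1" expected="$2" attempts="${3:-10}" delay="${4:-5}" lookup="${5:-lookup_txt}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$lookup" "$name" | grep -Fxq -- "$expected"; then
      return 0
    fi
    sleep "$delay"
    i=$(( i + 1 ))
  done
  return 1
}

# Usage:
#   wait_for_txt _acme-challenge.example.com "$txt_value" 12 10 || echo "record never appeared"
```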
Certificate renewal in Kubernetes and cloud environments
cert-manager is the de facto standard for Kubernetes certificate renewal, running in over 40% of Kubernetes clusters according to CNCF survey data. It watches Certificate resources and renews at 2/3 of the certificate's lifetime by default. Cloud providers offer their own auto-renewal for managed certificates, but each has different behaviors and silent failure modes.
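The 2/3 rule is easier to reason about with numbers. The sketch below compares it with a fixed 30-day-before-expiry window (the Certbot default described earlier) for two certificate lifetimes; the function names are illustrative.

```shell
# Day (counted from issuance) on which a "2/3 of lifetime" policy renews.
cert_manager_renewal_day() {
  echo $(( $1 * 2 / 3 ))
}

# Day on which a fixed "N days before expiry" policy renews.
# $1 = lifetime in days, $2 = window in days (default 30).
fixed_window_renewal_day() {
  local day=$(( $1 - ${2:-30} ))
  [ "$day" -lt 0 ] && day=0
  echo "$day"
}

for lifetime in 90 47; do
  printf '%d-day cert: 2/3 rule renews on day %d, a 30-day window on day %d\n' \
    "$lifetime" "$(cert_manager_renewal_day "$lifetime")" \
    "$(fixed_window_renewal_day "$lifetime" 30)"
done
```

For 90-day certificates both policies land on day 60. At 47 days a fixed 30-day window would renew on day 17, which suggests proportional thresholds age better as lifetimes shrink.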
cert-manager for Kubernetes
cert-manager manages Certificate resources that reference Issuers (namespace-scoped) or ClusterIssuers (cluster-wide). When a cert reaches the renewal window, cert-manager automatically:
- Generates a new CSR
- Completes the ACME challenge
- Updates the Kubernetes Secret with the new certificate
The critical failure mode to watch for: if the Issuer's credentials expire or the DNS solver loses permissions, cert-manager logs errors but your certs silently age toward expiry. For the full setup, see our Kubernetes certificate renewal guide.
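A cheap way to surface that failure mode is to list every Certificate that is not currently Ready and feed the result into whatever alerting you already have. This sketch assumes kubectl and jq are installed; the function name is illustrative.

```shell
# Read "kubectl get certificates -A -o json" on stdin and print namespace/name
# for every Certificate that does not currently have a Ready=True condition
# (including ones that have no status yet).
not_ready_certs() {
  jq -r '.items[]
    | select(([.status.conditions[]? | select(.type == "Ready" and .status == "True")] | length) == 0)
    | "\(.metadata.namespace)/\(.metadata.name)"'
}

# Usage, assuming kubectl is pointed at the cluster:
#   kubectl get certificates --all-namespaces -o json | not_ready_certs
```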
AWS ACM, GCP CAS, and Azure Key Vault auto-renewal
| Provider | Service | Auto-renewal | Failure notification | Covers |
|---|---|---|---|---|
| AWS | ACM | Yes, for DNS-validated certs | CloudWatch event on failure | ALB, CloudFront, API Gateway |
| GCP | Certificate Manager | Yes, for Google-managed certs | Cloud Monitoring alert | Load Balancers |
| Azure | Key Vault | Yes, configurable at 80% lifetime | Event Grid notification | App Gateway, Front Door |
The common trap: assuming "auto-renewal" means "never think about it." In practice, every provider has silent failure scenarios:
- AWS ACM auto-renewal fails silently if the CNAME validation record gets deleted
- Azure Key Vault won't renew if the cert policy doesn't match the issuer's requirements
- GCP Certificate Manager requires the domain authorization to remain valid
Every cloud provider's auto-renewal has at least one scenario where it fails without an obvious alert.
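For ACM, one way to avoid relying on notifications alone is to poll the renewal status yourself. A minimal sketch, assuming the AWS CLI is configured; the function names and the choice of which statuses count as a problem are illustrative.

```shell
# Print the renewal status ACM reports for one certificate.
# Prints "None" when ACM has no renewal summary for it yet.
acm_renewal_status() {
  aws acm describe-certificate --certificate-arn "$1" \
    --query 'Certificate.RenewalSummary.RenewalStatus' --output text
}

# Succeeds when a status means a human should look at the certificate.
needs_attention() {
  case "$1" in
    PENDING_VALIDATION|FAILED) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   status="$(acm_renewal_status "$cert_arn")"
#   needs_attention "$status" && echo "ACM renewal stuck: $status"
```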
Service mesh and mTLS certificate rotation
Istio and Linkerd handle mTLS certificate rotation for workload identities automatically, but the root CA and intermediate certs still require manual rotation. Istio's default root cert expires after 10 years, which sounds like someone else's problem until you realize your cluster is four years old and nobody documented the rotation procedure. Workload certificate rotation happens automatically; trust anchor rotation is a manual, high-risk operation.
Certificate renewal at scale: what breaks after 50 certs
Managing certificate renewal across a fleet means tracking expiration dates, CA relationships, and deployment targets for every cert in your certificate inventory. In our experience managing enterprise certificate estates, the average mid-market company has 15–20% more certificates than they think they do, and at least one will be a wildcard cert that somebody provisioned through a personal account three years ago.
Tracking expiration across multiple CAs and environments
The spreadsheet approach breaks down around 50 certificates. Beyond that threshold, you need programmatic discovery:
- Prometheus blackbox exporter probes endpoints and exports probe_ssl_earliest_cert_expiry as a metric
- Certificate Transparency logs via crt.sh provide a view of publicly issued certs for your domains
- Network scanning catches certs on servers not exposed to external monitoring
- CA API integration pulls renewal status directly from each certificate authority
Neither CT logs nor endpoint probing catches internal certs or certs sitting on servers that aren't exposed to your monitoring. For certificate monitoring that actually covers your full estate, you need a combination of all four approaches. This is the operational problem that motivated us to build CertPulse: the gap between "we have monitoring" and "we know about every cert."
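As a starting point for the CT log part of discovery, crt.sh exposes a JSON endpoint you can query per domain. The sketch below assumes curl and jq are available; the function names are illustrative, and crt.sh can be slow or rate-limited, so treat it as a periodic batch job rather than a live check.

```shell
# Read crt.sh JSON on stdin and print each distinct hostname once.
# A single name_value field can hold several names separated by newlines.
ct_hostnames() {
  jq -r '.[].name_value' | sort -u
}

# Fetch CT log entries for a domain's subdomains from crt.sh
# (%25 is a URL-encoded "%", which crt.sh treats as a wildcard).
ct_fetch() {
  curl -fsS "https://crt.sh/?q=%25.$1&output=json"
}

# Usage:
#   ct_fetch example.com | ct_hostnames
```

Diffing the output against your inventory is one way to spot certificates nobody recorded.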
Renewal failures you won't catch without monitoring
After monitoring certificate renewals across thousands of environments, these are the most common silent failure patterns:
| Failure type | What happens | Why it's hard to detect |
|---|---|---|
| CDN cache masking | CDN serves cached cert after origin renewal fails | Everything looks fine until the CDN cache expires and clients see the expired cert |
| Intermediate chain rot | CA rotates intermediates; server still serves the old one | Android clients break first because they don't fetch intermediates automatically |
| Orphaned non-ACME certs | 95% of certs auto-renew via Certbot; the five OV certs from a vendor portal three years ago do not | They're not in your automation inventory |
| DNS permission drift | ACME DNS-01 validation fails because someone tightened IAM policies | Renewal service silently lost write access to Route 53 |
| Silent cert-manager failures | cert-manager logs renewal failed but no alert fires | Nobody configured alerting on CertificateRequest denied events |
Building a renewal runbook
Your renewal runbook should answer three questions for every certificate in your fleet:
- What's expiring? — Certificate identity, SANs, and expiration date
- Who owns it? — Team, individual, and escalation path
- What's the renewal method? — ACME automated, cloud managed, or manual with specific CA
Keep the runbook next to your incident response docs, not buried in a wiki. Include rollback procedures for the scenario where a renewed cert breaks clients.
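Those three questions map naturally onto a flat inventory file that scripts can query. The sketch below assumes a hypothetical CSV layout (name,owner,method,not_after with ISO dates and a header row); neither the layout nor the function name comes from any particular tool.

```shell
# Print inventory rows whose expiry date is on or before the cutoff.
# Input: CSV on stdin with a header line and columns name,owner,method,not_after.
# ISO dates (YYYY-MM-DD) compare correctly as plain strings.
expiring_by() {
  awk -F, -v cutoff="$1" \
    'NR > 1 && $4 <= cutoff { print $1 " (" $2 ", " $3 ") expires " $4 }'
}

# Usage (GNU date syntax for "30 days from now"):
#   expiring_by "$(date -d '+30 days' +%F)" < inventory.csv
```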
Certificate renewal checklist
| Step | Manual | ACME automated | Cloud managed |
|---|---|---|---|
| Pre-renewal | | | |
| Inventory cert and confirm owner | Yes | Verify automation config | Verify auto-renewal enabled |
| Decide: new key or reuse | Yes | Client decides (default: new) | Provider decides |
| Check SAN list is current | Yes | Review Certbot config | Review ACM/Key Vault settings |
| During renewal | | | |
| Generate CSR | openssl req | Automatic | Automatic |
| Complete validation | Manual DNS/HTTP/email | Automatic challenge | Automatic (if CNAME intact) |
| Install cert + full chain | Manual copy + reload | Deploy hook | Automatic propagation |
| Post-renewal | | | |
| Verify chain externally | openssl s_client | Monitoring check | Endpoint probe |
| Confirm monitoring picks up new expiry | Update tracking | Auto-detected | CloudWatch/Event Grid |
| Document what changed | Update runbook | Commit config changes | Tag resource |
FAQ
How far in advance should I renew a certificate? Start renewal 30 days before expiry for manual renewals to leave room for validation delays and troubleshooting. Certbot defaults to renewing at 30 days remaining. cert-manager renews at 2/3 of the total lifetime — for 90-day certs, that means renewal happens around day 60.
Does certificate renewal generate a new private key? It depends on your configuration. Certbot generates a new key by default on each renewal. Some CAs allow key reuse during renewal. Generating a new key is generally recommended because it limits the impact window if the previous key was compromised without your knowledge.
Will my site go down during certificate renewal? No, not if you reload rather than restart your web server. Both Nginx and Apache support graceful reloads that swap the certificate without dropping active connections. The risk is in the gap between installing the cert and reloading the service — automate both steps together to eliminate it.
What happens if a certificate renewal fails silently? The old certificate continues serving until it expires, then clients see ERR_CERT_DATE_INVALID or equivalent errors. If a CDN sits in front of your origin, the CDN's cached cert may mask the failure for hours or days. This is why external certificate expiration monitoring matters more than checking your ACME client's logs.
How do I handle certificate renewal for hundreds of certificates across multiple CAs? You need three things: a complete inventory (discovered, not just documented), automated renewal for everything that supports it, and monitoring that alerts on expiry regardless of the renewal method. The SSL certificate management challenge isn't any single renewal; it's knowing that every renewal across your fleet actually succeeded.
This is why we built CertPulse
CertPulse connects to your AWS, Azure, and GCP accounts, enumerates every certificate, monitors your external endpoints, and watches Certificate Transparency logs. One dashboard for every cert. Alerts when auto-renewal fails. Alerts when certs approach expiry. Alerts when someone issues a cert for your domain that you didn't request.
If you're looking for complete certificate visibility without maintaining scripts, we can get you there in about 5 minutes.