Let's Encrypt Renew Failing, Looking for non-existent Cert hash

So this server has the following certificates on it:

-rw-r--r-- 1 root root 3925 Aug 11 02:55 example.com-2dc63c4-bundled.cert
-rw-r--r-- 1 root root 3900 Nov  9 22:19 example.com-4628d75-bundled.cert
-rw-r--r-- 1 root root 3900 Nov  9 22:19 example.com-bundled.cert
-rw-r--r-- 1 root root 3900 Jun 30  2020 example.com-c9bbd7f-bundled.cert
-rw------- 1 root root 3243 Feb 27  2020 example.com.key

However, renew-certs.py is looking for:

./renew-certs.py:letsencrypt_cert_ids = {'example.com': '3224635'}

Which, of course, doesn’t exist.

Is it safe, as the Nov 9 certs are probably still valid, to just manually update the hash in renew-certs.py to match the cert?

I think I’m using Trellis v1.5.0, at least that’s the last version number in the Changelog, below HEAD.

Thanks, guys.

You said “renew failing”, what actually happens?

A cert with that hash not existing is completely normal; the renew script just creates it if it doesn’t exist.

Also, certs last for 3 months, so you have another 5 days to get it sorted.
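The "5 days" figure can be sanity-checked with a little date arithmetic. This is only a minimal sketch: it assumes the 90-day lifetime mentioned above, that the Nov 9 file timestamp is the issue date, and that the year is 2020 (the listing does not show it). `days_remaining` is a hypothetical helper, not part of Trellis.

```python
from datetime import date, timedelta

# Assumption: the 90-day ("3 months") certificate lifetime mentioned above.
CERT_LIFETIME = timedelta(days=90)


def days_remaining(issued, today):
    """Return how many days are left before a cert issued on `issued` expires."""
    return (issued + CERT_LIFETIME - today).days


# Assumed dates: the year and "today" are guesses based on the directory listing.
print(days_remaining(date(2020, 11, 9), date(2021, 2, 2)))  # prints 5
```

For an exact expiry date, reading the certificate itself is more reliable than relying on the file timestamp.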

Thanks, Scott. Not sure exactly what you mean by “what happens”, but when I run ./renew-certs.py manually, it runs on python2 and returns:

The required CSR file /var/lib/letsencrypt/csrs/example.com-3224635.csr does not exist. This could happen if you changed site_hosts and have not yet rerun the letsencrypt role. Create the CSR file by re-provisioning (running the Trellis server.yml playbook) with `--tags letsencrypt`
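To see ahead of time which CSR files the renew script will complain about, something like the following could be run on the server. It is a rough sketch, not the script's actual logic: the directory and the `<site>-<id>.csr` naming are taken from the error message above, and `missing_csrs` is a hypothetical helper.

```python
import os


def missing_csrs(cert_ids, csr_dir="/var/lib/letsencrypt/csrs"):
    """Return the expected CSR paths that do not exist on disk.

    The path pattern is inferred from the error message in this thread.
    """
    missing = []
    for site, cert_id in sorted(cert_ids.items()):
        path = os.path.join(csr_dir, "{0}-{1}.csr".format(site, cert_id))
        if not os.path.isfile(path):
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Value copied from renew-certs.py as quoted earlier in the thread.
    letsencrypt_cert_ids = {"example.com": "3224635"}
    for path in missing_csrs(letsencrypt_cert_ids):
        print("missing CSR: " + path)
```

An empty result would mean every expected CSR is in place; anything listed needs the letsencrypt role to be re-run successfully first.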

When I run trellis provision --tags letsencrypt production:

TASK [letsencrypt : Notify of challenge failures] ******************************
System info:
  Ansible 2.10.1; Darwin
  Trellis version (per changelog): "Allow WP cron intervals to be configurable"
---------------------------------------------------
Could not access the challenge file for the hosts/domains: www.example.com.
Let's Encrypt requires every domain/host be publicly accessible. Make sure
that a valid DNS record exists for www.example.com and that they point to
this server's IP. If you don't want these domains in your SSL certificate,
then remove them from `site_hosts`. See https://roots.io/trellis/docs/ssl for
more details.
failed: [142.93.etc...

The server is configured to run over non-www. Does the www in the output have relevance to that?

Yes, because Trellis automatically redirects www.host.tld to host.tld so you still need DNS records for every domain/host.

So to recap the issue:

  1. something caused the hashes to change (new site host is the most frequent cause of this)
  2. renew script won’t work without reprovisioning with the letsencrypt role (as the error says, which you did)
  3. the role fails because it can’t access the challenge file (likely due to a DNS issue) which means it can’t create the CSR

Hopefully this is just a DNS issue and you can easily add the record. Your original idea of manually changing the hash could work to get the cert renewed; it shouldn’t break anything at least. But you should really fix the bigger issue regardless.
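A quick way to check the DNS side before re-provisioning is to confirm that every host in `site_hosts` resolves to the server. The sketch below is an assumption-laden illustration, not part of Trellis: `unresolved_hosts` is a hypothetical helper, it only looks at IPv4 via `socket.gethostbyname`, and the expected IP has to be supplied by you.

```python
import socket


def unresolved_hosts(hosts, expected_ip, resolve=socket.gethostbyname):
    """Return {host: problem} for hosts that fail to resolve or point elsewhere.

    `resolve` is injectable so the check can be exercised without real DNS.
    """
    problems = {}
    for host in hosts:
        try:
            ip = resolve(host)
        except socket.error as exc:  # covers socket.gaierror on lookup failure
            problems[host] = "does not resolve ({0})".format(exc)
            continue
        if ip != expected_ip:
            problems[host] = "resolves to {0}, expected {1}".format(ip, expected_ip)
    return problems


# Example call (substitute the server's real IP for the placeholder):
#   unresolved_hosts(["example.com", "www.example.com"], "203.0.113.10")
```

Any host that shows up in the result needs a DNS record pointing at the server before the letsencrypt role can pass its challenge check.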

So are you saying that the issue could well be:

ping www.example.com
ping: cannot resolve www.example.com: Unknown host

And I would just need to add a DNS record for the www subdomain and then run the letsencrypt task?

Yep, exactly.

I was able to remove the www redirects parameter from group_vars/env/wordpress_sites.yml and renew the cert for JUST non-www, but then, of course, requests to www failed:

      - canonical: example.com
        redirects: # removed
          - www.example.com # removed

Once we got the DNS record for www added, I was able to generate the cert for both.

Thanks a ton, as always, Scott.
