Roots Discourse

Let's Encrypt issue when adding new domain to multisite

I have a multisite setup. The nature of this website means I add new domain names on a regular basis. I do this by adding the new domain name, for example domain2.com, to the wordpress_sites.yml file like this:

site_hosts:
  - domain1.com
  - domain2.com

I use Let’s Encrypt for SSL certificates, which worked like a charm for the initial domains – but not for the domains I added later. At first I got an error when running ansible-playbook server.yml -e env=production that the DNS record should point to the webserver (which it already did), but I solved that by setting

ssl:
  enabled: false

then running ansible-playbook server.yml -e env=production, changing it back to

ssl:
  enabled: true

and then running ansible-playbook server.yml -e env=production again. I think the Let’s Encrypt DNS error stopped Nginx from updating the vhost, so Let’s Encrypt had no way to reach the webserver to verify the new domain.

After that verification worked like a charm and I got all greens when running the server playbook.

The problem is that Chrome shows a red lock saying the certificate is not valid for domain2.com, yet it is still valid for domain1.com. Is this because there was already a certificate issued for the main domain? If so, how do I go about fixing this?

Do I need to provide more information, or am I being too impatient? :slight_smile:

I’m having this exact issue.

I’ve got the following in my site_hosts and, like you, resolved the error about the domains’ DNS records not all pointing at the server. But I now still get warnings in Chrome for everything but the first domain:

site_hosts:
- domain1.com
- subdomain.domain1.com
- domain2.com

Error in chrome is ERR_CERT_COMMON_NAME_INVALID
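That Chrome error means the certificate being served does not list the requested hostname, so it can help to look at which names the certificate actually covers. The following is a minimal sketch, not something from Trellis: the function names cert_san_names and cert_covers_host are made up for illustration, it assumes openssl is installed, and it does only exact-name matching (no wildcard handling).

```shell
# List the DNS names in a certificate's Subject Alternative Name extension.
# Reads a PEM certificate on stdin and prints one hostname per line.
cert_san_names() {
  openssl x509 -noout -text \
    | grep -A1 'Subject Alternative Name' \
    | tail -n 1 \
    | tr ',' '\n' \
    | sed -n 's/^[[:space:]]*DNS://p'
}

# Succeed if the PEM certificate on stdin lists exactly the hostname in $1.
# Hypothetical helper; wildcard entries are not expanded.
cert_covers_host() {
  cert_san_names | grep -qxF -- "$1"
}

# Example usage (needs network access, so it is left commented out):
#   openssl s_client -connect domain2.com:443 -servername domain2.com \
#     </dev/null 2>/dev/null | cert_san_names
# Or, on the server, against the file mentioned later in this thread:
#   sudo cat /etc/nginx/ssl/letsencrypt/domain1.cert | cert_san_names
```

If domain2.com is missing from the output, the server is still handing out the certificate that was issued before the new host was added.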

You [both] might get some use out of the troubleshooting steps here.

Thanks for the link, I tried

cd /var/lib/letsencrypt && sudo ./renew-certs.py

And get…

Certificate file /etc/nginx/ssl/letsencrypt/domain1.cert already exists
  The certificate is younger than 60 days. Not creating a new certificate.

So something must be up with the configuration just not recognising the other two domains and creating certificates for them.

Which is odd because I was having issues with the DNS yesterday where it was asking to double check that correct A Records all existed – and they do now – so some part of the server provisioning understands that there’s three domains but something else doesn’t recognise them.

There are other troubleshooting steps in that thread, did you try any?

Yeah but that was the only one that yielded a noteworthy result…

And the OP in that thread resolved the issue because the DNS was incorrectly configured, which I’m almost certain isn’t the case here, since I can correctly ping the root and www. versions of all three domains in my site_hosts.
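A successful ping only shows that a name resolves to something, so a slightly stricter check is to compare each host's A records against the server's IP. This is a rough sketch under assumptions: it uses dig (from dnsutils), the helper names resolve_a and check_hosts and the RESOLVER override are hypothetical, and 203.0.113.10 is a placeholder address, not a real server.

```shell
# Resolve a hostname's IPv4 A records, one per line (CNAME lines are dropped).
resolve_a() {
  dig +short A "$1" | grep -E '^[0-9.]+$'
}

# Usage: check_hosts EXPECTED_IP host1 [host2 ...]
# Prints "ok" or "MISMATCH" per host, depending on whether its A records
# include EXPECTED_IP. Set RESOLVER to swap in another lookup command.
check_hosts() {
  expected_ip="$1"
  shift
  for host in "$@"; do
    if "${RESOLVER:-resolve_a}" "$host" | grep -qxF -- "$expected_ip"; then
      echo "ok       $host"
    else
      echo "MISMATCH $host"
    fi
  done
}

# Example (placeholder IP, needs network access):
#   check_hosts 203.0.113.10 domain1.com www.domain1.com subdomain.domain1.com domain2.com
```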

“Solved” this by wiping the development server and starting over. Not ideal, but it worked, and the site was only new so nothing lost.

There must have been some config file or something to do with Let’s Encrypt being left on the server and not being removed or overwritten once site_hosts was updated.

Is there a way to forcibly remove all existing traces of Let’s Encrypt from the server and have the certificates remade?

Just got to this part of my multisite journey and was able to figure it out without a complete reprovision. Here’s what I did:

  1. SSH to remote.
  2. $ sudo rm -rf /var/lib/letsencrypt /usr/local/letsencrypt /srv/www/letsencrypt /etc/nginx/ssl/letsencrypt /etc/ssl/certs/lets-encrypt-x3-cross-signed.pem which should remove all remnants of the existing certificates.
  3. On your local machine, in your Trellis project dir, run $ ansible-playbook server.yml -e env=<YOUR_ENV_NAME> --tags "letsencrypt" which should generate new certificates.
  4. It wasn’t necessary on my setup, but power cycling your remote may be necessary in some cases: $ sudo shutdown -r now.

After that my main domain and subdomain were all super green :thumbsup:
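For anyone repeating step 2 often, the removal can be wrapped in a small script that defaults to a dry run. This is only a sketch: the paths are the ones listed in step 2, while the function names and the APPLY switch are made up for illustration. Run it on the remote, then do step 3 from the local machine.

```shell
# Paths from step 2 above, one per line.
le_paths() {
  cat <<'EOF'
/var/lib/letsencrypt
/usr/local/letsencrypt
/srv/www/letsencrypt
/etc/nginx/ssl/letsencrypt
/etc/ssl/certs/lets-encrypt-x3-cross-signed.pem
EOF
}

# Dry run by default: only prints what would be removed.
# Set APPLY=1 (hypothetical switch) to actually delete the paths.
purge_letsencrypt() {
  le_paths | while IFS= read -r path; do
    if [ "${APPLY:-0}" = "1" ]; then
      sudo rm -rf -- "$path" </dev/null
    else
      echo "would remove: $path"
    fi
  done
}

# Example:
#   purge_letsencrypt            # list what would go
#   APPLY=1 purge_letsencrypt    # really remove it
```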

That totally looks like it would work.

I wonder if a future version of Trellis might be smart enough to detect if there’s new site_hosts and remove all existing letsencrypt data… but that’s above my pay grade.

I ended up taking a lot of notes during my multisite install, was thinking of writing them up somewhere since the official docs are a bit light on … and I came across a few of the same errors over and over.

Yes, my journey through multisite has given me the same ideas.

It could just as easily be turned into its own playbook that you just run when needed. new_ssl.yml or something similar.

Also, @Simeon, feel free to contribute to the docs :thumbsup:

Btw so far I only tried this with subdomains. I may try it with subfolders and domain mapping shortly, just to be adventurous. I should also mention that I’m using a mu-plugin for Bedrock from a pending Github pull request to fix main site URLs which eliminates the need for db URL tweaking and @darjanpanic’s fix here. That fix does the trick but ends up removing /wp/ from the admin paths and that creeps me out.

This worked. I didn’t need step 4.

If anyone struggles with this issue, the easiest way to work around that is using certbot - https://certbot.eff.org/

I had issues where Ansible hangs on “Test ACME challenges”, and then if I comment that out of the letsencrypt role, it failed on “Generate certificates”. Turning ssl off and on and re-provisioning will take your https version offline.

Certbot has an nginx plugin, and generating new certificates is as easy as running sudo certbot --nginx and following the instructions. No downtime, and it takes about 30 seconds.
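Run without arguments, certbot asks interactively which names to include. With several site_hosts it may be easier to pass each one with -d so they all end up on one certificate. Below is a small sketch that only builds and prints the command; certbot_command is a hypothetical helper name, and whether Certbot's changes sit well with a Trellis-managed Nginx config is an assumption this thread does not confirm.

```shell
# Print a certbot command that requests one certificate covering every
# hostname passed as an argument. Nothing is executed here.
certbot_command() {
  cmd="sudo certbot --nginx"
  for host in "$@"; do
    cmd="$cmd -d $host"
  done
  printf '%s\n' "$cmd"
}

# Example:
#   certbot_command domain1.com subdomain.domain1.com domain2.com
```

Review the printed command before running it on the server.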

Yes, it would be great if Trellis could switch to the Certbot nginx plugin now.

I am getting the following error while installing Let’s Encrypt certificate on my domain:

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Obtaining a new certificate
Performing the following challenges:
http-01 challenge for linuxbuz.com
http-01 challenge for www.linuxbuz.com
Using the webroot path /var/www/html for all unmatched domains.
Waiting for verification…
Cleaning up challenges
Failed authorization procedure. linuxbuz.com (http-01): urn:acme:error:unauthorized :: The client lacks sufficient authorization :: Invalid response from http://linuxbuz.com/.well-known/acme-challenge/pYpAC6kT25C0itcTNKd8hwb_0VaoPxJVIkVg5_xn-N4 [77.111.240.95]: 403
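The 403 in that log says the web server refused to serve the challenge file. One way to reproduce it outside of Let’s Encrypt is to drop a probe file into the same directory and request it over plain HTTP. This is a sketch under assumptions: it must run on the web server itself, the webroot /var/www/html is taken from the log above, and both function names are hypothetical.

```shell
# Build the URL an http-01 validator would request for a host and token.
acme_challenge_url() {
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# Usage: probe_acme_challenge HOST WEBROOT
# Writes a temporary probe file into the challenge directory, prints the
# HTTP status code it is served with, then removes the probe again.
probe_acme_challenge() {
  probe_host="$1"
  probe_dir="$2/.well-known/acme-challenge"
  probe_token="probe-$$"
  sudo mkdir -p "$probe_dir" || return 1
  echo ok | sudo tee "$probe_dir/$probe_token" >/dev/null
  curl -s -o /dev/null -w '%{http_code}\n' \
    "$(acme_challenge_url "$probe_host" "$probe_token")"
  sudo rm -f "$probe_dir/$probe_token"
}

# Example (run on the server):
#   probe_acme_challenge linuxbuz.com /var/www/html
```

A 200 means the path is reachable. A 403 reproduces the failure and points at file permissions or a server rule blocking the .well-known directory, rather than at Let’s Encrypt itself.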