Roots Discourse

Could not access the challenge file for the hosts/domains: www.example.com

I see that this is not an undocumented problem (here, here) and have tried:

  1. Set SSL to false in group_vars/env/wordpress-sites.yml
  2. trellis provision --tags wordpress production
  3. Set SSL to true in group_vars/env/wordpress-sites.yml
  4. trellis provision --tags letsencrypt production
  5. Manually edit /etc/nginx/sites-available/example.com.conf:

From:

location / {
  return 301 http://example.com$request_uri;
}

to:

location / {
  return 301 http://$host$request_uri;
}

Run: sudo service nginx reload

Additionally tried cycling nginx with

location / {
  return 301 http://www.example.com$request_uri;
}

Still same error.

Does this look like the correct location and content for acme-challenge-location.conf?

$ cat /etc/nginx/acme-challenge-location.conf
location ^~ /.well-known/acme-challenge/ {
  alias /srv/www/letsencrypt/;
  try_files $uri =404;
}
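
For reference, nginx's alias directive swaps the matched location prefix for the alias path, so a request for /.well-known/acme-challenge/ping.txt should be served from /srv/www/letsencrypt/ping.txt. A minimal shell sketch of that mapping, with the prefix and alias copied from the config above (the function name challenge_file_for is made up for illustration):

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a request URI to the file nginx would serve,
# using the location prefix and alias from acme-challenge-location.conf.
challenge_file_for() {
  local uri="$1"
  local prefix="/.well-known/acme-challenge/"
  local alias_dir="/srv/www/letsencrypt/"
  case "$uri" in
    "$prefix"*) printf '%s%s\n' "$alias_dir" "${uri#"$prefix"}" ;;
    *) return 1 ;;  # URI is outside the challenge location
  esac
}

challenge_file_for /.well-known/acme-challenge/ping.txt
```

If the file printed by the helper does not exist on the server, the challenge URL would be expected to return 404 regardless of DNS.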

In troubleshooting I also deleted some files, emptying /srv/www/letsencrypt/ and /var/lib/letsencrypt/csrs/.

What’s the scenario here? Is this a new server with a new domain? Some more background details would be helpful.

edit: I just realized we tag the wordpress role with letsencrypt :sweat_smile:
One important thing is that you can’t just run the wordpress or letsencrypt tags when you toggle those values. I think just wordpress is fine if you turn SSL off, but you’d definitely want to run letsencrypt,wordpress when you toggle it back on.
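
As a rough sketch of that advice, the tags to pass depend on which way ssl was just toggled. The helper below only prints the command so it can be reviewed before running; the function name is made up, and the command shape is copied from the trellis provision calls earlier in the thread.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print the provision command for an SSL toggle.
# Usage: provision_cmd <on|off> [environment]
provision_cmd() {
  local ssl_state="$1"
  local environment="${2:-production}"
  local tags
  case "$ssl_state" in
    off) tags="wordpress" ;;              # SSL turned off: wordpress alone is probably fine
    on)  tags="letsencrypt,wordpress" ;;  # SSL turned back on: run both tags together
    *)   echo "usage: provision_cmd <on|off> [environment]" >&2; return 2 ;;
  esac
  echo "trellis provision --tags $tags $environment"
}

provision_cmd on
```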

It’s an older server, Ubuntu 18. I had maybe only partially updated the Trellis codebase. Renewal errored out because the letsencrypt emails hadn’t been set.

I also updated /roles/fail2ban/defaults/main.yml and had to change roles/wordpress-setup/tasks/nginx.yml to state: "{{ item.enabled | default(true) | ternary('link', 'hard') }}" (from absent to hard).

And yes, I get the error now when running the wordpress tasks with SSL set to true.

Thanks much, Scott. What would I do without you?

Ahah. On my local computer:

curl http://example.com/.well-known/acme-challenge/ping.txt -w "%{http_code}"
200%  

And

curl http://www.example.com/.well-known/acme-challenge/ping.txt -w "%{http_code}"
200%    

On the server or from another server:

curl http://www.example.com/.well-known/acme-challenge/ping.txt -w "%{http_code}"
curl: (6) Could not resolve host: www.example.com

No reference to the DNS in local /etc/hosts file.

I’m not sure what that means or how to fix it.
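
For context, curl exit code 6 means the name lookup itself failed on that machine, before any HTTP request was sent, whereas the 200 is an HTTP status from a request that did go through. A small sketch that separates the two cases; both function names are made up, and only a handful of well-known curl exit codes are covered:

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate a few common curl exit codes.
explain_curl_exit() {
  case "$1" in
    0)  echo "request completed" ;;
    6)  echo "could not resolve host (DNS lookup failed)" ;;
    7)  echo "could not connect to host" ;;
    28) echo "operation timed out" ;;
    *)  echo "curl exit code $1 (see man curl, EXIT CODES)" ;;
  esac
}

# Hypothetical wrapper: probe a URL and explain the outcome.
probe() {
  local rc=0
  curl -s -o /dev/null --max-time 10 "$1" || rc=$?
  explain_curl_exit "$rc"
}

# Usage: probe http://www.example.com/.well-known/acme-challenge/ping.txt
```

If the probe reports a DNS failure on the server but not locally, comparing the output of getent hosts www.example.com on both machines may show whether the record is missing publicly or only resolvable from the local network.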

Looks like this is at issue:

 /etc/nginx/ssl/letsencrypt/example.com.key: No such file or directory

Also check whether there is an IPv6 AAAA record for your domain. Let’s Encrypt prefers those over the IPv4 A records for HTTP-01 validation. Verify that the server is correctly listening on the IPv6 address.

Thanks. I’m not sure how to do that but will look into it. When I reprovision (wordpress tasks) without SSL, curl on the http address returns 301 permanent redirect.

curl http://example.com -w "%{http_code}"
301

When I check AAAA record with https://mxtoolbox.com it returns

Test                   Result
DNS Record Published   DNS Record not found

Looks like nginx is listening:

netstat -tlnp | grep nginx
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      957/nginx: master p 
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      957/nginx: master p 
tcp6       0      0 :::80                   :::*                    LISTEN      957/nginx: master p 
tcp6       0      0 :::443                  :::*                    LISTEN      957/nginx: master p 
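
The same check can be scripted. This sketch reads netstat -tln style output on stdin and confirms there is a LISTEN socket for a given protocol and port; the function name is made up, and the column positions assume the Linux net-tools layout shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if stdin (netstat -tln output) shows a
# LISTEN socket for the given protocol (tcp or tcp6) and port.
listening_on() {
  awk -v proto="$1" -v port="$2" '
    # $1 = protocol, $4 = local address, $6 = socket state
    $1 == proto && $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}

# Usage on the server:
#   netstat -tln | listening_on tcp6 80 && echo "IPv6 port 80 is listening"
```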

If there’s no key file, then yeah something has likely gone wrong. Can you just re-create the server from scratch and try again?

You mean with a new droplet/IP address?

Yeah, unfortunately I think if you rebuild a droplet the IP will change (unless you’re using a floating IP).

It would be great to avoid having to change the IP. There’s a bit of bureaucracy between us and the registrar.

To avoid that, I would delete any files in /etc/nginx/ssl/letsencrypt/ and /srv/www/letsencrypt, and any *.csr files in /var/lib/letsencrypt.

Then provision again (without any tags, so everything).

I’m running that now with -vvv and fingers crossed.

Same Error

Could not access the challenge file for the hosts/domains: www.example.com

Should the challenge file be trying to load over www?

    site_hosts:
      -
        canonical: example.com
        redirects:
          - www.example.com
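
Since the error names www.example.com, which is the redirect host, it seems the challenge file is checked for every host under site_hosts and not only the canonical one. A quick sketch to probe each host the same way as the curl commands earlier in the thread; the function names are made up, and ping.txt is assumed to be a test file that already exists in the challenge directory:

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the HTTP challenge test URL for one host.
challenge_url() {
  printf 'http://%s/.well-known/acme-challenge/ping.txt\n' "$1"
}

# Hypothetical helper: print "<host> <http status>" for each host given.
# Prints "unreachable" when curl itself fails (DNS, connection, timeout).
check_challenge() {
  local host code
  for host in "$@"; do
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
      "$(challenge_url "$host")") || code="unreachable"
    echo "$host $code"
  done
}

# Usage: check_challenge example.com www.example.com
```

Running this from the server itself and from an outside machine should show whether a redirect host fails everywhere or only from certain networks.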

Will probably provision a new server if this fails. For Ubuntu 20, should I use the master branch of Trellis? Maybe I should just stick with Ubuntu 18 for now.

I was able to restore the /etc/nginx/ssl/letsencrypt/example.com.key from a previous Snapshot.

Now getting a 200 on curl http://example.com/.well-known/acme-challenge/ping.txt -w "%{http_code}".

Going to try cycling ssl: false, ssl: true again.

Now getting:

non-zero return code
The required CSR file /var/lib/letsencrypt/csrs/phytrehab.com-3224635.csr
does not exist. This could happen if you changed site_hosts and have not yet
rerun the letsencrypt role. Create the CSR file by re-provisioning (running
the Trellis server.yml playbook) with `--tags letsencrypt`

Then ran:

ansible-playbook server.yml -e env=production --tags letsencrypt -vvv     

And back to this again:

Could not access the challenge file for the hosts/domains: www.example.com

Running curl without www succeeds, with www it fails (curl: (6) Could not resolve host).

When I run the wordpress tasks with ssl set to false, the browsers are still trying to load the site over https. Is that to be expected?

Interesting. Removed the www redirect from wordpress-sites.yml and got a lot further on the letsencrypt tasks.

Failed at:

non-zero return code
nginx: [emerg] "resolver" directive is duplicate in
/etc/nginx/h5bp/directive-only/ssl-stapling.conf:37
nginx: configuration file /etc/nginx/nginx.conf test failed.

Running it a second time seemed to succeed.

Then success with:

ansible-playbook server.yml -e env=production --tags wordpress -vvv

Sites not loading on front end, though. Trying to run letsencrypt tasks again with redirect reinstated.

Content of ssl-stapling.conf:

 23 ssl_stapling on;
 24 ssl_stapling_verify on;
 25 
 26 resolver
 27   # (1)
 28   1.1.1.1 1.0.0.1 [2606:4700:4700::1111] [2606:4700:4700::1001]
 29   # (2)
 30   8.8.8.8 8.8.4.4 [2001:4860:4860::8888] [2001:4860:4860::8844]
 31   # (3)
 32   # 216.146.35.35 216.146.36.36
 33   valid=60s;
 34 #trusted cert must be made up of your intermediate certificate followed by root certificate
 35 #ssl_trusted_certificate /path/to/ca.crt;
 36 
 37 resolver 8.8.8.8 8.8.4.4 216.146.35.35 216.146.36.36 valid=60s;
 38 resolver_timeout 2s;

Commenting out line 37 seems to have solved that issue.
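
A quick way to spot this kind of duplicate before reloading nginx is to count the resolver directives in the file. A sketch, with a made-up function name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count uncommented `resolver` directives in a file.
# Matches "resolver" alone on a line or followed by whitespace, so
# "resolver_timeout" and commented-out lines are not counted.
count_resolvers() {
  grep -cE '^[[:space:]]*resolver([[:space:]]|$)' "$1"
}

# Usage on the server:
#   count_resolvers /etc/nginx/h5bp/directive-only/ssl-stapling.conf
```

A count above 1 in a single file like this one suggests nginx -t will report the duplicate directive.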

So for others with same issue (or next time I have it), I guess I would say:

  1. Don’t delete /etc/nginx/ssl/letsencrypt/example.com.key!
  2. DigitalOcean server backups may well be worth paying for (I only had a snapshot by luck).
  3. Set ssl to false and run the wordpress tagged tasks, then possibly the letsencrypt tasks on their own.
  4. Then set ssl back to true.
  5. Removing redirects from site_hosts may also help, particularly if one of the redirects is the host named in the error output.

Thanks for the time and input @swalkinshaw and @strarsis.

@swalkinshaw Where on the server do the letsencrypt_contact_emails end up? I see that they are referenced by the python script that runs the renewal.

Be wary when setting the letsencrypt_contact_emails variable:

I’m having trouble finding that file on the server. Do you know where it gets generated?

The GitHub search does indeed seem to have trouble finding some files.
It is in the Trellis repository, however, as a template: