Trellis provisioning with letsencrypt fails at nginx reload step

erit · January 24, 2023, 9:17am

The same Trellis configuration worked on production with Let's Encrypt, but I am now trying to provision a staging server.

The DNS record for the staging subdomain propagated several days ago and shows green on whatsmydns.net for all but China and a few of the more difficult countries.

There are no cert challenge failures.

The error in ansible is:

RUNNING HANDLER [common : reload nginx] ****************************************
fatal: [staging_host]: FAILED! => {"changed": true, "cmd": ["nginx", "-t"], "delta": "0:00:00.009640", "end": "2023-01-24 08:49:48.971955", "msg": "non-zero return code", "rc": 1, "start": "2023-01-24 08:49:48.962315", "stderr": "nginx: [emerg] open() \"/etc/nginx/fastcgi_params\" failed (2: No such file or directory) in /etc/nginx/sites-enabled/website.com.conf:122\nnginx: configuration file /etc/nginx/nginx.conf test failed", "stderr_lines": ["nginx: [emerg] open() \"/etc/nginx/fastcgi_params\" failed (2: No such file or directory) in /etc/nginx/sites-enabled/website.com.conf:122", "nginx: configuration file /etc/nginx/nginx.conf test failed"], "stdout": "", "stdout_lines": []}

So it is a configuration file test failure.
Line 122 of /etc/nginx/sites-enabled/website.com.conf is include fastcgi_params;
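
To confirm which directive nginx is complaining about, the cited line can be printed straight from the config file. This is only a minimal sketch; the helper name show_line is made up for illustration and is not part of Trellis or nginx.

```shell
# Print a single line of a file by its line number.
# Usage: show_line FILE LINE_NUMBER
# (show_line is a hypothetical helper, not a Trellis or nginx command)
show_line() {
  # sed -n suppresses normal output; "Np" prints only line N
  sed -n "${2}p" "$1"
}

# Example, run on the staging server:
# show_line /etc/nginx/sites-enabled/website.com.conf 122
```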

The keys exist inside of /etc/nginx/ssl/letsencrypt

The nginx.service is active/running.

With https, the browser says “This site can’t be reached”.
With http it gives 404 nginx error.

I have tried:

- deleting /etc/nginx on the server and provisioning again
- provisioning without SSL and back again to SSL
- setting ssh_client_ip_lookup: false in group_vars/all/main.yml
- setting the full subdomain as the site name, and without
- changing the letsencrypt email address

strarsis · January 24, 2023, 11:01am

This indicates that the Linux distribution on the staging system differs from the one used on production, and from what Trellis expects and supports.
Is this Ubuntu 20.04 LTS on the Staging server?

erit · January 24, 2023, 11:05am

Yes, Ubuntu 20.04 (LTS) x64 on staging, prod and dev.

strarsis · January 24, 2023, 11:09am

Is there actually a file /etc/nginx/fastcgi_params on that staging system?
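
A quick way to answer this for several files at once is a small loop that reports whichever expected files are absent. This is a sketch only: missing_nginx_files is a hypothetical helper, and the list of file names to pass is up to you (only fastcgi_params is known from the error above).

```shell
# Print each expected file that is missing from a config directory.
# Usage: missing_nginx_files DIR FILE...
# (missing_nginx_files is a hypothetical helper name)
missing_nginx_files() {
  dir=$1
  shift
  for name in "$@"; do
    # -e is true for any existing path, so only absent files are printed
    [ -e "$dir/$name" ] || printf '%s\n' "$dir/$name"
  done
}

# Example, run on the staging server:
# missing_nginx_files /etc/nginx fastcgi_params
```

Running the same command on production and staging and comparing the output shows which files only one of the servers has.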

erit · January 24, 2023, 11:16am

No, that file is not on staging, but I see it on prod. Quite a few missing on staging.

strarsis · January 24, 2023, 11:18am

What do you get when you invoke lsb_release -a on staging?
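
If lsb_release is not available, the distribution ID and version can usually be read from /etc/os-release instead. A minimal sketch, assuming the file uses the standard KEY=value format; os_version is a made-up helper name.

```shell
# Print the distribution ID and version from an os-release file.
# Usage: os_version [FILE]   (defaults to /etc/os-release)
# (os_version is a hypothetical helper name)
os_version() {
  file=${1:-/etc/os-release}
  # os-release is a list of shell-compatible KEY=value assignments;
  # source it in a subshell so its variables do not leak into the caller
  ( . "$file" && printf '%s %s\n' "$ID" "$VERSION_ID" )
}

# Example, run on each server and compare:
# os_version
```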

So the complete Trellis provision process runs, not just specific tags?

erit · January 24, 2023, 11:30am

Yes, the whole provision process ran, except that it failed with that nginx reload error.

Figured it out now, thanks to you. Luckily I had installed trash-cli and used it to trash /etc/nginx, so I could restore the missing files from the trash. I ran the provision again, and now everything works.

I had cleared out /etc/nginx earlier while troubleshooting, but that may have been while the IP had not propagated yet and the provision was failing for a different reason. I figured the files would be put back upon reprovision, but that was not the case! Thanks @strarsis
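
For anyone clearing out a config directory while troubleshooting, moving it aside under a timestamped name keeps the original files recoverable in the same way the trash did here. This is a sketch under that assumption; move_aside is a hypothetical helper, not part of trash-cli or Trellis.

```shell
# Move a directory aside under a timestamped name instead of deleting it,
# and print the backup path so it can be restored later.
# Usage: move_aside DIR
# (move_aside is a hypothetical helper name)
move_aside() {
  src=${1%/}                                # strip one trailing slash, if any
  backup="$src.bak.$(date +%Y%m%d%H%M%S)"   # e.g. /etc/nginx.bak.20230124113000
  mv -- "$src" "$backup" && printf '%s\n' "$backup"
}

# Example, run as root on the server:
# move_aside /etc/nginx
```

If a reprovision does not recreate a file, it can then be copied back from the printed backup path before running nginx -t again.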