"104: Connection reset by peer" for a few seconds directly after deploy. How are you handling it?

Hello!

Something I’ve noticed ever since I started using Trellis is that for a few seconds after deploying, while PHP-FPM is reloading, users get an error message if they try to access the site. It shows up like this in the nginx error log:

2025/09/17 12:08:20 [error] 4115444#4115444: *11726757 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: www.example.com, request: "GET /example/example-example/ HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "www.example.com", referrer: "https://www.google.com/"

I could only find a single thread on here about this phenomenon:

So my question, and the reason for creating this thread, is: how do you handle the 5-6 seconds during which the site returns an error message to the user?

Have you changed the setting like the OP in the above thread? If so, have you noticed any disadvantages?
Did you change something else?
Or are you just living with it?

A penny for your thoughts!

I’ve never seen this before, but maybe give this a shot? Add to group_vars/all/main.yml:

php_fpm_set_process_control_timeout: true
php_fpm_process_control_timeout: 10s

I actually tested that on my staging server before creating the thread, and it does seem to work. Is there a reason why this isn’t the default? These roughly 5 seconds of downtime on every deploy have been an issue on all our sites.

Here’s how to reproduce the bug if you want to try it:

  1. Deploy to staging.
  2. The moment the deploy finishes, open your site and refresh repeatedly. Odds are that you’ll be met with an error.
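
If refreshing by hand is too hit-and-miss, the steps above can be roughly automated. This is only a sketch, not something that ships with Trellis: the function names and the staging URL are made up for illustration, and it assumes nginx answers with a 5xx status (typically 502) while PHP-FPM is reloading. A page cache in front of PHP could hide the problem, so point it at an uncached URL.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, List, Tuple


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status for url, or 0 if the connection itself failed."""
    request = urllib.request.Request(url, headers={"Cache-Control": "no-cache"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses land here; assumed to be 502 during the reload.
        return error.code
    except (urllib.error.URLError, OSError):
        return 0


def poll(fetch: Callable[[], int], attempts: int, interval: float) -> List[Tuple[float, int]]:
    """Call fetch() `attempts` times, `interval` seconds apart.

    Returns (seconds since start, status) for every failed attempt,
    where "failed" means a 5xx status or no response at all (0).
    """
    failures = []
    start = time.monotonic()
    for _ in range(attempts):
        status = fetch()
        if status == 0 or status >= 500:
            failures.append((round(time.monotonic() - start, 2), status))
        time.sleep(interval)
    return failures


# Example usage (hypothetical URL, start it just before the deploy finishes):
#
#   url = "https://staging.example.com/"
#   for elapsed, status in poll(lambda: fetch_status(url), attempts=100, interval=0.2):
#       print(f"{elapsed:6.2f}s  status {status}")
```

An empty result means no failed request was observed in that window, which is not proof that the problem is gone, since it doesn’t seem to happen on every deploy.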

Assuming there are no downsides to this, do you want me to do a PR?

Please do a PR

Are you attempting this on a vanilla Bedrock install with no changes? I’ve never hit this issue before, and I’ve visited sites immediately after deploys many times since Trellis was created, without problems.

PR created!

I’ve had the issue on two separate websites:
One using Bedrock with minimal changes.
One using Radicle with no changes.

Both sites have this issue. Maybe not every time, but often enough that almost every deploy on a website with 150k monthly visitors will show an error in the log.

Thank you for the PR 🙏
