Upstream timed out for load-styles.php

We have noticed an issue on several sites that we deploy with Trellis+Bedrock.
It might be a bug in WordPress, but I am posting here because I have only seen the behaviour in combination with Trellis. I hope to find other people who have the same issue. I will keep this post updated with my findings.

Bug description

Some time after a deployment, we can no longer access WP-Admin.
The page stays white and the browser keeps loading. Eventually it loads, but without the WP-Admin styles.

Debugging

While this is happening, a PHP process sits at 100% CPU.
If you open WP-Admin in another tab, you get two PHP processes at 100%.

Temporary Fix

The only way to fix it is to restart the hanging php-fpm process (or the whole server).
Because this also happens in the finalizing stage of a deployment or provisioning run, deploying or provisioning again helps too.
For now we are restarting the process at night or rebooting the server once a week.
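
A blanket nightly restart could be narrowed to the case where a worker is actually stuck. The following is only a minimal sketch, not something we run: `hot_php_pids` is a hypothetical helper name, the 90% threshold is arbitrary, and the service name `php8.1-fpm` in the usage comment is an assumption that depends on the installed PHP version. Note that `ps` reports CPU averaged over the lifetime of the process, so a freshly hung worker takes a while to cross the threshold.

```shell
#!/bin/sh
# Hypothetical helper: print the PIDs of php-fpm processes whose CPU usage
# is at or above a threshold (default 90).
# Expects "PID %CPU COMMAND" rows on stdin, e.g. from: ps -eo pid=,pcpu=,comm=
hot_php_pids() {
  threshold="${1:-90}"
  awk -v t="$threshold" '$3 ~ /^php-fpm/ && $2 + 0 >= t { print $1 }'
}

# Possible use from cron (service name is an assumption):
# if [ -n "$(ps -eo pid=,pcpu=,comm= | hot_php_pids 90)" ]; then
#   systemctl restart php8.1-fpm
# fi
```

This still only treats the symptom, but it avoids restarting PHP when nothing is hanging.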

Potential Cause

I looked into the error.log and found this:

upstream timed out (110: Connection timed out) while reading response header from upstream,
client: […],
server: example.com,
request: "GET /wp/wp-admin/load-styles.php?c=1&dir=ltr&load%5Bchunk_0%5D=dashicons,admin-bar,common,forms,admin-menu,dashboard,list-tables,edit,revisions,media,themes,about,nav-menus,wp-pointer,widgets&load%5Bchunk_1%5D=,site-icon,l10n,buttons,wp-auth-check&ver=6.4.3 HTTP/2.0",
upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock",
host: "example.com",
referrer: "https://example.com/wp/wp-admin/"

I looked into the mentioned file and changed error_reporting to -1 to see what was happening.
Additionally, I added some error_log calls before and after the required files to see what gets loaded.
Now I get this (line breaks added for readability):

FastCGI sent in stderr: "
PHP message: start;
PHP message: loaded noop;
PHP message: loaded class-wp-theme-json-resolver;
PHP message: loaded resolver;
PHP message: loaded global-styles-and-settings;
PHP message: loaded script-loader;
PHP message: loaded version;
PHP message: PHP Deprecated:  urlencode(): Passing null to parameter #1 ($string) of type string is deprecated in /srv/www/example.com/releases/20240321183619/web/wp/wp-includes/script-loader.php on line 1655;
PHP message: PHP Deprecated:  file_exists(): Passing null to parameter #1 ($filename) of type string is deprecated in /srv/www/example.com/releases/20240321183619/web/wp/wp-includes/global-styles-and-settings.php on line 412
" while reading response header from upstream,
client: […],
server: example.com,
request: "GET /wp/wp-admin/load-styles.php?c=1&dir=ltr&load%5Bchunk_0%5D=dashicons,admin-bar,site-health,common,forms,admin-menu,dashboard,list-tables,edit,revisions,media,themes,about,nav-menus,wp-poi&load%5Bchunk_1%5D=nter,widgets,site-icon,l10n,buttons,wp-auth-check&ver=6.4.3 HTTP/2.0",
upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:",
host: "example.com",
referrer: "https://example.com/wp/wp-admin/"

So it seems all required files are loaded, and the problem occurs somewhere after that.

Unfortunately, I have not debugged this further yet, but I will the next time the issue occurs.
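
To see exactly which stylesheets a logged request asked for, the load[chunk_N] values can be decoded from the request string. As far as I understand load-styles.php, it joins the chunks without a separator before splitting on commas, which would explain why a handle such as wp-pointer is cut in two across chunk_0 and chunk_1 in the second log entry. A rough sketch (`list_handles` is a hypothetical name, and it assumes the chunks appear in order in the URL):

```shell
#!/bin/sh
# Hypothetical helper: print one style handle per line from a logged request.
# $1 = the request string from the nginx error log
list_handles() {
  printf '%s\n' "$1" \
    | grep -o 'load%5Bchunk_[0-9]*%5D=[^& ]*' \
    | sed 's/^[^=]*=//' \
    | tr -d '\n' \
    | tr ',' '\n' \
    | awk 'NF'
}

# Example:
# list_handles 'GET /wp/wp-admin/load-styles.php?c=1&load%5Bchunk_0%5D=a,wp-poi&load%5Bchunk_1%5D=nter,b HTTP/2.0'
```

Requesting the handles one at a time could then help narrow down whether a specific stylesheet triggers the hang.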

From what I know, it could hang either at these lines:

$wp_styles = new WP_Styles();
wp_default_styles( $wp_styles );

or when reading the content of the stylesheets:

$content = get_file( $path ) . "\n";

So I will look into these calls more deeply, and also into the "Passing null to parameter #1" deprecation notices from urlencode() and file_exists().

Additional Context

On these sites we are using minimal child themes of either twentytwentythree or another twenty* theme.
So a pretty standard setup, but with block themes and full site editing (FSE).


Possibly related

There was a similar bug in WordPress Core, which I reported and which was fixed in WordPress 6.3.
I mention it because in that case the problem was the symlinking, i.e. the real path to the WordPress directory changing after a deployment. Maybe that is a factor here too.
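
If the real path is a factor again, one thing that can be checked while the problem is occurring is whether the paths in the PHP messages still belong to the release that the symlink points at. A minimal sketch, assuming the usual Trellis layout with a `current` symlink next to `releases/` (`is_current_release` is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical check: does a file path from the log live inside the release
# that the `current` symlink resolves to right now?
# $1 = site root (e.g. /srv/www/example.com), $2 = file path taken from the log
# Returns 0 if it does, 1 if it does not, 2 if the symlink cannot be resolved.
is_current_release() {
  current=$(readlink -f "$1/current") || return 2
  case "$2" in
    "$current"/*) return 0 ;;
    *)            return 1 ;;
  esac
}
```

A return value of 1 for a path like the script-loader.php one in the log above would suggest PHP is still working with an older release directory.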


Sounds like when this issue happens, it happens consistently? If so, I might try commenting out more and more of wp-admin/load-styles.php to try and narrow down what’s causing it.

Otherwise, it’s mostly just guessing in the dark. If anything in Trellis is causing this (or triggering a WP bug), my only guesses would be related to symlinking or, less likely, PHP configs.


Yes, it does happen consistently. Unfortunately I had to go for the quick fix and restart PHP, so I have not had the chance to debug further (yet).

I also suspect a symlinking issue.

Looking into the file again, the definition of the WP_CONTENT_DIR constant there is also a potential issue, as it does not respect the Bedrock config.

If that’s the case, it is weird that it works initially.

I have been encountering the same issue on multiple sites for some time now with recent WordPress versions.

As a workaround I disabled script and style concatenation. Since concatenation only applies to the backend (admin), the impact on normal visitors (frontend) should be minimal.

Config::define('CONCATENATE_SCRIPTS', false);

That’s a good idea. With HTTP/2 and HTTP/3 we don’t need the concatenation anyway.
The only benefit here would be the added caching headers, which we (Trellis users) can easily solve with nginx. I’d say it’s even faster to load the assets separately, because they can be served directly by nginx instead of starting PHP and concatenating everything.

I’d like to benchmark this. If the assumption is confirmed, it might be a good idea to:

  • disable CONCATENATE_SCRIPTS by default via trellis env vars
  • enable browser caching headers for static assets in app/wp by default

What do you think @swalkinshaw?


Yep, that’s a great idea to test those two scenarios.


For those who need it now, you can block load-styles.php with ItinerisLtd/trellis-cve-2018-6389 on GitHub, which mitigates the CVE-2018-6389 WordPress load-scripts / load-styles attacks.


Hi there, just to chime in with a big thanks for this research and these suggestions. I was experiencing this on a big site and it was driving me nuts.

Solved after applying TangRufus’s patch. Thanks very much for all you do! Will be applying this to all our sites.

Happy to help narrow down this bug. I’ll post here if I come up with anything useful.


Just submitted a PR to Bedrock to disable script concatenation by default.


Just commenting to say we’ve also been experiencing this issue on a recent site. We’ve used Trellis for years with the same base theme and we have only experienced this issue on one site recently (Trellis 1.21.0, PHP 8.1.27, Ubuntu 22.04.4). It’s occurred several times on staging, and once on production.

Unfortunately, we haven’t been able to find a way to consistently replicate this issue, and we’ve tried a bunch of things like using Apache Bench to do multiple concurrent requests, removing all plugins, re-running all cron jobs etc.

We can look at turning off the script concatenation, however we’re still super interested in finding out why this happens. Let us know if there is anything you’d like us to test / log etc.

Test turning off the script concatenation please :smiley: