I tested this and can also confirm that the constant DISALLOW_INDEXING shows as true on the website, even though the "hide from search engines" option was unticked and the robots.txt file remained as above. Ticking "hide my site" does not appear to change the robots.txt file either.
For now I’ve manually created a robots.txt file and thrown in a robots meta tag for good measure, but what is the correct process for this?
@Steve_de_Niese Did you ever find a more automated ‘Trellis-based’ solution, or have you stuck with the manually edited robots.txt file? I’ve got a client that needs the production server not to be indexed. Thanks!
UPDATE:
I’ve added the robots_tag_header config in ./group_vars/production/wordpress_sites.yml and have re-provisioned the server.
robots_tag_header:
  enabled: true
When enabled is set to true, the template ./trellis/roles/wordpress-setup/templates/wordpress-site.conf.j2 adds the X-Robots-Tag header line.
@Steve_de_Niese if you want to hide contents from search engines, you need to forget the robots.txt for the moment.
First, you must tag your pages as noindex. If you block search engines from crawling your website with robots.txt, they never see the noindex directive, so they cannot remove already-indexed pages from their index. You could add a meta tag <meta name="robots" content="noindex, nofollow"> to your HTML pages, or send an X-Robots-Tag HTTP header from your web server (NGINX, Apache, or LiteSpeed).
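If it helps, here is a rough way to check which of the two noindex signals a page actually sends. This is only a sketch in Python using the standard library: the function name noindex_signals and its return shape are my own invention, and it only understands plain comma-separated directives (it ignores user-agent-prefixed values such as "googlebot: noindex").

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects the content attribute of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when the attribute has no value.
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            self.directives.append(attr_map.get("content", ""))


def _has_noindex(value):
    """True if a comma-separated directive list contains noindex (or none)."""
    tokens = [token.strip().lower() for token in value.split(",")]
    return "noindex" in tokens or "none" in tokens


def noindex_signals(headers, html):
    """Report whether the header and/or the meta tag ask for noindex.

    headers: any mapping-like object with .items() (header names are
             matched case-insensitively).
    html:    the page source as a string.
    """
    header_value = ""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_value = value

    parser = _RobotsMetaParser()
    parser.feed(html)

    return {
        "header": _has_noindex(header_value),
        "meta": any(_has_noindex(d) for d in parser.directives),
    }
```

To try it against a live site you could fetch the page with urllib.request.urlopen, then pass the response headers and the decoded body to noindex_signals. If both values come back False, the page is still indexable as far as these two signals go, regardless of what robots.txt says.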
My two cents here: I prefer using an under-construction plugin. This shows an under-construction or maintenance page to visitors who are not logged in, while logged-in users can still view and interact with the site.