Deploy production fails

Hi again strarsis, and thanks for your support. I already have trellis-cli installed and tried both provision and deploy before the Ansible attempt, but the result is the same, honestly.
In case I haven't mentioned it, trellis provision development works like a charm, so I don't really know what the difference could be.

Hi again strarsis, I have double-checked, and in the other 3 projects I have, there aren't any ./public_keys either. Is that a requirement during provisioning? I cannot find any reference to this in the documentation.

This is not a requirement, in the sense that you can set up those SSH keys yourself. But Trellis already offers a mechanism for installing the SSH keys used for deployment, so it makes sense to use it.
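Since whatever key file is supplied ends up installed on the server, it can be worth confirming that the file is structurally sound first. The following is only a rough Python sketch, not part of Trellis: the function name `check_public_key_line` is hypothetical, and the key below is synthetic filler built for the demonstration. It checks that the base64 part decodes and that the key type embedded in the data matches the type declared at the start of the line.

```python
import base64
import binascii
import struct


def check_public_key_line(line):
    """Return (ok, reason) for one '<type> <base64> [comment]' line."""
    fields = line.strip().split(None, 2)
    if len(fields) < 2:
        return False, "expected '<type> <base64> [comment]'"
    key_type, blob_b64 = fields[0], fields[1]
    try:
        blob = base64.b64decode(blob_b64, validate=True)
    except (binascii.Error, ValueError):
        return False, "key data is not valid base64"
    if len(blob) < 4:
        return False, "key data is too short"
    # The blob starts with a 4-byte big-endian length followed by the key type.
    (name_len,) = struct.unpack(">I", blob[:4])
    embedded_type = blob[4:4 + name_len]
    if embedded_type != key_type.encode("ascii", "replace"):
        return False, "embedded key type does not match the declared type"
    return True, "structure looks plausible"


# Synthetic, non-functional key material built only for this demonstration.
fake_blob = struct.pack(">I", 11) + b"ssh-ed25519" + struct.pack(">I", 32) + bytes(32)
good_line = "ssh-ed25519 " + base64.b64encode(fake_blob).decode("ascii") + " user@example"
# Simulate a key that was cut off while being copied.
truncated_line = good_line.split()[0] + " " + good_line.split()[1][:10]

print(check_public_key_line(good_line))
print(check_public_key_line(truncated_line))
```

This only catches obvious damage such as truncation or a mangled paste; it does not prove the key is usable. Running `ssh-keygen -l -f` on the public key file is the more authoritative check, since it refuses to print a fingerprint for a file it cannot parse.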

I see what you mean, but it still didn't work, unfortunately :frowning:

We need to differentiate between provisioning and deployment and where exactly things fail.

Does trellis provision production work?
And trellis provision development works fine, correct?

Hi strarsis and thanks for your patience :sweat_smile:
I should really change the topic's title, since the actual problem is with provisioning production, not deploying it. As a matter of fact, weirdly, deploying to production works.

trellis provision development :white_check_mark:

TASK [wordpress-install : Update WP Multisite Home URL] ************************
skipping: [default] => (item=ernestoianuario.local)

PLAY RECAP *********************************************************************
default                    : ok=121  changed=9    unreachable=0    failed=0    skipped=38   rescued=0    ignored=0

trellis deploy production :white_check_mark:

TASK [deploy : debug] **********************************************************
ok: [XXXXXX] => {
    "msg": "XXXXXX deployed as release 20230606072346"
}

TASK [deploy : Check if deploy_after scripts exist] ****************************

TASK [deploy : include_tasks] **************************************************

PLAY RECAP *********************************************************************
XXXXXX            : ok=36   changed=13   unreachable=0    failed=0    skipped=39   rescued=0    ignored=0
localhost                  : ok=0    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0

Does the group_vars/production/vault.yml in your Trellis project contain all the fields that are used by default?

Have you added additional configuration, parameters or flags to hosts/production?
https://github.com/roots/trellis/blob/02cfc360911c545611d624c01ce4d97f197c6d00/hosts/production

Hi strarsis, yes, I have just used ansible-vault decrypt vault.yml to check, and it looks like all the fields are there (I remember I used https://roots.io/salts.html to generate hashed passwords, as per the documentation).

vault_mysql_root_password: XXXXXX
vault_users:
  name: '{{ admin_user }}'
  password: XXXXXX
  salt: XXXXXX

vault_wordpress_sites:
  mywebsite.com:
    admin_password: XXXXXX
    env:
      db_password: XXXXXX
      auth_key: "XXXXXX"
      secure_auth_key: "XXXXXX"
      logged_in_key: "XXXXXX"
      nonce_key: "XXXXXX"
      auth_salt: "XXXXXX"
      secure_auth_salt: "XXXXXX"
      logged_in_salt: "XXXXXX"
      nonce_salt: "XXXXXX"

Also, in hosts/production I have my server IP added under both production and web:

# Add each host to the [production] group and to a "type" group such as [web] or [db].

# List each machine only once per [group], even if it will host multiple sites.

[production]
XXXXXX

[web]
XXXXXX

What ansible version are you using? $ ansible --version

One thing to note is that ansible versions are not the same as ansible-core versions (see "Releases and maintenance" in the Ansible documentation).
For example, ansible 8 contains ansible-core 2.15.

I use this version => core 2.14.1

XXXXXX on master [!] on 🐳 v23.0.6 (orbstack)
➜ ansible --version
ansible [core 2.14.1]

In my requirements.txt:

ansible>=2.10.0
ansible-core<2.13.6
passlib

Do you think I need to update ansible-core to version 2.15? Might that be causing the issue?

I am currently using 2.9.6 and deploying and provisioning fine. Could be that you need to downgrade to 2.9.


After comparing the Ansible files, it would not hurt to try out a different Ansible version.
You could always revert to the previous version if this does not affect anything.
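For reference, the version reported above (core 2.14.1) falls outside the `ansible-core<2.13.6` pin quoted from the requirements file. A minimal Python sketch of that comparison follows; the helper names `parse_version` and `extract_core_version` are made up for illustration, and the parsing assumes plain dotted numeric versions (no release-candidate suffixes).

```python
import re


def parse_version(text):
    """Turn '2.14.1' into (2, 14, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def extract_core_version(banner):
    """Pull the core version out of the first line of `ansible --version`."""
    match = re.search(r"core (\d+(?:\.\d+)*)", banner)
    if match is None:
        raise ValueError(f"no core version found in {banner!r}")
    return parse_version(match.group(1))


# Values copied from the posts above.
installed = extract_core_version("ansible [core 2.14.1]")
upper_bound = parse_version("2.13.6")  # from the `ansible-core<2.13.6` line

satisfies_pin = installed < upper_bound
print(installed, satisfies_pin)  # (2, 14, 1) False
```

Comparing tuples rather than strings matters here: as strings, "2.9" would sort after "2.14". Whether this mismatch is related to the provisioning failure is not established; it only shows that the installed version is not the one the project pins.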

Some ansible errors can be caused by an incompatible ansible version indeed, like this related issue:

Hi rguttersohn, and thanks for your suggestion. Unfortunately, downgrading to 2.9 didn't help: I receive the same error. But I might be onto something. I completely uninstalled both ansible-core and ansible, and when I ran the installation for 2.9 with pip, this message popped up:

XXXXXX/trellis on master [!] on 🐳 v23.0.6 (orbstack) took 2.2s
➜ pip3 install -I ansible==2.9
WARNING: Skipping /opt/homebrew/lib/python3.11/site-packages/ansible_core-2.15.0.dist-info due to invalid metadata entry 'name'
Collecting ansible==2.9
  Using cached ansible-2.9.0-py3-none-any.whl
Collecting jinja2 (from ansible==2.9)
  Using cached Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting PyYAML (from ansible==2.9)
  Using cached PyYAML-6.0-cp311-cp311-macosx_11_0_arm64.whl (167 kB)
Collecting cryptography (from ansible==2.9)
  Using cached cryptography-41.0.1-cp37-abi3-macosx_10_12_universal2.whl (5.3 MB)
Collecting cffi>=1.12 (from cryptography->ansible==2.9)
  Using cached cffi-1.15.1-cp311-cp311-macosx_11_0_arm64.whl (174 kB)
Collecting MarkupSafe>=2.0 (from jinja2->ansible==2.9)
  Using cached MarkupSafe-2.1.3-cp311-cp311-macosx_10_9_universal2.whl (17 kB)
Collecting pycparser (from cffi>=1.12->cryptography->ansible==2.9)
  Using cached pycparser-2.21-py2.py3-none-any.whl (118 kB)
WARNING: Skipping /opt/homebrew/lib/python3.11/site-packages/ansible_core-2.15.0.dist-info due to invalid metadata entry 'name'
Installing collected packages: PyYAML, pycparser, MarkupSafe, jinja2, cffi, cryptography, ansible
WARNING: Skipping /opt/homebrew/lib/python3.11/site-packages/ansible_core-2.15.0.dist-info due to invalid metadata entry 'name'
WARNING: Skipping /opt/homebrew/lib/python3.11/site-packages/ansible_core-2.15.0.dist-info due to invalid metadata entry 'name'
WARNING: Skipping /opt/homebrew/lib/python3.11/site-packages/ansible_core-2.15.0.dist-info due to invalid metadata entry 'name'
Successfully installed MarkupSafe-2.1.3 PyYAML-6.0 ansible-2.9.0 cffi-1.15.1 cryptography-41.0.1 jinja2-3.1.2 pycparser-2.21
WARNING: Skipping /opt/homebrew/lib/python3.11/site-packages/ansible_core-2.15.0.dist-info due to invalid metadata entry 'name'

@strarsis that warning mentioning 'name' sounds suspicious. Also, as mentioned, ansible-core shouldn't even be there. Any thoughts?

Just a heads up I am running Ansible 2.9 under Python 3.7. Again, I am currently having no issues deploying/provisioning both old and newer trellis/bedrock/sage sites.

I hate to lead you down the wrong track potentially. Do you use pyenv to manage your python versions? If so, it will make switching between versions very easy if downgrading to Python 3.7 and using it to install ansible 2.9 doesn’t solve the issue.

Hi rguttersohn, sorry for the delay in answering, and thanks for your help. After some research, I found out that, as strarsis suggested, my public SSH key was somehow corrupted, so now I am able to deploy without any problem. I have also imported the database successfully using wp-cli and installed the Composer dependencies, but now I have another issue :disappointed_relieved:

Basically, the website is showing a blank page, although I have access to wp-admin and there are no errors in the site's logs, in the network tab, or in the browser console on the website itself :frowning:
Any thoughts?

Ernesto

Is there anything in the source of the page, or is it fully empty?
Can you open a page other than the front page (for example, using the "View" link in the pages list in the admin area)?

In the admin area, go to Appearance, Themes and check whether the correct theme is active.
Also verify that under Settings, Reading the correct front page and posts page are set.

Hi @strarsis, thanks for the quick reply.
Yep, page fully empty

I am not using any posts or pages at the moment; it's an SP website so far. And yes, my theme is actually active.
I noticed a couple of things, though: there is a WP version mismatch between prod and local, one of the plugins I have installed is not there, and the name of the website on production is mysitename.local, which is absolutely weird, isn't it?

I've had this happen to me when I had a namespace issue. For me, it was macOS and Ubuntu having different case sensitivities. However, I believe the issue was reported in my server's error log.

Thanks for your answer. I have actually solved the issue by checking the error logs: the web user didn't have permissions on the vendor folder on the web server. Once I granted those, everything worked!
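For anyone landing here with the same blank page, a permission problem like this can be located with a small script. What follows is a sketch under stated assumptions, not what was actually run in this thread: the function names are hypothetical, it only applies the classic owner/group/other mode bits (ACLs, root privileges, and parent directories above the starting folder are ignored), and the demonstration uses a throwaway directory rather than a real vendor folder.

```python
import os
import stat
import tempfile


def can_read(st, uid, gids):
    """Apply the classic owner/group/other rules to one stat result."""
    is_dir = stat.S_ISDIR(st.st_mode)
    if st.st_uid == uid:
        need = stat.S_IRUSR | (stat.S_IXUSR if is_dir else 0)
    elif st.st_gid in gids:
        need = stat.S_IRGRP | (stat.S_IXGRP if is_dir else 0)
    else:
        need = stat.S_IROTH | (stat.S_IXOTH if is_dir else 0)
    return (st.st_mode & need) == need


def unreadable_for(root, uid, gids):
    """Return paths under root (root included) the given user could not read."""
    blocked = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in candidates:
            if not can_read(os.lstat(path), uid, gids):
                blocked.append(path)
    return blocked


# Demonstration on a temporary directory standing in for a vendor folder.
demo_root = tempfile.mkdtemp()
os.chmod(demo_root, 0o755)
open_file = os.path.join(demo_root, "autoload.php")
locked_file = os.path.join(demo_root, "installed.json")
for path, mode in ((open_file, 0o644), (locked_file, 0o600)):
    with open(path, "w") as handle:
        handle.write("placeholder\n")
    os.chmod(path, mode)

other_uid = os.stat(demo_root).st_uid + 1  # any uid that is not the owner
blocked = unreadable_for(demo_root, other_uid, gids=set())
print(blocked)  # only the 0o600 file is reported
```

On a real server, the starting folder would be the site's vendor directory, and the uid and group ids would be those of the web user (for example, looked up with `pwd.getpwnam`). The script has to be run by a user who can traverse the tree itself, otherwise `os.walk` silently skips what it cannot list.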