Deploy production fails

Hi everyone, I have a bit of a problem deploying to production. I tried the trellis-cli command trellis deploy production and I receive this error:

[WARNING]: Unhandled error in Python interpreter discovery for host
********: Failed to connect to the host via ssh:
web@********: Permission denied (publickey).
fatal: [**********]: UNREACHABLE! => {"changed": false, "msg": "Data could not be sent to remote host \"**********\". Make sure this host can be reached over ssh: web@********: Permission denied (publickey).\r\n", "unreachable": true}

I connected via ssh to my server and the key is actually there, so I first tried to re-provision manually with ansible, using verbose output: ansible-playbook server.yml --ask-vault-password -e env=production -vvvv

But I got this other error:

TASK [Gathering Facts] **********************************************************************************************************************
task path: /Users/ernestoianuario/ernestoianuario.com/trellis/server.yml:12
fatal: [*********]: FAILED! => {
    "msg": "The field 'become_pass' has an invalid value, which includes an undefined variable. The error was: 'ansible.parsing.yaml.objects.AnsibleUnicode object' has no attribute 'name'. 'ansible.parsing.yaml.objects.AnsibleUnicode object' has no attribute 'name'. 'ansible.parsing.yaml.objects.AnsibleUnicode object' has no attribute 'name'. 'ansible.parsing.yaml.objects.AnsibleUnicode object' has no attribute 'name'"
}

I tried to google it but nothing comes up. Any thought/suggestion?
Thanks in advance,

Ernesto

Can you connect to the server manually, using ssh, with the corresponding private key, same username?

Thanks for your answer. I just checked and your assumption was right: I cannot connect via ssh as the web user, but I can as the root user.
I am not sure which “private key” you are referring to. How/where can I add it and connect? I assume that’s the issue here, but I am not sure how to solve it; I am pretty much a newbie with trellis :sweat_smile:.
I was able to connect previously; in fact, I already have a testing trellis website on my server. I didn’t change any key afaik, and I tried to decrypt all the vault passwords and all is fine. Would you mind giving me some tips please?

Sure, so usually you create a private-public keypair on your workstation and then you put the public key onto the web server (under .ssh/authorized_keys). The private key stays on your system, ideally never transmitted elsewhere; it is used to authenticate against the public key.

Trellis handles this for you, too (which is really nice): You put the public keys into the public_keys/ directory of your Trellis project directory, then those keys are copied during provisioning (not deploy!). Are there any public keys in public_keys/ yet?

When you try to SSH as deployment user (usually web), use the -v flag to make the output more verbose. Then you can see what the SSH client is actually trying to use to authenticate, e.g. also your private keys.
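
A quick way to rule out a key mismatch is to compare the public key on the workstation with the entries in the authorized_keys file of the user you log in as (for deployments that is usually web, so the file in that user's home directory is the one that counts, not root's). Below is a minimal Python sketch of that comparison. The function names and the sample key strings are made up for illustration; it only compares the base64 key material and ignores comments and options.

```python
from typing import Optional


def key_blob(line: str) -> Optional[str]:
    """Return the base64 key material from a public key or authorized_keys line."""
    parts = line.strip().split()
    for index, part in enumerate(parts):
        # The key type (ssh-ed25519, ssh-rsa, ecdsa-...) is followed by the blob.
        if part.startswith(("ssh-", "ecdsa-", "sk-")) and index + 1 < len(parts):
            return parts[index + 1]
    return None


def is_authorized(public_key_line: str, authorized_keys_text: str) -> bool:
    """Check whether the given public key appears in the authorized_keys content."""
    wanted = key_blob(public_key_line)
    if wanted is None:
        return False
    for line in authorized_keys_text.splitlines():
        if line.strip().startswith("#"):
            continue  # skip comment lines
        if key_blob(line) == wanted:
            return True
    return False


# Made-up sample data, not real keys.
local_public_key = "ssh-ed25519 AAAAexampleLocalKey me@workstation"
server_authorized_keys = """# keys allowed to log in as this user
ssh-rsa AAAAexampleOtherKey someone@else
ssh-ed25519 AAAAexampleLocalKey me@workstation
"""

print(is_authorized(local_public_key, server_authorized_keys))
```

If this kind of check comes back negative for the web user's file, the "Permission denied (publickey)" message would be the expected result, and the ssh -v output should show which private keys were offered.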

Hi starsis and thank you again for helping me out! :raised_hands:
I confirm I do have .ssh/authorized_keys on my webserver:

ubuntu@XXXXXX:~/.ssh$ ls
authorized_keys  id_rsa  id_rsa.pub

I also have private and public keys on my workstation:

~/.ssh on 🐳 v23.0.6 (orbstack)
➜ ls
config			id_ed25519		id_rsa			known_hosts
config.trellis_backup	id_ed25519.pub		id_rsa.pub		known_hosts.old

But I have no public_keys under my trellis/public_keys folder:

XXXXXX/trellis/public_keys on  master [⇡] on 🐳 v23.0.6 (orbstack) via 🅐 v2.14.1
➜ ls -la
total 0
drwxr-xr-x   3 XXXXXX  staff   96 20 Dec 13:16 .
drwxr-xr-x  29 XXXXXX  staff  928  4 Jan 18:27 ..
-rw-r--r--   1 XXXXXX  staff    0 20 Dec 13:16 .gitkeep

Should I create id_rsa and id_rsa.pub files under the trellis/public_keys folder? Did I understand correctly? If I am getting this right, shouldn’t those keys be already there, since I have already deployed in production once? I don’t remember adding those at any point :thinking:
Thank you in advance!

Ernesto

You copy the public key to ./public_keys and then re-provision the Trellis server.
Then you SSH into the Trellis server using the web user (which is used for deployments).
When that works, then Trellis deploys will also work, as now Trellis should be able to SSH as web user as well.

Also see this page about how SSH keys should be managed in Trellis:

Hi again strarsis, I have copied the id_rsa.pub key into the public_keys folder, but the ansible-playbook deploy.yml --ask-vault-password -e "site=ernestoianuario.com env=production" command fails and gives the same error as before :frowning:

XXXXXX/trellis/public_keys on  master [⇡?] on 🐳 v23.0.6 (orbstack) via 🅐 v2.14.1
➜ ls
id_rsa.pub
TASK [Gathering Facts] ********************************************************************************************************************************
task path: /Users/XXXXXX/XXXXXX/trellis/server.yml:12
fatal: [XXXXXX]: FAILED! => {
    "msg": "The field 'become_pass' has an invalid value, which includes an undefined variable. The error was: 'ansible.parsing.yaml.objects.AnsibleUnicode object' has no attribute 'name'. 'ansible.parsing.yaml.objects.AnsibleUnicode object' has no attribute 'name'. 'ansible.parsing.yaml.objects.AnsibleUnicode object' has no attribute 'name'. 'ansible.parsing.yaml.objects.AnsibleUnicode object' has no attribute 'name'"
}

It appears that some password is not set or some user name is missing.
It would be helpful to use the Trellis CLI tool for running the Trellis provision and deploy,
as Trellis CLI does initial checks and ensures that necessary parameters are set:

Hi again starsis and thanks for your support. I have trellis cli already installed and tried both provision and deploy before the ansible attempt, but the result is the same honestly.
If I haven’t mentioned it, trellis provision development works like a charm so I don’t really know what the difference can be?

Hi again starsis, I have double-checked and on the other 3 projects I have, I don’t have anything in ./public_keys either. Is that a requirement during provision? I cannot find any reference to this in the documentation.

This is not a requirement, in the sense that you can set up those SSH keys yourself. But Trellis already offers a mechanism for installing the SSH keys used for deployment, hence it makes sense to use it.

I see what you mean but it still didn’t work unfortunately :frowning:

We need to differentiate between provisioning and deployment and where exactly things fail.

Does trellis provision production work?
And trellis provision development works fine, correct?

Hi strarsis and thanks for your patience :sweat_smile:
I should really change the topic’s title, since the actual problem is with provisioning production, not deploying it; as a matter of fact, weirdly, deploying production works.

trellis provision development :white_check_mark:

TASK [wordpress-install : Update WP Multisite Home URL] ************************
skipping: [default] => (item=ernestoianuario.local)

PLAY RECAP *********************************************************************
default                    : ok=121  changed=9    unreachable=0    failed=0    skipped=38   rescued=0    ignored=0

trellis deploy production :white_check_mark:

TASK [deploy : debug] **********************************************************
ok: [XXXXXX] => {
    "msg": "XXXXXX deployed as release 20230606072346"
}

TASK [deploy : Check if deploy_after scripts exist] ****************************

TASK [deploy : include_tasks] **************************************************

PLAY RECAP *********************************************************************
XXXXXX            : ok=36   changed=13   unreachable=0    failed=0    skipped=39   rescued=0    ignored=0
localhost                  : ok=0    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0

Does the group_vars/production/vault.yml in your Trellis project contain all the fields used by default?

Have you added additional configuration, parameters or flags to hosts/production?
https://github.com/roots/trellis/blob/02cfc360911c545611d624c01ce4d97f197c6d00/hosts/production

Hi strarsis, yes, I have just used ansible-vault decrypt vault.yml to check it and it looks like all fields are there (I remember I used https://roots.io/salts.html to generate hashed passwords as per the documentation).

vault_mysql_root_password: XXXXXX
vault_users:
  name: '{{ admin_user }}'
  password: XXXXXX
  salt: XXXXXX

vault_wordpress_sites:
  mywebsite.com:
    admin_password: XXXXXX
    env:
      db_password: XXXXXX
      auth_key: "XXXXXX"
      secure_auth_key: "XXXXXX"
      logged_in_key: "XXXXXX"
      nonce_key: "XXXXXX"
      auth_salt: "XXXXXX"
      secure_auth_salt: "XXXXXX"
      logged_in_salt: "XXXXXX"
      nonce_salt: "XXXXXX"

Also, in hosts/production I have my server IP added for production and web:

# Add each host to the [production] group and to a "type" group such as [web] or [db].

# List each machine only once per [group], even if it will host multiple sites.

[production]
XXXXXX

[web]
XXXXXX
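
One detail in the vault.yml excerpt above may be worth a second look: vault_users is shown as a single mapping (name, password and salt directly under the key), whereas the default Trellis vault.yml, as far as I know, defines it as a list whose item starts with `- name:`. The dash may simply have been lost when pasting, so treat this as a guess and not a diagnosis. If the file really has no dash, looping over vault_users yields the key strings instead of user entries, and a string has no `name`, which would resemble the "has no attribute 'name'" message. The Python sketch below only imitates that lookup; `lookup_field` and `find_become_password` are hypothetical helpers, not Trellis or Ansible code.

```python
def lookup_field(item, field):
    """Imitate a template-style lookup of item.field on a user entry."""
    if isinstance(item, dict):
        return item[field]
    # Anything that is not a mapping (e.g. a plain string) has no such field.
    raise AttributeError(
        f"{type(item).__name__!r} object has no attribute {field!r}"
    )


def find_become_password(vault_users, admin_user):
    """Return the password of the entry whose name matches admin_user."""
    for user in vault_users:  # iterating a dict yields its keys (strings)
        if lookup_field(user, "name") == admin_user:
            return lookup_field(user, "password")
    return None


# List form (assumed Trellis default): one mapping per user.
users_as_list = [{"name": "admin", "password": "secret", "salt": "pepper"}]
print(find_become_password(users_as_list, "admin"))

# Mapping form, as in the excerpt above: the loop sees "name", "password", "salt".
users_as_mapping = {"name": "admin", "password": "secret", "salt": "pepper"}
try:
    find_become_password(users_as_mapping, "admin")
except AttributeError as error:
    print(f"lookup failed: {error}")
```

Comparing the production vault.yml with the development one (which provisions fine) would show quickly whether the two differ in this respect.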

What ansible version are you using? $ ansible --version

One thing to note is that the ansible versions are not the same as the ansible-core versions (see “Releases and maintenance” in the Ansible documentation). For example, ansible 8 contains ansible-core 2.15.

I use this version => core 2.14.1

XXXXXX on  master [!] on 🐳 v23.0.6 (orbstack)
➜ ansible --version
ansible [core 2.14.1]

In my requirements.txt:

ansible>=2.10.0
ansible-core<2.13.6
passlib

Do you think I need to update ansible-core to version 2.15? Might that be causing the issue?
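
For what it's worth, the two snippets above already disagree: requirements.txt asks for ansible-core<2.13.6, while ansible --version reports core 2.14.1. Here is a minimal Python sketch of that comparison. The helper names are hypothetical, and it only handles plain dotted numeric versions, not the full pip specifier syntax.

```python
def parse_version(text):
    """Turn a dotted version such as '2.14.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def is_below(installed, upper_bound):
    """Return True if the installed version satisfies '< upper_bound'."""
    return parse_version(installed) < parse_version(upper_bound)


installed = "2.14.1"      # from `ansible --version` above
upper_bound = "2.13.6"    # from `ansible-core<2.13.6` in the requirements file

print(is_below(installed, upper_bound))  # False
```

So the ansible on the PATH was probably not installed from this requirements file. Installing from it inside a virtualenv (for example with pip install -r requirements.txt) would be one way to get a version inside the pinned range before trying again.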

I am currently using 2.9.6 and deploying and provisioning fine. Could be that you need to downgrade to 2.9.

After comparing the ansible files, it would not hurt to try a different ansible version.
You could always revert to the previous version if this does not affect anything.