Server provisioning works but deploy doesn't

Server provisioning runs and connects fine, but when I run the deploy script I get the output below. Any ideas on what could be wrong? SSH was fine during server provisioning.


PLAY [Ensure necessary variables are defined] **********************************************************************************************************

TASK [Ensure environment is defined] *******************************************************************************************************************
skipping: [localhost]

PLAY [Test Connection] *********************************************************************************************************************************

TASK [connection : Require manual definition of remote-user] *******************************************************************************************
skipping: [...*]

TASK [connection : Specify preferred HostKeyAlgorithms for unknown hosts] ******************************************************************************
skipping: [...*]

TASK [connection : Check whether Ansible can connect as web] *******************************************************************************************
ok: [...* -> localhost]

TASK [connection : Warn about change in host keys] *****************************************************************************************************
skipping: [...*]

TASK [connection : Set remote user for each host] ******************************************************************************************************
skipping: [...*]

TASK [connection : Announce which user was selected] ***************************************************************************************************
skipping: [...*]

TASK [connection : Load become password] ***************************************************************************************************************
skipping: [...*]

PLAY [Deploy WP site] **********************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************
System info:
Ansible; Darwin
Trellis at “Remove potentially dangerous db_import option”

SSH Error: data could not be sent to remote host "...". Make sure
this host can be reached over ssh
fatal: [...**]: UNREACHABLE! => {"changed": false, "unreachable": true}
to retry, use: --limit @/Users/aaronpitts/Sites/papayapods/trellis/deploy.retry

PLAY RECAP *********************************************************************************************************************************************
...* : ok=1 changed=0 unreachable=1 failed=0
localhost : ok=0 changed=0 unreachable=0 failed=0

Can you manually SSH into the server?

Yes, and the server playbook connected via SSH and worked. But the deploy playbook does not.



Looks like you’re using Ansible 2.3. You’ll probably want to add the Trellis updates for Ansible 2.3 compatibility (roots/trellis#813), or just try an older Ansible version:
pip install ansible==2.2.2

The troubleshooting docs show an example error message that looks similar, possibly caused by a host key change, so you might need to clear the server’s host key from your local known_hosts and try again.
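One way to clear a stale entry is `ssh-keygen -R`, which removes every key recorded for a host from a known_hosts file. Here is a minimal sketch; `forget_host` is a hypothetical helper name, and `example.com` is a placeholder for your server's hostname or IP:

```shell
# forget_host HOST [KNOWN_HOSTS_FILE]
# Remove every key stored for HOST so that the next SSH connection can record
# the server's current host key. The helper name and the default path are
# illustrative assumptions, not something Trellis provides.
forget_host() {
  host=$1
  file=${2:-"$HOME/.ssh/known_hosts"}
  # -R removes all keys belonging to the host; -f selects the file to edit.
  ssh-keygen -R "$host" -f "$file"
}

# Usage (placeholder hostname):
#   forget_host example.com
```

ssh-keygen keeps a backup of the previous file next to it with an `.old` suffix, so the change can be undone if the wrong host was removed.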

The troubleshooting docs also mention a tip that could potentially provide more debug information:

Golden rule to debugging any failed command with Ansible:
… 3. Re-run the command in verbose mode
ansible-playbook deploy.yml -vvvv -e "site=<domain> env=<environment>"
if necessary to get more details.

For anyone else who finds this thread, Trellis recently added a step to provisioning that regenerates the host’s key and I have yet to remember that when deploying. My first deploy always fails and I panic for 3 seconds until I remember to delete my known_hosts entry and try again.


In other contexts, I’ve heard people express frustration about having to switch to more secure host keys, so here are a few more notes to help people understand and take control of their situation. See also this overview of SSH keys and known_hosts.

The host key the server offers is based on what your local machine SSH client requests and what the server is allowed to offer (per its configuration).

Trellis doesn’t cause regeneration of host keys, but the Trellis default configures the server to offer only the most secure host key types (ed25519 or rsa). If you have to change host keys, it means that an older, less secure host key type (probably ecdsa) slipped into your known_hosts. That probably happened because you built the server with a Trellis version prior to roots/trellis#744 (back when the server didn’t control key type), or because you made your first SSH connection to the server manually instead of via Trellis and your SSH client didn’t control key type.

You could configure the SSH client on your local machine (in ~/.ssh/config) to request only secure host key types and never have this issue again (i.e., specify preferred HostKeyAlgorithms). If your ssh -V shows version 6.5 or newer, you may specify both the ed25519 and rsa types. If older, specify the rsa types only (example lists).
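The version rule above can be sketched as a small shell helper. Both function names are hypothetical, and the algorithm lists are simplified assumptions (`ssh-ed25519,ssh-rsa` and `ssh-rsa`), not the linked example lists, which are not reproduced here:

```shell
# pick_host_key_algorithms VERSION
# Print a HostKeyAlgorithms value for an OpenSSH version such as "7.4p1" or
# the raw text printed by `ssh -V`. The lists are simplified assumptions.
pick_host_key_algorithms() {
  version=${1#OpenSSH_}          # drop the prefix printed by `ssh -V`
  major=${version%%.*}           # text before the first dot
  case $version in
    *.*) rest=${version#*.}; minor=${rest%%[!0-9]*} ;;
    *)   minor=0 ;;
  esac
  minor=${minor:-0}
  # ed25519 host keys are available from OpenSSH 6.5 onward.
  if [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -ge 5 ]; }; then
    printf '%s\n' 'ssh-ed25519,ssh-rsa'
  else
    printf '%s\n' 'ssh-rsa'
  fi
}

# print_host_key_stanza HOST ALGORITHMS
# Print a stanza that could be added to ~/.ssh/config by hand.
print_host_key_stanza() {
  printf 'Host %s\n    HostKeyAlgorithms %s\n' "$1" "$2"
}

# Usage (placeholder hostname; `ssh -V` writes to stderr, hence 2>&1):
#   print_host_key_stanza example.com "$(pick_host_key_algorithms "$(ssh -V 2>&1)")"
```

The sketch only prints the stanza rather than editing ~/.ssh/config, so you can review it before pasting it in. Version strings that do not start with a number after the `OpenSSH_` prefix are not handled.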

Even if you haven’t specified secure HostKeyAlgorithms in your own ~/.ssh/config, Trellis does what it can to prevent host key changes, trying to help you get a secure host key type (ed25519 or rsa) from the very beginning with new servers. There are details in roots/trellis#798 about how Trellis tries to help you use secure host keys with minimal inconvenience (via a one-time HostKeyAlgorithms SSH connection option). If your known_hosts happens to have the less secure ecdsa host key type, it’s a one-time procedure per server to change to a more secure key for that server (update key type in known_hosts).

If someone cares only about convenience, not security, and doesn’t want to adjust HostKeyAlgorithms in ~/.ssh/config, here is a workaround (not recommended). The sshd role README describes how to customize [the sshd role] via variables:

You may redefine any variable found in templates/sshd_config.j2 or templates/ssh_config.j2. The default settings are viewable in defaults/main.yml. To override a setting, you could redefine your chosen variable in a file such as group_vars/all/main.yml or group_vars/all/security.yml.

The relevant variable to redefine in the case of host keys is sshd_host_keys. You would need to add back the less secure ecdsa type. Trellis doesn’t regenerate host keys, but this setting tells your server which of its host keys it is allowed to offer SSH clients that are trying to connect. Here is an example of allowing the server to offer the less secure ecdsa host key type:

# group_vars/all/main.yml  --  this is less secure
sshd_host_keys:
  - /etc/ssh/ssh_host_ecdsa_key
  - /etc/ssh/ssh_host_ed25519_key
  - /etc/ssh/ssh_host_rsa_key

For me, the issue was that the vendor provided the server with only port 2224 open. This required following the “Using Non-standard SSH Port” instructions:

[production] ansible_ssh_port=2224

Then, after provisioning, I needed to delete the entry in my known_hosts and remove ansible_ssh_port=2224.
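A detail that may help here: OpenSSH records hosts reached on a non-default port as `[host]:port` in known_hosts, so the entry to delete is named differently from a plain port 22 entry. A minimal sketch, where `known_hosts_name` is a hypothetical helper and `example.com` is a placeholder:

```shell
# known_hosts_name HOST [PORT]
# Print the name under which OpenSSH records HOST in known_hosts.
# Port 22 entries use the bare host; other ports use the [host]:port form.
known_hosts_name() {
  host=$1
  port=${2:-22}
  if [ "$port" = 22 ]; then
    printf '%s\n' "$host"
  else
    printf '[%s]:%s\n' "$host" "$port"
  fi
}

# Usage (placeholder hostname): remove the entry recorded for port 2224.
#   ssh-keygen -R "$(known_hosts_name example.com 2224)"
```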

Then all was hunky-dory.