Trouble Setting Up SSH Forwarding, I think

When running ./bin/deploy.sh staging staging.kangapestcontrol.com as instructed in the docs I’m getting the following error:

  1. I understand how to generate SSH Keys, and I understand what they are and their purpose.
  2. I’ve added SSH-keys for a server to GitHub.com before so I could push and pull.
  3. I also understand SSH forwarding in principle.

But this is my first time doing it as described and I feel like this one is over my head.

Here is my trellis/group_vars/all/users.yml file.

Trellis regenerates the server’s key after initial provisioning. Remove the server’s entry from ~/.ssh/known_hosts and deploy again.


Thanks for the reply. Unfortunately deleting known_hosts and re-running didn’t resolve the issue.

1) @s3w47m88 Could you post the complete and verbose output (add -vvvv)?

ansible-playbook deploy.yml -e env=staging -e site=staging.OMITTED.com -vvvv

2) It looks like you’re using the default web_user so could you SSH in to the server as the web user, run this command, and share the output?

web@staging.OMITTED.com:~$ ls -alh /home/web/.ssh

drwx------ 2 web www-data 4.0K Mar 30 19:24 .
drwxr-xr-x 7 web www-data 4.0K Mar 30 19:24 ..
-rw------- 1 web www-data  399 Mar 30 18:43 authorized_keys
-rw-r--r-- 1 web www-data 2.1K Apr  1 01:54 known_hosts

We need to be sure that the /home/web/.ssh/known_hosts file exists and has permissions and ownership that will work for the web user (-rw-r--r-- 1 web www-data). If the file presence, permissions, or ownership differ from above, it will be helpful if you share any ideas on how such defaults might have been changed, and share which cloud hosting you’re using (e.g., DigitalOcean, AWS, etc.) and which base Ubuntu image you’re using from that provider.

3) Could you ensure your version of Trellis includes known_hosts-related updates from roots/trellis#799?
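
For the check in 2), a small sketch that prints presence, mode, and ownership in one line may be easier to paste back here. The function name describe_known_hosts is made up for illustration (it is not part of Trellis), and it assumes GNU stat, as shipped with Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper (not part of Trellis): print the octal mode and
# owner:group of a known_hosts file, or report that the file is missing.
# Assumes GNU stat (Ubuntu); the default path is the Trellis web user's file.
describe_known_hosts() {
  file="${1:-/home/web/.ssh/known_hosts}"
  if [ ! -e "$file" ]; then
    echo "missing: $file"
    return 1
  fi
  # %a = octal mode, %U = owner name, %G = group name
  stat -c '%a %U:%G' "$file"
}
```

Run as the web user on a server that matches the listing above, describe_known_hosts should print something like 644 web:www-data. A different mode or owner is the detail worth sharing.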


The known_hosts feature is not essential for all users. You may not need it. To temporarily get past this blocking task in your deploy, you could try commenting out the list of known_hosts and adding this empty list definition just above or below:

known_hosts: []

Here is more conceptual info on known_hosts as related to this task in Trellis.


Yep.

1. https://gist.github.com/s3w47m88/2acf95fbb7ab1ea2eae3386a5005f7de

2. web@kangapestcontrol:~/.ssh$ ls -lA
total 12
-rw------- 1 web www-data  426 Apr 4 19:08 authorized_keys
-rw------- 1 web www-data 1680 Apr 5 00:44 known_hosts
-rw-r--r-- 1 web www-data 1680 Apr 4 22:38 known_hosts.old

It was root:root, but I believe that’s because I deleted it. So when ./bin/deploy.sh staging staging.kangapestcontrol.com re-ran, it probably just let the system choose which user recreated it. I didn’t check its permissions before I deleted known_hosts.

3. I have not yet attempted to update Trellis, and I don’t see anything in the docs explaining how to do that (maybe it’s there but I am only at like step 4 in the docs). I can confirm the files mentioned in that PR do not match my version. My version appears to be older.

4. I tried known_hosts: [] but that didn’t produce a different result.

Thanks for posting the verbose output. It shows that your change to known_hosts: [] made a difference. Compared to your initial post, the Add known_hosts task no longer fails (no longer does anything).

The latter part of this post links a few threads on updating trellis, but I don’t think an update is required to fix the new/latest issue appearing in your verbose output.

I think you’ll be on your way if you heed the verbose output’s advice:

WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!
...

Offending RSA key in /home/web/.ssh/known_hosts:5
  remove with:
  ssh-keygen -f "/home/web/.ssh/known_hosts" -R gitlab.iteratemarketing.com

Run that command on the server, then try the deploy again.
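
If it is unclear whether a stale entry is still present, a sketch like the following checks before removing. The helper name forget_host is hypothetical. It relies only on ssh-keygen's standard -F (find) and -R (remove) options, which also match hashed hostnames.

```shell
#!/bin/sh
# Hypothetical helper: remove a host's entries from a known_hosts file only
# if ssh-keygen can actually find one there.
forget_host() {
  host="$1"
  file="${2:-$HOME/.ssh/known_hosts}"
  if ssh-keygen -F "$host" -f "$file" >/dev/null 2>&1; then
    # -R rewrites the file and keeps the previous contents in "$file.old"
    ssh-keygen -R "$host" -f "$file" >/dev/null 2>&1
    echo "removed $host from $file"
  else
    echo "no entry for $host in $file"
  fi
}
```

For this thread, that would be forget_host gitlab.iteratemarketing.com /home/web/.ssh/known_hosts, run as the web user. Because -R rewrites the file, it is probably worth re-checking its mode and ownership afterwards, especially if the command was run as root.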


Thanks. I didn’t read through the verbose output. But I did see that error earlier on in my testing and tried that.

I did it again just now (ssh-keygen -f "/home/web/.ssh/known_hosts" -R gitlab.iteratemarketing.com) but I’m getting the same result:

`TASK [deploy : Clone project files] ********************************************
System info:
Ansible 2.2.1.0; Darwin
Trellis at "Check Ansible version before Ansible validates task attributes"

MODULE FAILURE
Traceback (most recent call last):
  File "/tmp/ansible_p2EVck/ansible_module_git.py", line 1033, in <module>
    main()
  File "/tmp/ansible_p2EVck/ansible_module_git.py", line 920, in main
    add_git_host_key(module, repo, accept_hostkey=module.params['accept_hostkey'])
  File "/tmp/ansible_p2EVck/ansible_modlib.zip/ansible/module_utils/known_hosts.py", line 56, in add_git_host_key
  File "/tmp/ansible_p2EVck/ansible_modlib.zip/ansible/module_utils/known_hosts.py", line 193, in add_host_key
  File "/tmp/ansible_p2EVck/ansible_modlib.zip/ansible/module_utils/basic.py", line 2300, in append_to_file
IOError: [Errno 13] Permission denied: '/home/web/.ssh/known_hosts'

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: IOError: [Errno 13] Permission denied: '/home/web/.ssh/known_hosts'
fatal: [staging.kangapestcontrol.com]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
...ignoring

TASK [deploy : Failed connection to remote repo] *******************************
System info:
Ansible 2.2.1.0; Darwin
Trellis at "Check Ansible version before Ansible validates task attributes"

Git repo git@gitlab.iteratemarketing.com:iteratemarketing/kanga-pest-
control.git cannot be accessed. Please verify the repository exists and you
have SSH forwarding set up correctly.
More info:

https://roots.io/trellis/docs/deploys/#ssh-keys
https://roots.io/trellis/docs/ssh-keys/#cloning-remote-repo-using-ssh-
agent-forwarding

fatal: [staging.kangapestcontrol.com]: FAILED! => {"changed": false, "failed": true}
to retry, use: --limit @/Users/spencerhill/Sites/kangapestcontrol.com/trellis/deploy.retry

PLAY RECAP *********************************************************************
localhost : ok=0 changed=0 unreachable=0 failed=0
staging.kangapestcontrol.com : ok=5 changed=0 unreachable=0 failed=1

➜ trellis git:(master) ✗`

Well, I guess that’s not the same error?

Ansible’s git module will try to add the git repo’s hostkey and maybe it’s struggling with the file mode 600 you posted above.

Try SSHing into the server (probably as the web user) and running
chmod 644 /home/web/.ssh/known_hosts
Then try the deploy again.

If that fails, see if you can git clone manually on the server, e.g., SSH into the server and run:

$ cd /tmp
$ git clone git@gitlab.iteratemarketing.com:iteratemarketing/kanga-pest-control.git

The error may become clear. If it ends up succeeding, try the deploy again.


Still no luck.

`root@kangapestcontrol:~# chmod 644 /home/web/.ssh/known_hosts
root@kangapestcontrol:~# cd /tmp
root@kangapestcontrol:/tmp# git clone git@gitlab.iteratemarketing.com:iteratemarketing/kanga-pest-control.git
Cloning into 'kanga-pest-control'...
The authenticity of host 'gitlab.iteratemarketing.com (50.112.185.53)' can't be established.
ED25519 key fingerprint is SHA256:moaEbSx88/SMsAmDAnJhfGPuxyKL69P1L236M0Fv2zA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'gitlab.iteratemarketing.com,50.112.185.53' (ED25519) to the list of known hosts.
Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
root@kangapestcontrol:/tmp#`

Ok, the manual git clone may not have worked because maybe your local ~/.ssh/config doesn’t have this:

# local machine ~/.ssh/config

Host staging.kangapestcontrol.com
  ForwardAgent true

But you won’t need that ~/.ssh/config entry for the deploy to work (Trellis ssh_args handle it). Retry the deploy now that you’ve done the chmod and likely have proper permissions.
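
For manual tests like the git clone above, one way to tell whether an agent was actually forwarded into the SSH session is to look at SSH_AUTH_SOCK on the server. The sketch below is only illustrative and the function name agent_forwarded is made up. It assumes the usual OpenSSH behavior of exposing a forwarded agent as a Unix socket whose path is in that variable.

```shell
#!/bin/sh
# Hypothetical check, meant to be run inside the SSH session on the server:
# a forwarded agent appears as a Unix socket whose path is in SSH_AUTH_SOCK.
agent_forwarded() {
  if [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]; then
    echo "agent socket present: $SSH_AUTH_SOCK"
  else
    echo "no agent socket; keys on your local machine cannot be used from here"
    return 1
  fi
}
```

If the socket is present, ssh-add -l should list the keys from your local machine. If it is missing, a Permission denied (publickey) from the git host is the expected outcome of a manual clone. Switching to root with sudo may also drop SSH_AUTH_SOCK, depending on the sudoers configuration, so it is probably safer to test as the user you logged in as.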


Okay, so I made a little progress.

I logged in as root, ran ssh-keygen, copied the key to my gitlab server and then cloned successfully.

I did the same thing with web@ and cloned successfully.

But I’m getting the same error when running the ansible command locally.

I also updated the .ssh/config. I did have it set to the IP instead of the hostname. And “Yes” instead of “true”.

Same result though. THANK YOU for your time. I’m out of time with this project so this is extremely appreciated.


Scratch that! I reran it after the steps above and it went through. Now there is a new error though. Grr… I’ll debug it and open a new topic if I can’t get past it.

In summary though, I think this whole situation would have been solved if I manually generated the SSH key while signed in as web user and then added that key to the GitLab server.

Thanks to everyone who helped.


Okay, so this is now a repeatable issue (for me anyway).

I encounter the original issue and can overcome it if:

  1. I SSH into the server as web@.
  2. Execute ssh-keygen -f "/home/web/.ssh/known_hosts" -R gitlab.EXAMPLE.com
  3. Then execute a git clone of my repository in the /tmp directory
  4. Then re-run the original command ansible-playbook deploy.yml -e env=staging -e site=staging.EXAMPLE.com

Is anyone else experiencing this issue?


To clarify, the steps of SSHing into the server to 1) try a git clone and 2) remove keys from known_hosts were for debugging a problem, as was appropriate in the context above. These steps are not standard and not required for typical Trellis use.

To get your environment all set up for Trellis, and to discover how once your environment is set up, Trellis typically just works…

I’d be very surprised if the issues earlier in this thread were to appear on a fresh Trellis clone without modifications. Maybe just plug in your server IP in hosts/production and change the repo in wordpress_sites to

repo: git@github.com:roots/roots-example-project.com.git

for the sake of testing.

Okay, so your best guess is that the issue is:

  1. Not a bug with Trellis.
  2. Not likely a problem with my environment.
  3. Probably a mistake in my config files somewhere?

Thanks!

Other Trellis users do not report this thread’s initial problem: do_known_hosts: hostkeys_foreach ... failed: Permission denied, so that bit is almost certainly not a Trellis bug (#1). Other issues that people report for known_hosts are typically due to lack of familiarity with the intricacies of SSH, SSH forwarding, known_hosts, etc., so maybe you could call that a config mistake or a user mistake (#3).

The suggestion to test using an uncustomized version of Trellis is to tease out whether there might be a problem with your environment (#2), which would be really nice to separate and solve first before attempting to address issues with your particular config/customizations (#3).

In the case of this specific thread, the issue seems centered around your non-default config of a private repo on GitLab, hence my suggestion to try the public GitHub repo roots/roots-example-project.com just for testing.
