SSH Forwarding fails during provisioning, works on vagrant ssh

I merged the most recent Trellis into my project today and tried reprovisioning the server. However, when the process gets to the Install Dependencies with Composer task, it gives the warning “Do not run Composer as root/super user!”

It then fails pulling private repos, I am guessing because it tries to get my SSH key from under the root account instead of the vagrant account it has always used in the past. This was a working process until now, so I’m looking to find out why it’s suddenly trying to run composer on the vagrant box as root.

Please note, I did not destroy and recreate the vagrant box going from 14.04 to 16.04. In production, I am not going to have the option to change to 16.04 for a while, so I’d like local vagrant to be the same Ubuntu version as production.

There is nothing unusual about the task or the setup. It’s the default setup. I am a Windows user, but I will try with this same commit tomorrow on a Linux host.

Edit: see this post for the most current information.

dev.yml

---
- name: "WordPress Server: Install LEMP Stack with PHP 7.0 and MariaDB MySQL"
  hosts: web:&development
  become: yes
  remote_user: vagrant

composer task

- name: Install Dependencies with Composer
  command: composer install
  args:
    chdir: "{{ www_root }}/{{ item.key }}/{{ item.value.current_path | default('current') }}/"
  register: composer_results
  with_dict: "{{ wordpress_sites }}"
  changed_when: "'Nothing to install or update' not in composer_results.stderr"

I think Composer in development/Vagrant has always been run as root/super user unfortunately.

This must be an issue on Windows then. I just followed the same process on an Ubuntu host, and it worked just fine. I’ll update this thread when I find the answer.

So here’s a question. When trying to provision the local vagrant box, there’s an error pulling the private repo. It’s obvious from the error it’s trying to do it from the root account.

When I login with vagrant ssh, I am obviously on the vagrant account, and cloning the git repo works fine. This is all done with the SSH forwarding setup.

I tried another working project with vagrant reload --provision and it did in fact connect as user root; however, it didn’t have any problems with this step. It provisioned cleanly.

This is on the exact same machine. The differences are that 1) the failing project is a multisite, although the site has been working for about a year, and 2) it has the most current Trellis commit.

The last time I updated from Trellis for this site was about June.

==> default: TASK [wordpress-install : Install Dependencies with Composer] ******************
==> default: task path: /vagrant/roles/wordpress-install/tasks/main.yml:19
==> default: ESTABLISH LOCAL CONNECTION FOR USER: root
==> default: 192.168.50.5 EXEC /bin/sh -c '( umask 22 && mkdir -p "` echo $HOME/.ansible/tmp/ansible-tmp-1477016724.89-84162693693879 `" && echo "` echo $HOME/.ansible/tmp/ansible-tmp-1477016724.89-84162693693879 `" )'
==> default: 192.168.50.5 PUT /tmp/tmpanuyvQ TO /root/.ansible/tmp/ansible-tmp-1477016724.89-84162693693879/command
==> default: 192.168.50.5 EXEC /bin/sh -c 'LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1477016724.89-84162693693879/command; rm -rf "/root/.ansible/tmp/ansible-tmp-1477016724.89-84162693693879/" > /dev/null 2>&1'
==> default: System info:
==> default:   Ansible 2.0.2.0; Vagrant 1.8.1; Linux
==> default:   Trellis at "Enable per-site setup for permalink structure"
==> default: ---------------------------------------------------
==> default: Do not run Composer as root/super user! See https://getcomposer.org/root for
==> default: details
==> default: Loading composer repositories with package information
==> default:
==> default:
==> default:   [RuntimeException]
==> default:   Failed to execute git clone --mirror 'git@bitbucket.org:myuser
==> default: /private-repo.git' '/root/.composer/cache/vcs/git-bitbucket
==> default: .org-myuser-private-repo.git/'
==> default:   Cloning into bare repository '/root/.composer/cache/vcs/git-bitbucket.org-
==> default: myuser-private-repogit'...
==> default:   Warning: Permanently added the RSA host key for IP address '104.192.143.3'
==> default: to the list of known hosts.
==> default:   Error reading response length from authentication socket.
==> default:   Permission denied (publickey).
==> default:   fatal: Could not read from remote repository.
==> default:   Please make sure you have the correct access rights
==> default:   and the repository exists.

The core of this issue: when I vagrant ssh, SSH forwarding works and I can get to private repos. But during vagrant provisioning, which uses the root account, SSH forwarding is apparently not working.

However during boot up, it gives the message:

==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2222
    default: SSH username: vagrant
    default: SSH auth method: private key
==> default: Machine booted and ready!

I am seriously stumped. Is there any other way to trace the vagrant process besides -vvvv?
I suppose I can modify the Vagrantfile to use the old Ubuntu 14.04 box just so I can have consistency, and destroy my working dev box.

Edit: vagrant destroy didn’t fix the problem. There’s some sort of SSH authentication/forwarding issue going on ONLY during provisioning, not when I log in with vagrant ssh.

I don’t have a windows setup to test on, but here are some thoughts.

User: vagrant vs root

The composer install is running as root even though you see SSH username: vagrant during boot because the dev.yml playbook invokes sudo to perform its tasks and "the default value of become_user is root".

I haven’t tested or thought through all the implications, but you could try editing the composer install task to not run as the become_user by adding become: false to the task, e.g., above this line, like this:

 - name: Install Dependencies with Composer
   command: composer install
   args:
     chdir: "{{ www_root }}/{{ item.key }}/{{ item.value.current_path | default('current') }}/"
+  become: false
   register: composer_results
   with_dict: "{{ wordpress_sites }}"
   changed_when: "'Nothing to install or update' not in composer_results.stderr"
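To confirm which user a task really runs as, a throwaway playbook along these lines might help. This is an untested sketch, not part of Trellis: the play name, task names, and the registered variables as_become and as_remote are made up for illustration. Note that if Ansible itself is started as root over a local connection (as the "ESTABLISH LOCAL CONNECTION FOR USER: root" line in the log suggests), both tasks may well report root, in which case become: false alone would not change anything.

```yaml
---
# Hypothetical diagnostic playbook: compare the effective user with and without become.
- name: Show which user tasks run as
  hosts: all
  remote_user: vagrant
  become: yes          # play-level become; become_user defaults to root
  tasks:
    - name: Report user with play-level become
      command: whoami
      register: as_become
      changed_when: false

    - name: Report user with become disabled for this task
      command: whoami
      become: false    # task-level override of the play setting
      register: as_remote
      changed_when: false

    - name: Print both results
      debug:
        msg: "with become: {{ as_become.stdout }}, without become: {{ as_remote.stdout }}"
```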

Edit: Now I’ve tested become: false. I’m running OS X as the host machine. With a private repo in my composer.json, the composer install task hangs, not printing this message:

The authenticity of host 'bitbucket.org (104.192.143.2)' can't be established.
RSA key fingerprint is SHA256:zzXQOXSRBEiUtuE8AikJYKwbHaxvSc0ojez9YXaGp1A.
Are you sure you want to continue connecting (yes/no)?

If I vagrant ssh and add bitbucket.org to my known_hosts, the composer install works on the next vagrant provision and the plugins appear to work. So, this approach of updating the known_hosts on the vm and adding become: false shows some promise. I’d love to hear if it works for you. (/edit)
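Rather than adding the host key by hand, the same step could probably be done with a task placed before the composer install task. This is an untested sketch: it uses Ansible's known_hosts module and the pipe lookup to run ssh-keyscan, and the task name is my own. The pipe lookup runs on the machine where Ansible runs, and the key is accepted without checking its fingerprint, so it is worth comparing against the fingerprint Bitbucket publishes.

```yaml
# Hypothetical task: pre-seed known_hosts so git never stops at the yes/no prompt.
- name: Add bitbucket.org to known_hosts
  known_hosts:
    name: bitbucket.org
    # ssh-keyscan prints "bitbucket.org ssh-rsa AAAA..." which is the format the module expects.
    key: "{{ lookup('pipe', 'ssh-keyscan -t rsa bitbucket.org') }}"
    state: present
  become: false        # write to the connecting user's ~/.ssh/known_hosts, not root's
```

With become: false on both this task and the composer install task, the host key lands in the known_hosts file of the same user that later runs composer.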

Private repo

Your machine has one Trellis project that works and one that fails. The failing project’s composer install attempts to clone a private repo. Does the successful project also have to clone a private repo, and does it succeed? Or perhaps the presence/absence of a private repo in the composer.json is a relevant difference between the projects?

SSH forwarding

The following two ideas are probably not relevant if in fact your successful project does clone a private repo:

  • Things work when you vagrant ssh and clone the repo, but not when you vagrant provision. That made me wonder if this was relevant: "vagrant ssh execs out to the actual OpenSSH ssh client. vagrant provision uses net-ssh".
  • The thread linked in the bullet above mentions some issues Windows users probably need to be aware of with ssh-pageant, perhaps already dealt with in the Trellis Windows docs re: SSH Forwarding.
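One way to see whether the forwarded agent is reachable during provisioning (as opposed to during vagrant ssh) would be a temporary diagnostic task like the one below. This is an untested sketch with a made-up task name and a made-up agent_check variable. It prints the current user and SSH_AUTH_SOCK, then lists the keys the agent offers. sudo commonly resets the environment, so an unset SSH_AUTH_SOCK under root would not be surprising.

```yaml
# Hypothetical diagnostic tasks: remove them once the problem is understood.
- name: Check the forwarded SSH agent during provisioning
  shell: 'echo "user=$(whoami) SSH_AUTH_SOCK=${SSH_AUTH_SOCK:-unset}"; ssh-add -l 2>&1'
  become: false
  register: agent_check
  changed_when: false   # read-only check, never reports a change
  failed_when: false    # ssh-add exits non-zero when no agent or no keys are found

- name: Print the agent diagnostic
  debug:
    var: agent_check.stdout_lines
```

Running the same two commands by hand after vagrant ssh gives a baseline to compare against.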

I suppose if it comes down to it, you could copy your private SSH key onto the VM. I get the sense that some Windows users may be doing that with their Trellis projects.

Vagrant version

I noticed you’re running Vagrant 1.8.1 whereas the current Trellis requirement is Vagrant 1.8.5. I realize you’re trying to keep the VM on Ubuntu 14.04, but you may end up having to back up the DB, update Vagrant, and destroy and rebuild the VM. I understand you may adjust the Vagrantfile to keep the VM on 14.04 to match your production server, but that may complicate the community’s ability to help debug (a forked version of) the project.

Thanks for the response. It’s mostly working, so at this point I say good enough. Some notes:

  • My working project does in fact have the exact same private repo in it, and it’s working fine when reprovisioning, composer update, etc.
  • I rolled back to Vagrant 1.8.1 to see if that was the issue when I first upgraded to current Trellis. It wasn’t; I am back on 1.8.6.
  • I did not have pageant.exe on my computer; everything worked fine without it. I tried adding it and loading my key into it. Still getting the same error.
  • I also added become: false in the location you indicated. Still not working.
  • After vagrant destroy and then vagrant up, the process still failed on that step. When I logged in with vagrant ssh and ran composer install, it FINALLY asked me to add bitbucket.org to known hosts (after sitting there for 2 minutes).

I exited, then ran vagrant reload --provision and it came up with no issues. Then I fast-forwarded git to the commit that has all the new trellis updates, and ran vagrant reload --provision again. No issues.

So I have NO idea why it started doing this, or why I have to take all these crazy extra steps on just this ONE project, but it looks like I have all the updates current.

Thanks @fullyint

I was getting the “Do not run Composer as root/super user!” error on OSX Sierra. Adding become: false got rid of the error and I have a local dev environment again. Now onto staging!

The “Do not run Composer as root/super user” warning is now fixed in roots/trellis#694.

Turns out, there is a known issue with net-ssh on Windows. You can find the details here

The vagrant community was made aware of the issue in 2015 [thread here]. Despite the fact that they’re kinda banking on a ruby fix, there are workarounds mentioned in the thread (i.e. use Pageant instead of ssh-agent).

Hope this helps someone.