Rote Install busted on Mac El Capitan

If I install the exact versions of the prerequisites on Mac OS (10.11.6) and just follow the directions in Installing Trellis and WordPress Sites, I do NOT get a running WP: it fails on the SSH connection when running the ansible-playbook. I did this all day using my modified site (my domain etc.) and then decided to try the example.com version in the repos…

well, here’s what happened…

`Frodo:Projects me$ mkdir example.com
Frodo:Projects me$ cd !$
cd example.com
Frodo:example.com me$ ansible --version
ansible 2.0.2.0
config file =
configured module search path = Default w/o overrides
Frodo:example.com me$ vagrant --version
Vagrant 1.8.5
Frodo:example.com me$ vagrant-bindfs --version
-bash: vagrant-bindfs: command not found
Frodo:example.com me$ git clone --depth=1 git@github.com:roots/trellis.git && rm -rf trellis/.git
Cloning into ‘trellis’…
remote: Counting objects: 218, done.
remote: Compressing objects: 100% (172/172), done.
remote: Total 218 (delta 5), reused 139 (delta 1), pack-reused 0
Receiving objects: 100% (218/218), 68.84 KiB | 0 bytes/s, done.
Resolving deltas: 100% (5/5), done.
Checking connectivity… done.
Frodo:example.com me$ git clone --depth=1 git@github.com:roots/bedrock.git site && rm -rf site/.git
Cloning into ‘site’…
remote: Counting objects: 33, done.
remote: Compressing objects: 100% (29/29), done.
remote: Total 33 (delta 1), reused 19 (delta 1), pack-reused 0
Receiving objects: 100% (33/33), 14.00 KiB | 0 bytes/s, done.
Resolving deltas: 100% (1/1), done.
Checking connectivity… done.
Frodo:example.com me$ cd trellis && ansible-galaxy install -r requirements.yml

  • downloading role ‘composer’, owned by geerlingguy
  • downloading role from https://github.com/geerlingguy/ansible-role-composer/archive/1.2.7.tar.gz
  • extracting composer to vendor/roles/composer
  • composer was installed successfully
  • downloading role ‘ntp’, owned by resmo
  • downloading role from https://github.com/resmo/ansible-role-ntp/archive/0.3.0.tar.gz
  • extracting ntp to vendor/roles/ntp
  • ntp was installed successfully
  • downloading role ‘logrotate’, owned by nickhammond
  • downloading role from https://github.com/nickhammond/ansible-logrotate/archive/fc3ea4.tar.gz
  • extracting logrotate to vendor/roles/logrotate
  • logrotate was installed successfully
  • downloading role ‘swapfile’, owned by kamaln7
  • downloading role from https://github.com/kamaln7/ansible-swapfile/archive/0.4.tar.gz
  • extracting swapfile to vendor/roles/swapfile
  • swapfile was installed successfully
  • downloading role ‘daemonize’, owned by geerlingguy
  • downloading role from {New Users can only PUT 4 LINKS in a POST. Sigh.}
  • extracting geerlingguy.daemonize to vendor/roles/geerlingguy.daemonize
  • geerlingguy.daemonize was installed successfully
  • downloading role ‘mailhog’, owned by geerlingguy
  • downloading role from {New Users can only PUT 4 LINKS in a POST. Sigh.}
  • extracting mailhog to vendor/roles/mailhog
  • mailhog was installed successfully
    [DEPRECATION WARNING]: The comma separated role spec format, use the yaml/explicit format
    instead…
    This feature will be removed in a future release. Deprecation warnings can be disabled
    by setting deprecation_warnings=False in ansible.cfg.
  • dependency geerlingguy.daemonize is already installed, skipping.
    Frodo:trellis me$ vagrant up
    Bringing machine ‘default’ up with ‘virtualbox’ provider…
    ==> default: Importing base box ‘bento/ubuntu-16.04’…
    ==> default: Matching MAC address for NAT networking…
    ==> default: Checking if box ‘bento/ubuntu-16.04’ is up to date…
    ==> default: Setting the name of the VM: example.dev
    ==> default: Fixed port collision for 22 => 2222. Now on port 2200.
    ==> default: Clearing any previously set network interfaces…
    ==> default: Preparing network interfaces based on configuration…
    default: Adapter 1: nat
    default: Adapter 2: hostonly
    ==> default: Forwarding ports…
    default: 22 (guest) => 2200 (host) (adapter 1)
    ==> default: Running ‘pre-boot’ VM customizations…
    ==> default: Booting VM…
    ==> default: Waiting for machine to boot. This may take a few minutes…
    default: SSH address: 127.0.0.1:2200
    default: SSH username: vagrant
    default: SSH auth method: private key
    default: Warning: Remote connection disconnect. Retrying…
    default:
    default: Vagrant insecure key detected. Vagrant will automatically replace
    default: this with a newly generated keypair for better security.
    default:
    default: Inserting generated public key within guest…
    default: Removing insecure key from the guest if it’s present…
    default: Key inserted! Disconnecting and reconnecting using new SSH key…
    ==> default: Machine booted and ready!
    ==> default: Checking for guest additions in VM…
    ==> default: Setting hostname…
    ==> default: Configuring and enabling network interfaces…
    ==> default: Exporting NFS shared folders…
    ==> default: Preparing to edit /etc/exports. Administrator privileges will be required…
    Password:
    ==> default: Mounting NFS shared folders…
    ==> default: Mounting shared folders…
    default: /vagrant => /Volumes/BigDisk/ExternalUsers/me/Projects/example.com/trellis
    ==> default: Bindfs seems to not be installed on the virtual machine, installing now
    ==> default: Creating bind mounts for selected devices
    ==> default: Creating bind mount from /vagrant-nfs-example.com to /srv/www/example.com/current
    ==> default: Updating /etc/hosts file on active guest machines…
    ==> default: Updating /etc/hosts file on host machine (password may be required)…
    ==> default: Running provisioner: ansible…
    default: Running ansible-playbook…

PLAY [WordPress Server: Install LEMP Stack with PHP 7.0 and MariaDB MySQL] *****

TASK [setup] *******************************************************************
System info:
Ansible 2.0.2.0; Vagrant 1.8.5; Darwin
Trellis at “Fix #639 - WP 4.6 compatibility: update WP-CLI to 0.24.1”

Failed to connect to the host via ssh.
fatal: [default]: UNREACHABLE! => {“changed”: false, “unreachable”: true}
to retry, use: --limit @/Volumes/BigDisk/ExternalUsers/me/Projects/example.com/trellis/dev.retry

PLAY RECAP *********************************************************************
default : ok=0 changed=0 unreachable=1 failed=0

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.
Frodo:trellis me$
`

Any ideas?

What happens if you run vagrant ssh?

It works fine. I become the vagrant user inside the VM.

But if I try
"ssh vagrant@localhost:2222"
(I know that isn't the exact form of the command; I used the correct one I found on SO someplace)

the ssh command fails in the shell.

I notice this in your output: Fixed port collision for 22 => 2222. Now on port 2200. That suggests you already had a Vagrant VM running on 2222, so the new VM was moved to 2200 and your manual ssh to 2222 failed. It could help to shut down any other Vagrant VMs that are running. Check for VMs using vagrant global-status.

Then run vagrant provision (or just vagrant up if you’ve destroyed the VM in question).
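
To ssh manually without guessing the port, you could read it from Vagrant instead. This is only a sketch: the function name forwarded_ssh_port is made up for illustration, and it assumes vagrant ssh-config prints a Port line (as it does in the sample output further down this thread).

```shell
# Hypothetical helper: reads `vagrant ssh-config` output on stdin and
# prints the host port that Vagrant forwards to the guest's port 22.
forwarded_ssh_port() {
  awk '$1 == "Port" { print $2; exit }'
}

# Usage, run from the trellis directory (private key path as shown by -vvvv):
#   port=$(vagrant ssh-config | forwarded_ssh_port)
#   ssh -p "$port" -i .vagrant/machines/default/virtualbox/private_key vagrant@127.0.0.1
```

If the printed port is not 2222, another VM was probably holding 2222 when this one booted.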


If the problem continues, it would be helpful to see the verbose debug output from Ansible. Pick one:

1. I think you’re on OS X so this command should do it:

ansible-playbook dev.yml -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory -vvvv

OR

2. temporarily insert
ansible.verbose = 'vvvv'
into your Vagrantfile between these two lines

Then run vagrant provision (or just vagrant up if you’ve destroyed the VM in question).


Could you also tell us whether you have any relevant configs in your ~/.ssh/config that might be affecting the SSH connection?

Okay, so there were three Vagrant VMs running, and I was able to destroy all three.

Then I went back to the unmodified example.com project. It failed as before:


$ /usr/local/bin/vagrant up
Bringing machine ‘default’ up with ‘virtualbox’ provider…
==> default: Importing base box ‘bento/ubuntu-16.04’…
==> default: Matching MAC address for NAT networking…
==> default: Checking if box ‘bento/ubuntu-16.04’ is up to date…
==> default: Setting the name of the VM: example.dev
==> default: Clearing any previously set network interfaces…
==> default: Preparing network interfaces based on configuration…
default: Adapter 1: nat
default: Adapter 2: hostonly
==> default: Forwarding ports…
default: 22 (guest) => 2222 (host) (adapter 1)
==> default: Running ‘pre-boot’ VM customizations…
==> default: Booting VM…
==> default: Waiting for machine to boot. This may take a few minutes…
default: SSH address: 127.0.0.1:2222
default: SSH username: vagrant
default: SSH auth method: private key
default:
default: Vagrant insecure key detected. Vagrant will automatically replace
default: this with a newly generated keypair for better security.
default:
default: Inserting generated public key within guest…
default: Removing insecure key from the guest if it’s present…
default: Key inserted! Disconnecting and reconnecting using new SSH key…
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM…
==> default: Setting hostname…
==> default: Configuring and enabling network interfaces…
==> default: Exporting NFS shared folders…
==> default: Preparing to edit /etc/exports. Administrator privileges will be required…
==> default: Mounting NFS shared folders…
==> default: Mounting shared folders…
default: /vagrant => /r/Projects/example.com/trellis
==> default: Bindfs seems to not be installed on the virtual machine, installing now
==> default: Creating bind mounts for selected devices
==> default: Creating bind mount from /vagrant-nfs-example.com to /srv/www/example.com/current
==> default: Updating /etc/hosts file on active guest machines…
==> default: Updating /etc/hosts file on host machine (password may be required)…
==> default: Running provisioner: ansible…
default: Running ansible-playbook…

PLAY [WordPress Server: Install LEMP Stack with PHP 7.0 and MariaDB MySQL] *****

TASK [setup] *******************************************************************
System info:
Ansible 2.0.2.0; Vagrant 1.8.5; Darwin
Trellis at “Fix #639 - WP 4.6 compatibility: update WP-CLI to 0.24.1”

Failed to connect to the host via ssh.
fatal: [default]: UNREACHABLE! => {“changed”: false, “unreachable”: true}
to retry, use: --limit @/r/Projects/example.com/trellis/dev.retry

PLAY RECAP *********************************************************************
default : ok=0 changed=0 unreachable=1 failed=0

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.


and so then, as I am on OS X, I tried
ansible-playbook dev.yml -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory -vvvv

and “bad dog, no biscuit”…

$ ansible-playbook dev.yml -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory -vvvv
Using /r/Projects/example.com/trellis/ansible.cfg as config file
Loaded callback output of type stdout, v2.0

PLAYBOOK: dev.yml **************************************************************
1 plays in dev.yml

PLAY [WordPress Server: Install LEMP Stack with PHP 7.0 and MariaDB MySQL] *****

TASK [setup] *******************************************************************
<127.0.0.1> ESTABLISH SSH CONNECTION FOR USER: vagrant
<127.0.0.1> SSH: EXEC ssh -C -vvv -o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s -o Port=2222 -o ‘IdentityFile="/r/Projects/example.com/trellis/.vagrant/machines/default/virtualbox/private_key"’ -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=vagrant -o ConnectTimeout=10 -o ControlPath=/r/.ansible/cp/ansible-ssh-%h-%p-%r 127.0.0.1 ‘/bin/sh -c ‘"’"’( umask 22 && mkdir -p “echo $HOME/.ansible/tmp/ansible-tmp-1471982159.81-201710437512138” && echo “echo $HOME/.ansible/tmp/ansible-tmp-1471982159.81-201710437512138” )’"’"’'
System info:
Ansible 2.0.2.0; Darwin
Trellis at “Fix #639 - WP 4.6 compatibility: update WP-CLI to 0.24.1”

Failed to connect to the host via ssh.
fatal: [default]: UNREACHABLE! => {“changed”: false, “unreachable”: true}
to retry, use: --limit @dev.retry

PLAY RECAP *********************************************************************
default : ok=0 changed=0 unreachable=1 failed=0

I don’t have a ~/.ssh/config file or directory

Nothing obvious stands out. I used the steps from initial post and can’t reproduce (OS X 10.11.6, Ansible 2.0.2.0, Vagrant 1.8.5).

I’m guessing it is some ssh config that differs on your control machine. You might try running the command with StrictHostKeyChecking disabled.

ansible-playbook dev.yml -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory -vvvv --ssh-extra-args="-o 'StrictHostKeyChecking no'"

If that works, it could indicate that your /etc/ssh/ssh_config has StrictHostKeyChecking yes. If so, you could change it to ask (typical default).


I’d be interested to see if Ansible’s connection to the VM succeeds if you use the exact ssh config output from running vagrant ssh-config. For example, my output is

Host default
  HostName 127.0.0.1
  User vagrant
  Port 2222
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /private/tmp/example.com/trellis/.vagrant/machines/default/virtualbox/private_key
  IdentitiesOnly yes
  LogLevel FATAL
  ForwardAgent yes

Try copying your vagrant ssh-config output into a new file ~/.ssh/config and edit/join the first two lines to look like this:

Host 127.0.0.1
  User vagrant
  Port 2222
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /private/tmp/example.com/trellis/.vagrant/machines/default/virtualbox/private_key
  IdentitiesOnly yes
  LogLevel FATAL
  ForwardAgent yes

Then run vagrant provision. If Ansible’s connection still fails, then I’m a bit stumped. If it succeeds, you can choose whether to just run with this ~/.ssh/config or try to keep digging into why it works and what you could adjust in your dev environment to no longer need the ~/.ssh/config.

One challenge is that the port could update, as we saw earlier/above, so the Port entry in this config could require a manual update at some point. Similarly, the IdentityFile could interfere with your future Vagrant VMs for 127.0.0.1 etc., so it might be nice to avoid this ~/.ssh/config.
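
If you'd rather not hand-edit the file, the join of the first two lines could be scripted. A rough sketch, assuming the vagrant ssh-config output has the same shape as mine above; the function name host_block_for_ip is hypothetical.

```shell
# Hypothetical helper: reads `vagrant ssh-config` output on stdin, drops the
# "Host default" line, and turns the HostName line into the Host pattern.
host_block_for_ip() {
  awk '
    $1 == "Host"     { next }                    # drop "Host default"
    $1 == "HostName" { print "Host " $2; next }  # "HostName 127.0.0.1" -> "Host 127.0.0.1"
    { print }                                    # keep every other line as-is
  '
}

# Usage, run from the trellis directory:
#   vagrant ssh-config | host_block_for_ip >> ~/.ssh/config
```

Since the Port and IdentityFile values are copied as they are right now, you'd need to regenerate the block if Vagrant later picks a different port.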

So I am trying a fresh install on a different Mac (machine2).

When I first started this, I think there was a much older version installed on the first Mac (machine1). Could old binaries be getting in the way via old PATH settings? Besides “vagrant”, what other top-level binaries (things that ship in the Vagrant dist) might be called while the ansible playbook runs?

Could I be calling vagrant (1.8.5) while other ancillary binaries come from an earlier Vagrant version?

(Oh, all this is because the latest ideas above did not work either. Same error on the SSH connection during the early part of the provisioning phase.)

So yes, on a “clean” MacBook Pro, the example.com version works. So I re-cloned and started on my project and lo and behold it all works fine (on machine2). Machine1 is still scrogged.

So it must be something related to SSH and/or ancillary binaries from earlier Vagrant versions that are still in the PATH?

I’m not familiar with the issue of problematic vagrant ancillary binaries. It sounds possible but isn’t something I’ve dealt with. Perhaps someone else can comment.

If you’re determined to use machine1, it seems that a thorough uninstall and reinstall (Vagrant and Ansible) would be a good place to start. Maybe you’ve tried already.

If you pursue this and resolve it, I hope you’ll report your solution.

Also remember to uninstall/reinstall Vagrant plugins. Not exactly sure if a Vagrant uninstall cleans those out.

And you can check in /etc/hosts to see if any old hosts entries weren’t cleaned up.

One minor thing: on Mac OS X, I needed to install passlib.

On both machine1 and machine2, the missing passlib caused an error during ‘vagrant up’.

Perhaps it should be a prereq in the instructions?

OKAY

The problem was my home directory on machine1, where all my Projects are held. Once all the .ansible stuff is appended to $HOME, the path is too long for a Unix domain socket name.

I was getting, at the end of my “vagrant up” or “vagrant provision”

==> default: Running provisioner: ansible…
default: Running ansible-playbook…

PLAY [WordPress Server: Install LEMP Stack with PHP 7.0 and MariaDB MySQL] *****

TASK [setup] *******************************************************************
System info:
Ansible 2.0.2.0; Vagrant 1.8.5; Darwin
Trellis at “Fix #639 - WP 4.6 compatibility: update WP-CLI to 0.24.1”

Failed to connect to the host via ssh.
fatal: [default]: UNREACHABLE! => {“changed”: false, “unreachable”: true}
to retry, use: --limit @/private/tmp/wordpress-tiogadigital.com/trellis/dev.retry

PLAY RECAP *********************************************************************
default : ok=0 changed=0 unreachable=1 failed=0

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

AND SO I tried to do a …

ansible-playbook dev.yml -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory -vvvv --ssh-extra-args="-o 'StrictHostKeyChecking no'"

and that would generate the same error. BUT It would also show me the ssh command it was trying to run:

<127.0.0.1> ESTABLISH SSH CONNECTION FOR USER: vagrant

<127.0.0.1> SSH: EXEC ssh -C -vvv -o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s -o Port=2222 -o ‘IdentityFile="/private/tmp/wordpress-tiogadigital.com/trellis/.vagrant/machines/default/virtualbox/private_key"’ -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=vagrant -o ConnectTimeout=10 -o ‘StrictHostKeyChecking no’ -o ControlPath=/ReallyReallyReallyReallyReallyReallyLongPathNameWITHBigDISKSANDSTUFF/.ansible/cp/ansible-ssh-%h-%p-%r 127.0.0.1 ‘/bin/sh -c ‘"’"’( umask 22 && mkdir -p “echo $HOME/.ansible/tmp/ansible-tmp-1472140876.0-237841704934245” && echo “echo $HOME/.ansible/tmp/ansible-tmp-1472140876.0-237841704934245” )’"’"’’

SO I COPIED that SSH command to the shell and after much digital spewing, I’d get this at the end of the failed SSH command…

...
debug1: Trying private key: /private/tmp/wordpress-tiogadigital.com/trellis/.vagrant/machines/default/virtualbox/private_key
debug3: sign_and_send_pubkey: RSA SHA256:eVk08+u0E5XjV18cANRyURTWnNgmBKGUq2apMnBMVAA
debug2: we sent a publickey packet, wait for reply
debug1: Enabling compression at level 6.
debug1: Authentication succeeded (publickey).
Authenticated to 127.0.0.1 ([127.0.0.1]:2222).
debug1: setting up multiplex master socket
debug3: muxserver_listen: temporary control path /ReallyReallyReallyReallyReallyReallyLongPathNameWITHBigDISKSANDSTUFF/.ansible/cp/ansible-ssh-127.0.0.1-2222-vagrant.pFWB58BSPaBLzwmM
unix_listener: "/ReallyReallyReallyReallyReallyReallyLongPathNameWITHBigDISKSANDSTUFF/.ansible/cp/ansible-ssh-127.0.0.1-2222-vagrant.pFWB58BSPaBLzwmM" too long for Unix domain socket

huh?

(!!!!) .... too long for Unix domain socket
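
For anyone wondering how long is too long, here is a rough way to estimate it. This is a sketch under assumptions: the function names are made up, the 103 limit assumes OS X's 104-byte sun_path (one byte is needed for the terminating NUL; Linux allows a few more), and the extra 17 characters come from the dot plus 16 random characters that ssh appended to the temporary control path in the debug output above.

```shell
# Longest socket path that fits, by assumption: 103 on OS X (104-byte sun_path).
MAX_SOCKET_PATH=${MAX_SOCKET_PATH:-103}

# control_socket_len DIR HOST PORT USER
# Prints the length of the temporary socket path ssh would try to create for
# ControlPath=DIR/ansible-ssh-%h-%p-%r
control_socket_len() {
  path="$1/ansible-ssh-$2-$3-$4"
  # add the "." plus 16 random characters seen in the muxserver_listen line
  echo $(( ${#path} + 17 ))
}

# control_socket_fits DIR HOST PORT USER
# Succeeds if the estimated path length is within the limit.
control_socket_fits() {
  [ "$(control_socket_len "$@")" -le "$MAX_SOCKET_PATH" ]
}

# Usage:
#   control_socket_fits "$HOME/.ansible/cp" 127.0.0.1 2222 vagrant || echo "too long"
```

With a short home directory like /Users/me the estimate comes to 73, comfortably under the limit; my long $HOME pushed it well past.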

So after some googling, I made ansible.cfg in my trellis directory look like this:

<<...other stuff in file snipped...>>

[ssh_connection]
control_path = %(directory)s/%%h-%%r
ssh_args = -o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s

See that control_path? It forces a shorter socket path, and then “good dog, have a chewie!”

yes, then I can vagrant provision and the ssh works fine.
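
To see what the new setting buys you: as I understand it, Ansible fills in %(directory)s with its control path directory ($HOME/.ansible/cp in the -vvvv output above) and turns each %% into a single %, so ssh receives a ControlPath ending in %h-%r and substitutes the host and remote user itself. The sketch below imitates just that last substitution; expand_control_path is a made-up name, and it only handles the %h, %p and %r tokens.

```shell
# expand_control_path TEMPLATE HOST PORT USER
# Hypothetical helper: prints TEMPLATE with ssh's %h, %p and %r tokens
# replaced by the given host, port and remote user.
expand_control_path() {
  printf '%s\n' "$1" | sed -e "s/%h/$2/g" -e "s/%p/$3/g" -e "s/%r/$4/g"
}

# Usage: compare the default template seen in the -vvvv output with the shorter one
#   expand_control_path "$HOME/.ansible/cp/ansible-ssh-%h-%p-%r" 127.0.0.1 2222 vagrant
#   expand_control_path "$HOME/.ansible/cp/%h-%r" 127.0.0.1 2222 vagrant
```

The shorter template drops the ansible-ssh- prefix and the port, which trims 17 characters off the socket path. If your home directory is even longer than mine, that might not be enough.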

SOLVED
