I’ve identified one additional cause for this error under Windows. It happens when the group_vars/all/users.yml file specifies a local public key using a ~ path to the user’s home directory, e.g. ~/.ssh/keyname_rsa.pub, and a vagrant provision is attempted.
Under Windows, provisioning goes through Vagrant’s shell provisioner, which runs its script as root by default. So when the Ansible fail2ban role is executed and in turn runs the users-related lookup, it looks for the public key in /root/.ssh/ instead of /home/vagrant/.ssh/ and fails with the following error:
==> default: fatal: [127.0.0.1] => Failed to template {{ lookup('file', '~/.ssh/id_rsa_vagrant.pub') }}: could not locate file in lookup: /.ssh/id_rsa_vagrant.pub
id_rsa_vagrant.pub is the public key I use to deploy to production, and I’ve put it in /home/vagrant/.ssh/, as Ansible commands have to be run from the VM under Windows.
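To illustrate why the same ~ path ends up in two different places: tilde expansion depends on the home directory of whichever user runs the process. Below is a minimal Ruby sketch of that idea. I use Ruby only because the Vagrantfile is Ruby; Ansible does its own expansion in Python, and expand_as_home is a hypothetical helper, not part of Vagrant or Ansible. It assumes a Unix-like system such as the Ubuntu VM.

```ruby
# Hypothetical helper: expand a ~ path as if HOME were set to `home`.
# File.expand_path resolves a leading ~ using the HOME environment variable.
def expand_as_home(home, path)
  saved = ENV['HOME']
  ENV['HOME'] = home
  File.expand_path(path)
ensure
  # Always restore the caller's HOME, even if expansion raises.
  ENV['HOME'] = saved
end

key = '~/.ssh/id_rsa_vagrant.pub'

# Script run as root (privileged provisioner):
puts expand_as_home('/root', key)          # => /root/.ssh/id_rsa_vagrant.pub

# Script run as the vagrant user (unprivileged provisioner):
puts expand_as_home('/home/vagrant', key)  # => /home/vagrant/.ssh/id_rsa_vagrant.pub
```

The key file only exists under /home/vagrant/.ssh/, so a lookup that expands ~ for root cannot find it.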
I fixed the error by editing my Vagrantfile, adding sh.privileged = false in the following section:
if Vagrant::Util::Platform.windows?
  config.vm.provision :shell do |sh|
    sh.path = File.join(ANSIBLE_PATH, 'windows.sh')
    # Fixes vagrant provision under Windows
    sh.privileged = false
  end
else
I discovered the source of the error message by comparing the verbose output under Windows and Linux.
Running > vagrant provision from Windows (note the PUT...TO /root...):
[...]
==> default: <127.0.0.1> PUT /tmp/tmpNPQtoS TO /root/.ansible/tmp/ansible-tmp-1443889992.18-46406665813175/command
==> default: <127.0.0.1> EXEC /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=ydznmgydwcersrmotyvypfgovohgxbtg] password: " -u root /bin/sh -c '"'" 'echo BECOME-SUCCESS-ydznmgydwcersrmotyvypfgovohgxbtg; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1443889992.18-46406665813175/command; rm -rf /root/.ansible/tmp/ansible-tmp-1443889992.18-46406665813175/ >/dev/null 2>&1'"'"''
==> default: ok: [127.0.0.1] => {"changed": false, "cmd": ["cat", "/etc/timezone"], "delta": "0:00:00.010372", "end": "2015-10-03 16:33:12.459853", "rc": 0, "start": "2015-10-03 16:33:12.449481", "stderr": "", "stdout": "Etc/UTC", "stdout_lines": ["Etc/UTC"], "warnings": []}
==> default:
==> default: TASK: [common | Set timezone] *************************************
[...]
Running $ windows.sh successfully from within the Vagrant Ubuntu VM (note the PUT...TO /home/vagrant...):
[...]
<127.0.0.1> PUT /tmp/tmpf4qg2_ TO /home/vagrant/.ansible/tmp/ansible-tmp-1443890971.16-245571724493675/command
<127.0.0.1> EXEC /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=nbhgweenwftolrdratizkytomjhxqgjk] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-nbhgweenwftolrdratizkytomjhxqgjk; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /home/vagrant/.ansible/tmp/ansible-tmp-1443890971.16-245571724493675/command; rm -rf /home/vagrant/.ansible/tmp/ansible-tmp-1443890971.16-245571724493675/ >/dev/null 2>&1'"'"''
ok: [127.0.0.1] => {"changed": false, "cmd": ["cat", "/etc/timezone"], "delta": "0:00:00.010763", "end": "2015-10-03 16:49:31.419609", "rc": 0, "start": "2015-10-03 16:49:31.408846", "stderr": "", "stdout": "Etc/UTC", "stdout_lines": ["Etc/UTC"], "warnings": []}
TASK: [common | Set timezone] *************************************************
[...]
As an addendum, am I configuring local public keys incorrectly? After thinking about this issue, it occurred to me that destroying the VM and starting over with vagrant up will break provisioning. The fail2ban role will execute the users-related lookup, which will fail because I won’t have had the chance to copy the production deployment public key over to the VM’s /home/vagrant/.ssh/ directory.
This key is only meant for production deployment though, and shouldn’t affect an initial vagrant up, so it doesn’t seem correct to define it in the “all” group_vars directory. To have it apply only to the “production” group_vars, I’d have to duplicate the whole users object, which would complicate future maintenance.