If you are on macOS, try importing your SSH key passphrase into Keychain by running ssh-add -K.
Otherwise, ssh-agent forgets the key whenever it restarts, such as after a reboot.
Also make sure you understand that Trellis uses SSH agent forwarding to connect to your git repository, as described in the docs. There is also a section on GitHub with a few tips and tricks for further troubleshooting if agent forwarding does not work.
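Before digging into forwarding itself, it can help to confirm that the local agent actually holds a key to forward. The sketch below is only an illustration: describe_agent_status and check_agent are hypothetical helper names, and it relies on the documented exit statuses of ssh-add -l (0 when keys are listed, 1 when the agent has no identities, 2 when the agent cannot be contacted).

```shell
# Map the exit status of `ssh-add -l` to a readable message.
describe_agent_status() {
  case "$1" in
    0) echo "agent running, keys loaded" ;;
    1) echo "agent running, no keys loaded" ;;
    2) echo "cannot contact ssh-agent" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Ask the local agent for its keys and describe the result.
check_agent() {
  status=0
  ssh-add -l >/dev/null 2>&1 || status=$?
  describe_agent_status "$status"
}

# Example (not run automatically):
#   check_agent
```

If keys are loaded locally, running ssh -A web@your-server 'ssh-add -l' (your-server is a placeholder) should list the same keys on the server when forwarding works.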
@yyyyaaa Here is a possible explanation for why the task is hanging:
The git clone task will cause your server to reach out to the git.conceptual.site host. The Ansible docs for the related git module mention this:
If the task seems to be hanging, first verify remote host is in known_hosts. SSH will prompt user to authorize the first contact with a remote host.
The implication is that if your server doesn’t already “know” the git.conceptual.site host, the task could be hanging at the prompt The authenticity of host ... can't be established. Are you sure you want to continue connecting (yes/no)? Ansible doesn’t show you that prompt, so the task just hangs, and you can’t type yes because it is not an interactive session.
This usually doesn’t happen on the git clone task because the Trellis default is accept_hostkey: true (because of the default repo_accept_hostkey: true), avoiding the prompt that would cause the task to hang.
Good news: Trellis gives you a means to make the git.conceptual.site host known ahead of time, avoiding all the trouble above. You could add git.conceptual.site to group_vars/all/known_hosts.yml. The ssh-keyscan command may help you find the key to add.
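One detail worth checking when the Git host listens on a non-default SSH port (the repo URL reported in this thread uses port 2222): the host key served on that port may differ from the one on port 22, and SSH records such hosts in known_hosts as [host]:port. Here is a rough sketch, assuming that port is the one to scan. known_hosts_pattern and scan_host_keys are hypothetical helper names, and the scan itself needs network access.

```shell
# Print the host pattern SSH uses in known_hosts for a host and port.
# Port 22 is written as the bare host name, any other port as "[host]:port".
known_hosts_pattern() {
  host="$1"
  port="${2:-22}"
  if [ "$port" = "22" ]; then
    printf '%s\n' "$host"
  else
    printf '[%s]:%s\n' "$host" "$port"
  fi
}

# Fetch the public host keys offered on the given port.
scan_host_keys() {
  host="$1"
  port="${2:-22}"
  ssh-keyscan -p "$port" "$host" 2>/dev/null
}

# Example (not run automatically):
#   known_hosts_pattern git.conceptual.site 2222
#   scan_host_keys git.conceptual.site 2222
```

If the keys from a plain ssh-keyscan of the host do not work, comparing them against the output for the port in the repo URL may show whether a different key is being served there.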
Hi @fullyint, sorry for the late reply, but I’m using an older version of Trellis, so it doesn’t have that known_hosts file. How should I continue to solve this? If I have to upgrade Trellis to the newest version, what is the easiest way to do it?
If you haven’t tried it already, you may gain more info following the docs’ troubleshooting/debugging advice:
SSH into your server and manually run the command where Ansible failed.
Example: if a Git clone task failed during deploys, then SSH into the server as the web user (which is what deploys use) and run the manual command such as git clone . This will give you a much better clue as to what’s going wrong.
Running git clone on the server may reveal the problem, or may simply prompt you to accept the host key, after which deploys may work fine.
Ultimately you’ll want to update Trellis. For keeping Trellis updated, some keep their Trellis separate from Bedrock and Sage (basic idea). Others maintain a project combining Trellis, Bedrock, and Sage, using subtrees or cherry-picking commits (recommended).
I think a lot of people just update Trellis manually, grabbing the latest files from upstream master, then 1) pasting in upstream files that they haven’t customized in their local project, then 2) identifying and incorporating updates in files they have modified (like group_vars files), then 3) committing the changes.
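For the manual route, a recursive diff between a fresh upstream checkout and your project's Trellis directory gives a quick list of files to look at. This is only a sketch: compare_with_upstream is a hypothetical helper name and both directory paths in the example are placeholders.

```shell
# List files that differ between an upstream Trellis checkout and a local copy.
# diff exits with status 1 when it finds differences, which is expected here,
# so that status is deliberately ignored.
compare_with_upstream() {
  upstream_dir="$1"
  local_dir="$2"
  diff -rq "$upstream_dir" "$local_dir" || true
}

# Example (not run automatically, paths are placeholders):
#   compare_with_upstream ~/src/trellis-upstream ~/sites/example.com/trellis
```

Files reported as existing only in the upstream directory are new and can usually be copied over. Files reported as differing need a closer look, since diff alone cannot tell whether the change came from upstream or from your own customization. Identical files are not listed.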
Hi @fullyint, would you mind if I follow up with another question? (I’m in the process of doing the steps you suggested.) I have updated Trellis to the latest version. Upon running vagrant up --provision, it stopped at this role with an SSL error:
TASK [geerlingguy.daemonize : Download daemonize archive.] *********************
System info:
Ansible 2.2.1.0; Vagrant 1.9.0; Darwin
Trellis at "Check Ansible version before Ansible validates task attributes"
---------------------------------------------------
Failed to validate the SSL certificate for github.com:443. Make sure your
managed systems have a valid CA certificate installed. You can use
validate_certs=False if you do not need to confirm the servers identity but
this is unsafe and not recommended. Paths checked for this platform:
/etc/ssl/certs, /etc/pki/ca-trust/extracted/pem, /etc/pki/tls/certs,
/usr/share/ca-certificates/cacert.org, /etc/ansible
fatal: [default]: FAILED! => {"changed": false, "failed": true}
I have googled for 20 minutes but haven’t been able to find an explanation or a viable solution. I would appreciate it very much if you could help.
Failed to validate the SSL certificate for... errors are very often connectivity issues that resolve themselves if you try later (example), maybe 30-60 minutes later. Sometimes you can regain connectivity faster by changing your IP (enable a VPN, go to a coffee shop, etc.).
Such connectivity issues simply happen sometimes, but seem more frequent with dependencies pulling from github.com after you’ve been provisioning your VM multiple times in a short period (i.e., potentially a rate-limiting thing from GitHub).
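If you would rather not watch the clock, a small retry wrapper with a long pause between attempts is one way to "try later" automatically. This is a generic sketch, not something Trellis provides: retry is a hypothetical helper, and the attempt count and delay in the example are arbitrary.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or ATTEMPTS runs have failed,
# sleeping DELAY_SECONDS between runs.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while true; do
    "$@" && return 0
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example (not run automatically): up to 3 attempts, 30 minutes apart.
#   retry 3 1800 vagrant up --provision
```

A long delay matters here: if rate limiting is involved, retrying in quick succession could make things worse rather than better.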
Yes, that issue was due to connectivity. I have tried all the keys from running
ssh-keyscan git.conceptual.site
but none of them worked for me. I have provisioned the server successfully, but while running the deploy script I get this error:
Git repo ssh://git@git.conceptual.site:2222/diffusion/1/igs.git cannot be
accessed. Please verify the repository exists and you have SSH forwarding set
up correctly.
What could be wrong? I have my local id_rsa.pub set up correctly, and when I checked SSH with this command it showed no error (we use Phabricator to host projects): echo {} | ssh git@git.conceptual.site -p 2222 conduit conduit.ping
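Note that the conduit.ping check above runs from the local machine, while the deploy needs the server to reach the repo through the forwarded agent. A way to test that path is to run git ls-remote against the same URL from the server. The sketch below is an assumption about how one might script it: repo_host_port and check_repo_access are hypothetical helpers, and the first one just pulls the host and port out of an ssh:// URL so they can be reused for host key checks.

```shell
# Extract host and port from an ssh:// repo URL. Prints "host port".
repo_host_port() {
  url="$1"
  hostport="${url#ssh://}"     # drop the scheme
  hostport="${hostport#*@}"    # drop "user@" if present
  hostport="${hostport%%/*}"   # drop the path
  case "$hostport" in
    *:*) echo "${hostport%%:*} ${hostport##*:}" ;;
    *)   echo "$hostport 22" ;;
  esac
}

# Succeeds if the repo can be read from the machine where this runs.
check_repo_access() {
  git ls-remote "$1" >/dev/null
}

# Example (not run automatically), on the server after connecting with ssh -A:
#   check_repo_access ssh://git@git.conceptual.site:2222/diffusion/1/igs.git
```

If the check works locally but fails on the server, the problem is more likely agent forwarding or an unknown host key on the server than the key itself.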
Just an update: having a version of Ansible that was installed with Homebrew, as opposed to pip, can also cause this error. As of today, it looks like Ansible 2.4.2 is the version we want, or at least version 2.4.
I think it’s basically brew uninstall ansible and pip install ansible==2.4.2.
which ansible
which python (needs to be 2 and not 3 until later this year, I read)
which openssl
Might also be insightful. For me, I’m using the versions located in /usr/local/bin, which is in my $PATH before local/bin, so those commands find binaries there before getting to the system versions, which are probably located in /usr/bin.
I’m not quite clear on how or why exactly the brewed Python pip installs one version of openssl, while the brew-installed openssl is different.
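To make the PATH ordering check concrete, here is a small sketch. path_precedes and show_tool_paths are hypothetical helper names, and the directories in the example are just the ones mentioned above.

```shell
# path_precedes FIRST SECOND [PATH_STRING]
# Succeeds if FIRST appears before SECOND in the colon-separated PATH_STRING
# (defaults to $PATH). The body runs in a subshell, so changing IFS and
# calling exit here does not affect the calling shell.
path_precedes() (
  IFS=:
  for dir in ${3:-$PATH}; do
    [ "$dir" = "$1" ] && exit 0
    [ "$dir" = "$2" ] && exit 1
  done
  exit 1
)

# Print which binary each tool name resolves to first.
show_tool_paths() {
  for tool in ansible python openssl; do
    printf '%s: %s\n' "$tool" "$(command -v "$tool" || echo 'not found')"
  done
}

# Example (not run automatically):
#   path_precedes /usr/local/bin /usr/bin && echo "/usr/local/bin wins"
#   show_tool_paths
```

If show_tool_paths reports binaries from different directories than you expect, that mismatch is a good first thing to rule out.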