Roots Discourse

Deployment fails due to failed connection to Git repo

Hello,

I have been scouring the web for any information and solutions I could find about this issue.

I haven’t been able to find a working solution, and I have tried everything, including the questionable things, suggested across threads and blog posts.

Here’s a checklist of everything I have tried, along with how my setup looks:

  • Provisioning runs without problems: no warnings, no errors.
  • Deployment works fine for staging but fails on production.
  • SSH Forwarding is enabled on both the box (AllowAgentForwarding yes in /etc/ssh/sshd_config) and the local machine (ForwardAgent yes in ~/.ssh/config)
  • SSH Agent (and forwarding) is running and working. I can successfully complete every check in GitHub’s guides on this.
  • Keys are added to the SSH Agent (ssh-add -K), and listing the agent’s keys shows they are there.
  • Keys are added to the box’s ~/.ssh/authorized_keys, followed by a restart.
  • I can SSH into the box as both root and www-data (web user) with no problem.
  • Keys added to GitHub (my own account), and visible from the .keys endpoint.
  • I have full access to the organization and the private repo in GitHub.
  • I don’t have any problems pushing, pulling or cloning from/to the private remote GitHub repo, using SSH.
  • Deployment has worked before, and I haven’t changed anything since the last deploy.

My SSH config for the problematic host:

Host some.domain
HostName xxx.xx.xxx.xx
User someuser
IdentityFile ~/.ssh/my_key
AddKeysToAgent yes
UseKeychain yes
IdentitiesOnly yes
ForwardAgent yes
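
For anyone wanting to repeat the forwarding checks from the list above in one step, here is a minimal sketch. The function name check_forwarding is made up for illustration; it assumes the host alias from the SSH config above and that the remote box can reach github.com. Note that ssh -T git@github.com exits with a non-zero status even when authentication succeeds, so read the greeting it prints rather than the exit code.

```shell
# Hypothetical helper: confirm that the forwarded agent is usable on the box.
# $1: SSH host alias from ~/.ssh/config (e.g. some.domain)
check_forwarding() {
  # -A forces agent forwarding for this one connection.
  # On the remote side, ssh-add -l lists the keys visible through the
  # forwarded agent, and ssh -T tests authentication against GitHub.
  ssh -A "$1" 'ssh-add -l; ssh -T git@github.com'
}

# Usage (not run automatically):
#   check_forwarding some.domain
```

If ssh-add -l on the remote side reports that it cannot connect to the agent, forwarding is not reaching the box. If it lists the key and GitHub still rejects it, the problem is more likely on the GitHub side.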

Extra debug output (-vvvv) doesn’t show any more information about this error than is already shown without verbose logging.

A log can be seen in this gist

Versions

  • Ansible 2.4.3.0
  • Trellis 1.0.2

Have you tried the suggestions in the thread “Can deploy to production, but not staging”?

Went through the thread and none of the proposed solutions have worked for us.

So I seem to have found the reason why this failed.

After changing the no_log setting in roles/deploy/tasks/update.yml to false so that the task output was no longer hidden, I discovered that the problem happened when it tried to fetch the latest changes from the remote git repository. The error was:

remote: error: insufficient permission for adding an object to repository database ./objects
remote: fatal: failed to write object
error: unpack failed: unpack-objects abnormal exit

After digging into what could cause this, it seems that somehow the user didn’t have permission to write to .git in /srv/www/<site>/shared/source on the remote server.
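
One way to look for the cause is to list anything under .git that is owned by a different user, since this error often shows up after git was run there as another user (root, for example). That is an assumption on my part, not something confirmed in this thread. The sketch below uses a made-up function name, list_foreign_owned, and relies on find's -user test, which also accepts a numeric user ID.

```shell
# Hypothetical helper: print every entry under a directory that is NOT
# owned by the given user.
# $1: directory to inspect, e.g. the .git directory inside shared/source
# $2: user name or numeric UID (optional, defaults to the current user's UID)
list_foreign_owned() {
  find "$1" ! -user "${2:-$(id -u)}" -print
}

# Usage on the remote server, run as the deploy user (not run automatically):
#   list_foreign_owned "/srv/www/<site>/shared/source/.git"
```

Empty output means every file belongs to the user running the check, in which case ownership is probably not the problem and group or mode bits are worth checking next.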

After going into the source directory and manually running git fetch --tags origin and deploying, everything seemed to work.

But that still raises the question of how this could have happened in the first place.