Yes, this may be the case. Do you have console access through your DO account? If so, check the logs to see whether your IP has been banned because of too many failed SSH connection attempts. I had a similar issue with failing SSH connections after Trellis changed the host key algorithm (to something safer): my SSH client had already stored the old host key and rejected the new one.
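To make the "check the logs" step concrete, here is a rough Python sketch that scans fail2ban log lines and reports which IPs are still banned. It assumes the log lines contain entries of the form `[jail] Ban <ip>` and `[jail] Unban <ip>`; the function name `currently_banned` and the log path in the comment are my own assumptions, so check them against what your server actually writes.

```python
import re

# Assumed line shape: "... NOTICE [sshd] Ban 203.0.113.7" (jail name may differ).
BAN_RE = re.compile(r"\[(?P<jail>[^\]]+)\]\s+(?P<action>Ban|Unban)\s+(?P<ip>\S+)")


def currently_banned(log_lines):
    """Return the set of IPs that were banned and not unbanned afterwards."""
    banned = set()
    for line in log_lines:
        match = BAN_RE.search(line)
        if match is None:
            continue  # not a Ban/Unban entry
        if match.group("action") == "Ban":
            banned.add(match.group("ip"))
        else:
            banned.discard(match.group("ip"))
    return banned


# Hypothetical usage on the server (path is an assumption, adjust as needed):
# with open("/var/log/fail2ban.log") as handle:
#     print(currently_banned(handle))
```

If your own public IP shows up in the result, the UNREACHABLE errors are most likely a ban rather than a key problem.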
I have exactly the same problem, except that the server is on Hetzner and not DO. Provisioning fails at exactly the same task, and SSH also reports Connection refused for a couple of minutes afterwards.
SSH generally works (deployment also works, probably because it doesn't use root access), so the SSH keys should be OK. I've never had this before.
I was wondering if fail2ban is interfering here in some way?
OK, I see that fail2ban uses the ip_whitelist list for fail2ban_ignoreip. And ip_whitelist includes ipify_public_ip, which is probably the public IP you have at the time of provisioning.
I guess it’s quite common that I’ll have a different IP when re-provisioning. So I’m wondering: Did re-provisioning from a different IP ever work?
30 minutes later:
Well, I manually edited /etc/fail2ban/jail.local on the server and added my current IP to ignoreip, then ran systemctl restart fail2ban (I forgot this the first time, which cost me another 10 minutes). After that, re-provisioning works just fine.
I guess I could write a script that does that for me before every re-provisioning.
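A minimal sketch of what such a script could look like, in Python. It assumes jail.local has an uncommented `ignoreip = ...` line with space-separated entries; the function name `add_ignoreip` is made up for illustration. It only rewrites the text, so reading and writing the file and restarting fail2ban are left to the caller.

```python
import re


def add_ignoreip(config_text: str, ip: str) -> str:
    """Append ip to the first uncommented ignoreip line, if it is not there yet."""
    lines = config_text.splitlines()
    for index, line in enumerate(lines):
        match = re.match(r"^(\s*ignoreip\s*=\s*)(.*)$", line)
        if match is None:
            continue
        entries = match.group(2).split()  # assumes space-separated entries
        if ip not in entries:
            entries.append(ip)
        lines[index] = match.group(1) + " ".join(entries)
        return "\n".join(lines) + "\n"
    # Refuse to guess which section a new ignoreip line would belong to.
    raise ValueError("no ignoreip line found")


# Hypothetical usage as root on the server:
# path = "/etc/fail2ban/jail.local"
# with open(path) as handle:
#     updated = add_ignoreip(handle.read(), "203.0.113.7")
# with open(path, "w") as handle:
#     handle.write(updated)
# ...followed by: systemctl restart fail2ban
```

Note that the next provisioning run will probably regenerate jail.local from ip_whitelist, so this is a stopgap rather than a real fix.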
But seriously, this cannot be the solution, can it?
Yesterday I provisioned and deployed one of my sites from my laptop at home to a DO droplet without any problems. Today at work, with the same laptop and SSH keys, provisioning fails.
As soon as I try to (re-)provision (with sshd_permit_root_login set to false), fail2ban bans my work IP address, resulting in the UNREACHABLE error. After waiting for 10 minutes, I run ssh admin@droplet-ip directly, which works, then add the IP address to the fail2ban ip_whitelist and restart fail2ban.
Now provisioning also works from my IP address.
The question is: why does fail2ban ban my IP address in the first place?
I’m not a fail2ban expert, but I guess the way it is configured in Trellis (with the ssh service), it only allows SSH root access from IPs in ip_whitelist, which ends up as fail2ban's ignoreip. Maybe a recent change somewhere in fail2ban handles this more strictly, because I can’t remember having this problem before. But it’s also possible that my IPs just did not change that often.
I just had to do that manual change again. So I guess I’ll write a script for that now.
Well, root access generally works, of course, but I think every Ansible task opens a separate SSH connection, so there are several connections in a short amount of time. As you can see in the log, SSH only becomes unreachable after the 5th task. Maybe certain fail2ban settings could be tweaked, or the ignoreip setting could be updated as the first task in provisioning.