Cannot Deploy to Production (SSH Error)

When deploying to our staging server, everything works beautifully. Deploying to production, however, is a different story.

Staging and production are separate DO droplets, and both are configured properly. I’ve checked all the configuration files and even diffed them against a second Trellis project running on the exact same droplet build, the same version of Trellis/Bedrock, and a nearly identical architecture; every setting relevant to deployment is identical. Yet this deployment still fails with an SSH error:

Failed to connect to the host via ssh: Permission denied (publickey).

fatal: [138.197.29.202]: UNREACHABLE! => {
"changed": false, 
"unreachable": true
}

Steps I’ve taken:

  • Recreated the droplet several times, first with the exact same settings, then with additional features like monitoring and backups removed, to ensure the build was as clean as possible.
  • Removed the entries in known_hosts related to the droplets that were failing.
  • Manually connected (successfully) to the server via ssh.

Here is my verbose output from the production deployment:

Any help would be greatly appreciated.

I suspect your DNS settings need to be changed. You’re trying to connect to 138.197.29.202 but your domain doesn’t resolve to that IP:

PING firstamericanmerchant.com (72.52.171.53): 56 data bytes
64 bytes from 72.52.171.53: icmp_seq=0 ttl=47 time=63.719 ms
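
For anyone who wants to repeat the comparison above without reading ping output, here is a minimal Python sketch. The function name `resolves_to` and its `lookup` parameter are made up for illustration; the only real API used is `socket.gethostbyname_ex` from the standard library, which goes through the system resolver and returns IPv4 addresses only.

```python
import socket


def resolves_to(domain, expected_ip, lookup=socket.gethostbyname_ex):
    """Return (matches, addresses) for the IPv4 addresses the domain resolves to.

    `lookup` is a hypothetical hook so the check can be exercised offline;
    by default it is the standard library's socket.gethostbyname_ex.
    """
    _hostname, _aliases, addresses = lookup(domain)
    return expected_ip in addresses, addresses


# Example call (needs network access, so it is left commented out):
# ok, found = resolves_to("firstamericanmerchant.com", "138.197.29.202")
# print("match" if ok else "mismatch", found)
```

If the first value comes back `False`, the domain currently points somewhere other than the IP in your Trellis hosts file, which is the situation described above.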

I changed the DNS back to our old server, which is why the DNS is resolving differently now. When attempting the deployment, the DNS was set up properly.

Is your Trellis hosts/production set up? Can you share its contents?

Oh! When you created the droplet did you add your SSH key to the server as part of the process?

Good call @MWDelaney. @Silverjerk see if you can just log in to the production droplet via SSH first. If you can’t do that then something’s wrong with your droplet config or your local config. If you can do that then something’s wrong with your Trellis config.

For completeness: you say here that you connected manually with success. Was it with a key or a password?

I noticed your gist output includes:

The authenticity of host '138.197.29.202 (138.197.29.202)' can't be established.
ED25519 key fingerprint is SHA256:KhEIUDlU32mrluOvo96KZBqeGgkJwW2MrVC9gvbhXCE.
Are you sure you want to continue connecting (yes/no)? yes

If you are having to accept a hostkey, perhaps it is just because you…

Removed the entries in known_hosts related to the droplets that were failing.

If, however, the new hostkey means this is your first connection to this iteration of the production server, that could mean that you haven’t yet run server.yml. If that’s the case, the server.yml playbook hasn’t yet created the web user that Trellis tries to use for deploys.

You certainly have run server.yml for staging, but have you run server.yml for production?

…and did you test connecting manually with the web user (the relevant user for deploys)?
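
A rough way to script that manual test, as a sketch only: the helper names `ssh_check_command` and `can_login` are hypothetical, and the `run` parameter exists just so the function can be tried without a server. The ssh options used (`BatchMode`, `ConnectTimeout`) are standard OpenSSH client options. With `BatchMode=yes` ssh never falls back to a password prompt, so a success here means key-based login works for that user.

```python
import subprocess


def ssh_check_command(user, host, timeout=10):
    """Build an ssh command that attempts a non-interactive, key-only login."""
    return [
        "ssh",
        "-o", "BatchMode=yes",               # never prompt for a password or passphrase
        "-o", f"ConnectTimeout={timeout}",   # give up after `timeout` seconds
        f"{user}@{host}",
        "true",                              # remote no-op; we only care about the exit code
    ]


def can_login(user, host, run=subprocess.run):
    """Return (success, stderr). `run` is a hypothetical hook for offline testing."""
    result = run(ssh_check_command(user, host), capture_output=True, text=True)
    return result.returncode == 0, result.stderr.strip()


# Example call (needs a reachable server, so it is left commented out):
# print(can_login("web", "138.197.29.202"))
```

One caveat: in batch mode ssh also cannot ask you to accept a new host key, so if the server's key is not in known_hosts yet, the check may fail for that reason rather than because of the web user.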

@Silverjerk specifically step 6 at the bottom of this page in the docs:

It is, and here are the contents:

[production]
138.197.29.202

[web]
138.197.29.202

I can log into the server via SSH without issue. Also, to answer @MWDelaney’s other question, I did add the SSH key to the server during creation of the droplet.

@fullyint, I followed all of the usual steps, per the docs, which is why I’m so thoroughly baffled. I run an aText string when doing Trellis installs to make life a little easier, and I even went back and did everything manually to ensure I hadn’t missed anything.

I’m going to create a new droplet and provision and deploy again and follow the docs to the letter. I’m certain I’m missing something simple, and likely very obvious. Thanks for the assistance; if I find the solution (or realize my error), I will post the results here.

Not sure if this is helpful or not, but I always use DNS names here, rather than IPs. That way if the IP changes for any reason, Trellis can still provision and deploy. If you’re provisioning before making public DNS changes, a local HOSTS file entry does the trick to make it work until DNS is updated.
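
If you go the local HOSTS file route, a small sketch like the following can confirm which IP a name is pinned to in that file. The function name `hosts_lookup` is hypothetical, `example.com` is a placeholder domain, and the code assumes the usual hosts-file layout of an IP followed by one or more names, with `#` starting a comment.

```python
def hosts_lookup(hosts_text, name):
    """Return the first IP mapped to `name` in hosts-file formatted text, else None."""
    wanted = name.lower()
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split into IP + hostnames.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and wanted in (f.lower() for f in fields[1:]):
            return fields[0]
    return None


# Example with inline text; on macOS/Linux the real file is usually /etc/hosts.
sample = """
127.0.0.1        localhost
138.197.29.202   example.com www.example.com  # temporary, until DNS is updated
"""
print(hosts_lookup(sample, "example.com"))
```

Remember to remove the temporary entry once public DNS points at the new droplet, otherwise your machine keeps using the pinned IP.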

That’s a good tip, and makes a lot of sense. Thanks!

After trying again this morning, the deployment went off without a hitch. Maybe this was DNS-related, possibly a propagation issue? I’m not certain, but it was resolved. Thanks for the quick replies, guys. This community is always responsive, helpful, and thorough. I appreciate it very much.
