Provisioned and now lost SSH access

Whoops. The VPS I’m using is set up to accept SSH connections on port 9394.

So I configured it as follows:

# hosts/staging
[staging]
the.ip.address.56 ansible_ssh_port=9394
 
 [web]
the.ip.address.56

And

# roles/sshd/defaults/main.yml
sshd_config: sshd_config.j2

sshd_ports:
  - 9394 

....

# line 72
# ssh_config
# ----------------------------
ssh_config: ssh_config.j2
ssh_port: 9394

I also disabled root login in group_vars/all/security.yml:

sshd_permit_root_login: false

I had set my account, mikeadmin, as the admin user:

# group_vars/all/users.yml
admin_user: mikeadmin

# also using my bitbucket key
users:
  - name: "{{ web_user }}"
    groups:
      - "{{ web_group }}"
    keys:
      - "{{ lookup('file', '~/.ssh/id_atlassian.pub') }}"
  - name: "{{ admin_user }}"
    groups:
      - sudo
    keys:
      - "{{ lookup('file', '~/.ssh/id_atlassian.pub') }}"

And now even when I try to connect using a password, the connection is immediately refused.

ssh -p 9394 mikeadmin@the.ip.address.56

Is there some obvious stupid thing I’ve done here, or is it a subtle stupid thing?

I could see this happening if you ran server.yml once (which disables root), then changed sshd_ports and/or admin_user in your local files. The next time you run server.yml, a couple of problems could occur:

  • It can’t connect as root, so it tries admin_user. But admin_user was admin the last time you ran server.yml so only admin is enabled on the server, not mikeadmin. The mikeadmin user won’t be able to connect.
  • Unless ssh was set up on port 9394 from the beginning, you’d now be trying to connect on port 9394 but server.yml won’t yet have set up that port, so the connection would be refused.
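
To see which of the scenarios below applies, it may help to try every user and port combination in one go. This is only a minimal sketch, assuming key-based login and the three usernames mentioned in this thread; probe_all is a made-up helper name, not part of Trellis:

```shell
#!/bin/sh
# Hypothetical helper: try each user/port combination against one host.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so only key-based logins are reported as "OK".
probe_all() {
  host="$1"
  for user in root admin mikeadmin; do
    for port in 22 9394; do
      if ssh -o BatchMode=yes -o ConnectTimeout=5 -p "$port" "$user@$host" true >/dev/null 2>&1; then
        echo "OK   $user@$host port $port"
      else
        echo "FAIL $user@$host port $port"
      fi
    done
  done
}

# Run only when a host is given, e.g.: sh probe.sh the.ip.address.56
if [ "$#" -gt 0 ]; then
  probe_all "$1"
fi
```

A FAIL here does not say why the attempt failed. It could be a refused connection, a rejected key, or a host key that is not yet in known_hosts for that port, so rerun the failing combination by hand with ssh -v to see the actual error.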

ssh root@ip

If root can connect on port 22 (i.e., ssh root@ip), try connecting on the regular port 22 and running server.yml so it can add the new port 9394 and the mikeadmin user:

  • omit the port setting (ansible_ssh_port=9394, or ansible_port) from hosts/staging for now
  • temporarily adjust ports configs:
sshd_ports:
  - 22
  - 9394
  • run ansible-playbook server.yml -e env=staging --tags users,sshd
  • then reverse the edits above (add port back to hosts/staging and leave only 9394 in sshd_ports) and run the same thing again:
    ansible-playbook server.yml -e env=staging --tags users,sshd
    This last command will leave only port 9394 operational and make sure future Trellis connection attempts try port 9394 instead of the default 22.

ssh admin@ip

If root can’t connect but admin can, still only on port 22 (i.e., ssh admin@ip), run server.yml to add the mikeadmin user and to configure port 9394.

  • change admin_user: admin
  • add this entry to users:
  - name: mikeadmin
    groups:
      - sudo
    keys:
      - "{{ lookup('file', '~/.ssh/id_atlassian.pub') }}"
  • take all the steps in the section above
  • after running server.yml both times, change back to admin_user: mikeadmin and remove the name: mikeadmin stuff you added to users (will now be covered by the name: "{{ admin_user }}" stuff).

ssh -p 9394 admin@ip

If admin can connect only on port 9394, just add the mikeadmin user:

  • change admin_user: admin
  • add this entry to users:
  - name: mikeadmin
    groups:
      - sudo
    keys:
      - "{{ lookup('file', '~/.ssh/id_atlassian.pub') }}"
  • run ansible-playbook server.yml -e env=staging --tags users
  • change back to admin_user: mikeadmin and remove the name: mikeadmin stuff you added to users (will now be covered by the name: "{{ admin_user }}" stuff).

Other notes

Note that sshd_password_authentication: false by default in Trellis.

As you adjust the sshd configs, instead of editing role defaults, try this recommendation from the role’s README section “Customize via variables”:

To override a setting, you could redefine your chosen variable in a file such as group_vars/all/main.yml or group_vars/all/security.yml

You’ve adjusted sshd_ports which sounds appropriate, but you probably want to change ssh_port back to 22 (sshd_ports != ssh_port). This ssh_port is the port your server will use when it tries to make ssh connections to other servers, like during the git clone task during deploys. GitHub probably only listens on port 22.
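
The sshd_ports versus ssh_port distinction corresponds to two different OpenSSH files: sshd_config (the daemon, for inbound connections) and ssh_config (the client, for outbound connections). Here is a rough sketch for checking both once you are back on the server, assuming the standard /etc/ssh paths and the usual "Port 22" whitespace form; list_ports is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the value of every active "Port" directive
# in an OpenSSH config file. Commented-out lines such as "#Port 22"
# are skipped because their first field is not exactly "port".
list_ports() {
  awk 'tolower($1) == "port" { print $2 }' "$1"
}

# Assumed standard OpenSSH locations; adjust if your server differs.
for f in /etc/ssh/sshd_config /etc/ssh/ssh_config; do
  if [ -r "$f" ]; then
    echo "$f: $(list_ports "$f" | tr '\n' ' ')"
  fi
done
```

With the settings suggested above you would expect 9394 for sshd_config and 22 (or nothing, meaning the default of 22) for ssh_config.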

If none of the above helps, try adding ansible_ssh_port=9394 to the IP under [web] in hosts/staging as well, and probably rename it to the newer variable name ansible_port (instead of ansible_ssh_port).

Also try removing that space before [web].
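
A quick way to spot stray indentation like that in an inventory file is sketched below. The function name find_indented_sections is made up, and the hosts/staging path is assumed from the snippets above:

```shell
#!/bin/sh
# Hypothetical helper: list group headers that begin with whitespace,
# printed as "line-number:line".
find_indented_sections() {
  grep -nE '^[[:space:]]+\[' "$1"
}

# Run from the directory that contains hosts/ (assumed layout).
if [ -r hosts/staging ]; then
  find_indented_sections hosts/staging || echo "no indented group headers found"
fi
```

Against the hosts/staging shown in the question, this would flag the " [web]" line.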

You might also prefer this format in your hosts/staging:

myshortname ansible_host=12.34.56.678 ansible_port=9394

[staging]
myshortname
 
[web]
myshortname

Thanks for the insightful response, man. You rock. It looks like what happened (this is after having the server manager bail me out) is that I had configured either Ferm or Fail2ban to block port 9394 and port 22 was already disabled.

I’m still trying to figure out the difference between a firewall and what Fail2ban does, and I’m not sure where in my Ansible configuration it was set up to block port 9394.