Roots Discourse

"Install Python 2.x" task fails on reprovision, unable to get lock /var/lib/dpkg/lock-frontend

ansible
trellis

#1

I try to solve my own issues, but am stuck after hours of research.

Provisions fail on “Install Python 2.x” task due to locks:

TASK [Install Python 2.x] **********************************************

System info:

Ansible 2.5.3; Darwin

Trellis version (per changelog): “Update xdebug tunnel configuration”


non-zero return code

Shared connection to www.replaced.url closed.

fatal: [www.replaced.url]: FAILED! => {“changed”: false, “rc”: 100, “stdout”: “\r\n/usr/bin/python\r\nE: Could not get lock /var/lib/dpkg/lock-frontend - open (11: Resource temporarily unavailable)\r\nE: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?\r\n”, “stdout_lines”: ["", “/usr/bin/python”, “E: Could not get lock /var/lib/dpkg/lock-frontend - open (11: Resource temporarily unavailable)”, “E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?”]}


Shared connection to replaced.url closed.

I can SSH to production and confirm with ps that no apt process holds a lock. Lock files can be removed without issue.

which python shows 2.7.12 on production.

which python || sudo apt-get update && sudo apt-get install -qq -y python-simplejson runs fine

Background:

  • I upgraded Trellis in summer with help from https://tomoro.com.au/blog/how-to-update-trellis/
  • Staging and production each have their own individual Linode VPS
  • Re-provisioning to staging works
  • Rebooting production doesn’t help
  • I had to downgrade local Python to 2.7.15 to provision + deploy to staging
  • I’ve cleared out known_hosts and re-accepted a connection
  • Local Ansible is 2.5.3
  • Auth appears to be working correctly. The site has been live for over a year.
  • Everything else in the full provision runs without error

At this point I’m kind of at a loss. Many thanks.

I can share verbose logs as needed, but the relevant logs are too long / containing too many links and I need to head to bed as it’s pushing 5am.


#2

Oddly, waking up this morning the provision proceeded a bit further… at least temporarily:

Failed to lock apt for exclusive operation
The full traceback is:
File “/tmp/ansible_L05xg7/ansible_module_apt.py”, line 985, in main
cache.update()
File “/usr/lib/python2.7/dist-packages/apt/cache.py”, line 465, in update
raise LockFailedException(“Failed to lock %s” % lockfile)

failed: [www.replaced.url] (item={‘value’: u’present’, ‘key’: u’python-software-properties’}) => {
“changed”: false,
“invocation”: {
“module_args”: {
“allow_unauthenticated”: false,
“autoclean”: false,
“autoremove”: false,
“cache_valid_time”: 3600,
“deb”: null,
“default_release”: null,
“dpkg_options”: “force-confdef,force-confold”,
“force”: false,
“force_apt_get”: false,
“install_recommends”: null,
“name”: “python-software-properties”,
“only_upgrade”: false,
“package”: [
“python-software-properties”
],
“purge”: false,
“state”: “present”,
“update_cache”: null,
“upgrade”: null
}
},
“item”: {
“key”: “python-software-properties”,
“value”: “present”
}
}

and then:


TASK [ferm : ensure iptables rules are enabled] ******************************************************
System info:
Ansible 2.5.3; Darwin
Trellis version (per changelog): “Update xdebug tunnel configuration”

non-zero return code
iptables: No chain/target/match by that name.

Firewall rules rolled back.
fatal: [replaced.url]: FAILED! => {“changed”: false, “cmd”: [“ferm”, “–slow”, “/etc/ferm/ferm.conf”], “delta”: “0:00:00.202826”, “end”: “2018-11-13 09:52:54.126859”, “rc”: 1, “start”: “2018-11-13 09:52:53.924033”, “stderr_lines”: [“iptables: No chain/target/match by that name.”, “”, “Firewall rules rolled back.”], “stdout”: “”, “stdout_lines”: []}

non-zero return code
Another app is currently holding the xtables lock. Perhaps you want to use
the -w option?
iptables-restore: line 71 failed
Failed to run /sbin/iptables-restore
fatal: [www.replaced.url]: FAILED! => {“changed”: false, “cmd”: [“ferm”, “–slow”, “/etc/ferm/ferm.conf”], “delta”: “0:00:00.119715”, “end”: “2018-11-13 09:52:54.115654”, “rc”: 1, “start”: “2018-11-13 09:52:53.995939”, “stderr_lines”: [“Another app is currently holding the xtables lock. Perhaps you want to use the -w option?”, “iptables-restore: line 71 failed”, “Failed to run /sbin/iptables-restore”], “stdout”: “”, “stdout_lines”: []}

on subsequent provisioning attempts, the process is again halting at the Python installation task:

non-zero return code
OpenSSH_7.8p1, LibreSSL 2.7.3
debug1: Reading configuration data /Users/bj/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 48: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket “/Users/bj/.ansible/cp/ca2b3092e6” does not exist
debug2: resolving “www.replaced.url” port 22
debug2: ssh_connect_direct
debug1: Connecting to www.replaced.url [139.162.186.31] port
22.
debug2: fd 5 setting O_NONBLOCK
debug1: fd 5 clearing O_NONBLOCK
debug1: Connection established.
debug3: timeout: 9875 ms remain after connect
debug1: identity file /Users/bj/.ssh/id_rsa type 0
debug1: identity file /Users/bj/.ssh/id_rsa-cert type -1
debug1: identity file /Users/bj/.ssh/id_dsa type -1
debug1: identity file /Users/bj/.ssh/id_dsa-cert type -1
debug1: identity file /Users/bj/.ssh/id_ecdsa type -1
debug1: identity file /Users/bj/.ssh/id_ecdsa-cert type -1
debug1: identity file /Users/bj/.ssh/id_ed25519 type -1
debug1: identity file /Users/bj/.ssh/id_ed25519-cert type -1
debug1: identity file /Users/bj/.ssh/id_xmss type -1
debug1: identity file /Users/bj/.ssh/id_xmss-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_7.8
debug1: Remote protocol version 2.0, remote software version OpenSSH_7.2p2
debug1: match: OpenSSH_7.2p2 pat OpenSSH_7.0*,OpenSSH_7.1*,OpenSSH_7.2*,OpenS
SH_7.3*,OpenSSH_7.4*,OpenSSH_7.5*,OpenSSH_7.6*,OpenSSH_7.7* compat 0x04000002
debug2: fd 5 setting O_NONBLOCK
debug1: Authenticating to www.replaced.url:22 as ‘bjfresh’
debug3: hostkeys_foreach: reading file “/Users/bj/.ssh/known_hosts”
debug3: record_hostkey: found key type ED25519 in file
/Users/bj/.ssh/known_hosts:111
debug3: load_hostkeys: loaded 1 keys from www.replaced.url
debug3: hostkeys_foreach: reading file “/Users/bj/.ssh/known_hosts2”
debug3: order_hostkeyalgs: prefer hostkeyalgs: ssh-
ed25519-cert-v01@openssh.com,ssh-ed25519
debug3: send packet: type 20
debug1: SSH2_MSG_KEXINIT sent
debug3: receive packet: type 20
debug1: SSH2_MSG_KEXINIT received
debug2: local client KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-
sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-
exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512
,diffie-hellman-group14-sha256,diffie-hellman-group14-sha1,ext-info-c
debug2: host key algorithms: ssh-ed25519-cert-v01@openssh.com,ssh-ed25519
,ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-
sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com
,rsa-sha2-512-cert-v01@openssh.com,rsa-sha2-256-cert-v01@openssh.com,ssh-rsa-
cert-v01@openssh.com,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-
sha2-nistp521,rsa-sha2-512,rsa-sha2-256,ssh-rsa
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes
256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes
256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: MACs ctos: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-
sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-
sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256
,hmac-sha2-512,hmac-sha1
debug2: MACs stoc: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-
sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-
sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256
,hmac-sha2-512,hmac-sha1
debug2: compression ctos: none,zlib@openssh.com,zlib
debug2: compression stoc: none,zlib@openssh.com,zlib
debug2: languages ctos:
debug2: languages stoc:
debug2: first_kex_follows 0
debug2: reserved 0
debug2: peer server KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256@libssh.org,diffie-hellman-group-
exchange-sha256
debug2: host key algorithms: ssh-ed25519,ssh-rsa,rsa-sha2-512,rsa-sha2-256
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes256-gcm@openssh.com,ae
s128-gcm@openssh.com,aes256-ctr,aes192-ctr,aes128-ctr
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes256-gcm@openssh.com,ae
s128-gcm@openssh.com,aes256-ctr,aes192-ctr,aes128-ctr
debug2: MACs ctos: hmac-sha2-512-etm@openssh.com,hmac-
sha2-256-etm@openssh.com,hmac-
ripemd160-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-512,hmac-
sha2-256,hmac-ripemd160
debug2: MACs stoc: hmac-sha2-512-etm@openssh.com,hmac-
sha2-256-etm@openssh.com,hmac-
ripemd160-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-512,hmac-
sha2-256,hmac-ripemd160
debug2: compression ctos: none,zlib@openssh.com
debug2: compression stoc: none,zlib@openssh.com
debug2: languages ctos:
debug2: languages stoc:
debug2: first_kex_follows 0
debug2: reserved 0
debug1: kex: algorithm: curve25519-sha256@libssh.org
debug1: kex: host key algorithm: ssh-ed25519
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC:
compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC:
compression: none
debug3: send packet: type 30
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug3: receive packet: type 31
debug1: Server host key: ssh-ed25519
SHA256:U72Z0/RU8T8mBbEwoqURWn4dA/q5SFc++cTcZAeYab8
debug3: hostkeys_foreach: reading file “/Users/bj/.ssh/known_hosts”
debug3: record_hostkey: found key type ED25519 in file
/Users/bj/.ssh/known_hosts:111
debug3: load_hostkeys: loaded 1 keys from www.replaced.url
debug3: hostkeys_foreach: reading file “/Users/bj/.ssh/known_hosts2”
debug3: hostkeys_foreach: reading file “/Users/bj/.ssh/known_hosts”
debug3: hostkeys_foreach: reading file “/Users/bj/.ssh/known_hosts2”
debug3: record_hostkey: found key type ED25519 in file
/Users/bj/.ssh/known_hosts2:92
debug3: load_hostkeys: loaded 1 keys from 139.162.186.31
debug1: Host ‘www.replaced.url’ is known and matches the
ED25519 host key.
debug1: Found key in /Users/bj/.ssh/known_hosts:111
debug3: send packet: type 21
debug2: set_newkeys: mode 1
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug3: receive packet: type 21
debug1: SSH2_MSG_NEWKEYS received
debug2: set_newkeys: mode 0
debug1: rekey after 134217728 blocks
debug2: key: /Users/bj/.ssh/id_rsa (0x7fddd870e680), agent
debug2: key: /Users/bj/.ssh/id_dsa (0x0)
debug2: key: /Users/bj/.ssh/id_ecdsa (0x0)
debug2: key: /Users/bj/.ssh/id_ed25519 (0x0)
debug2: key: /Users/bj/.ssh/id_xmss (0x0)
debug3: send packet: type 5
debug3: receive packet: type 7
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<rsa-sha2-256,rsa-sha2-512>
debug3: receive packet: type 6
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug3: send packet: type 50
debug3: receive packet: type 51
debug1: Authentications that can continue: publickey
debug3: start over, passed a different list publickey
debug3: preferred gssapi-with-mic,gssapi-keyex,hostbased,publickey
debug3: authmethod_lookup publickey
debug3: remaining preferred: ,gssapi-keyex,hostbased,publickey
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Offering public key: RSA
SHA256:4j8JW69A3/ugaz2DhJsYTWMCIlgjEl9m/pQRbxM5Yv0 /Users/bj/.ssh/id_rsa
debug3: send packet: type 50
debug2: we sent a publickey packet, wait for reply
debug3: receive packet: type 60
debug1: Server accepts key: pkalg rsa-sha2-512 blen 535
debug2: input_userauth_pk_ok: fp
SHA256:4j8JW69A3/ugaz2DhJsYTWMCIlgjEl9m/pQRbxM5Yv0
debug3: sign_and_send_pubkey: RSA
SHA256:4j8JW69A3/ugaz2DhJsYTWMCIlgjEl9m/pQRbxM5Yv0
debug3: sign_and_send_pubkey: signing using rsa-sha2-512
debug3: send packet: type 50
debug3: receive packet: type 52
debug1: Authentication succeeded (publickey).
Authenticated to www.replaced.url ([139.162.186.31]:22).
debug1: setting up multiplex master socket
debug3: muxserver_listen: temporary control path
/Users/bj/.ansible/cp/ca2b3092e6.nDBRCANSKEoRhvvn
debug2: fd 7 setting O_NONBLOCK
debug3: fd 7 is O_NONBLOCK
debug3: fd 7 is O_NONBLOCK
debug1: channel 0: new [/Users/bj/.ansible/cp/ca2b3092e6]
debug3: muxserver_listen: mux listener channel 0 fd 7
debug2: fd 5 setting TCP_NODELAY
debug3: ssh_packet_set_tos: set IP_TOS 0x20
debug1: control_persist_detach: backgrounding master process
debug2: control_persist_detach: background process is 41869
debug2: fd 7 setting O_NONBLOCK
debug1: forking to background
debug1: Entering interactive session.
debug1: pledge: id
debug2: set_control_persist_exit_time: schedule exit in 60 seconds
debug1: multiplexing control connection
debug3: fd 8 is O_NONBLOCK
debug3: fd 8 is O_NONBLOCK
debug1: channel 1: new [mux-control]
debug3: channel_post_mux_listener: new mux channel 1 fd 8
debug3: mux_master_read_cb: channel 1: hello sent
debug2: set_control_persist_exit_time: cancel scheduled exit
debug3: mux_master_read_cb: channel 1 packet type 0x00000001 len 4
debug2: process_mux_master_hello: channel 1 slave version 4
debug2: mux_client_hello_exchange: master version 4
debug3: mux_client_forwards: request forwardings: 0 local, 0 remote
debug3: mux_client_request_session: entering
debug3: mux_client_request_alive: entering
debug3: mux_master_read_cb: channel 1 packet type 0x10000004 len 4
debug2: process_mux_alive_check: channel 1: alive check
debug3: mux_client_request_alive: done pid = 41871
debug3: mux_client_request_session: session request sent
debug3: mux_master_read_cb: channel 1 packet type 0x10000002 len 311
debug2: process_mux_new_session: channel 1: request tty 1, X 0, agent 1,
subsys 0, term “xterm-color”, cmd "sudo -H -S -p “[sudo via ansible,
key=jqzybrdpcydfbscbrfwxlnzjdowapizv] password: " -u root /bin/sh -c ‘echo
BECOME-SUCCESS-jqzybrdpcydfbscbrfwxlnzjdowapizv; which python || sudo apt-get
update && sudo apt-get install -qq -y python-simplejson’”, env 1
debug3: process_mux_new_session: got fds stdin 9, stdout 10, stderr 11
debug2: fd 10 setting O_NONBLOCK
debug2: fd 11 setting O_NONBLOCK
debug1: channel 2: new [client-session]
debug2: process_mux_new_session: channel_new: 2 linked to control channel 1
debug2: channel 2: send open
debug3: send packet: type 90
debug3: receive packet: type 80
debug1: client_input_global_request: rtype hostkeys-00@openssh.com want_reply
0
debug3: receive packet: type 91
debug2: channel_input_open_confirmation: channel 2: callback start
debug1: Requesting authentication agent forwarding.
debug2: channel 2: request auth-agent-req@openssh.com confirm 0
debug3: send packet: type 98
debug2: client_session2_setup: id 2
debug2: channel 2: request pty-req confirm 1
debug3: send packet: type 98
debug1: Sending environment.
debug1: Sending env LANG = en_CA.UTF-8
debug2: channel 2: request env confirm 0
debug3: send packet: type 98
debug1: Sending command: sudo -H -S -p "[sudo via ansible,
key=jqzybrdpcydfbscbrfwxlnzjdowapizv] password: " -u root /bin/sh -c ‘echo
BECOME-SUCCESS-jqzybrdpcydfbscbrfwxlnzjdowapizv; which python || sudo apt-get
update && sudo apt-get install -qq -y python-simplejson’
debug2: channel 2: request exec confirm 1
debug3: send packet: type 98
debug3: mux_session_confirm: sending success reply
debug2: channel_input_open_confirmation: channel 2: callback done
debug2: channel 2: open confirm rwindow 0 rmax 32768
debug3: receive packet: type 99
debug2: channel_input_status_confirm: type 99 id 2
debug2: PTY allocation request accepted on channel 2
debug2: channel 2: rcvd adjust 2097152
debug3: receive packet: type 99
debug2: channel_input_status_confirm: type 99 id 2
debug2: exec request accepted on channel 2
debug3: receive packet: type 2
debug3: Received SSH2_MSG_IGNORE
debug3: receive packet: type 96
debug2: channel 2: rcvd eof
debug2: channel 2: output open -> drain
debug3: receive packet: type 98
debug1: client_input_channel_req: channel 2 rtype exit-status reply 0
debug3: mux_exit_message: channel 2: exit message, exitval 100
debug3: receive packet: type 98
debug1: client_input_channel_req: channel 2 rtype eow@openssh.com reply 0
debug2: channel 2: rcvd eow
debug2: channel 2: close_read
debug2: channel 2: input open -> closed
debug3: receive packet: type 97
debug2: channel 2: rcvd close
debug3: channel 2: will not send data after close
debug3: channel 2: will not send data after close
debug2: channel 2: obuf empty
debug2: channel 2: close_write
debug2: channel 2: output drain -> closed
debug2: channel 2: send close
debug3: send packet: type 97
debug2: channel 2: is dead
debug2: channel 2: gc: notify user
debug3: mux_master_session_cleanup_cb: entering for channel 2
debug2: channel 1: rcvd close
debug2: channel 1: output open -> drain
debug2: channel 1: close_read
debug2: channel 1: input open -> closed
debug2: channel 2: gc: user detached
debug2: channel 2: is dead
debug2: channel 2: garbage collecting
debug1: channel 2: free: client-session, nchannels 3
debug3: channel 2: status: The following connections are open:
#1 mux-control (t16 nr0 i3/0 o1/0 fd 8/8 cc -1)
#2 client-session (t4 r0 i3/0 o3/0 fd -1/-1 cc -1)

debug2: channel 1: obuf empty
debug2: channel 1: close_write
debug2: channel 1: output drain -> closed
debug2: channel 1: is dead (local)
debug2: channel 1: gc: notify user
debug3: mux_master_control_cleanup_cb: entering for channel 1
debug2: channel 1: gc: user detached
debug2: channel 1: is dead (local)
debug2: channel 1: garbage collecting
debug1: channel 1: free: mux-control, nchannels 2
debug3: channel 1: status: The following connections are open:
#1 mux-control (t16 nr0 i3/0 o3/0 fd 8/8 cc -1)

debug2: set_control_persist_exit_time: schedule exit in 60 seconds
debug3: mux_client_read_packet: read header failed: Broken pipe
debug2: Received exit status from master 100
Shared connection to www.replaced.url closed.

fatal: [www.replaced.url]: FAILED! => {
“changed”: false,
“rc”: 100,
“stdout”: “\r\n\r\n/usr/bin/python\r\nE: Could not get lock /var/lib/dpkg/lock-frontend - open (11: Resource temporarily unavailable)\r\nE: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?\r\n”,
“stdout_lines”: [
“”,
“”,
“/usr/bin/python”,
“E: Could not get lock /var/lib/dpkg/lock-frontend - open (11: Resource temporarily unavailable)”,
“E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?”
]
}

The one line above that stuck out was:
debug1: Control socket “/Users/bj/.ansible/cp/ca2b3092e6” does not exist, but I’m uncertain how relevant it is.


#3

Quick update: I’m able to get around this and provision by skipping the Python install process altogether (which I had tried last night as well, without luck, oddly), but being unable to trust a full provision on production isn’t ideal in the long run :confused:


#4

Thanks for all the great details :slight_smile:

Those are definitely some weird errors and unfortunately I don’t have much of an idea either :thinking: .

I have two not so ideal suggestions:

  1. try to at least reboot the machine first and try again (although I’m guessing you tried this more than once)
  2. build a new server + provision and migrate over db/content then switch DNS over

It’s not a real answer, but whenever we do major Trellis updates to our main roots.io server, we just build a new server and move over to it.


#5

Yes, I rebooted a few times after various reconfigurations. Building and provisioning a new server is a possibility, but not necessarily either a great use of my time or within the scope of the client’s budget, unfortunately.

For now, I know how to work around the issue, but the patchwork of workarounds that accumulate when taking shortcuts like this is not a great approach to development :wink:

Anyway, love the project and appreciate the help.