Trellis deploy hangs, then fails on reload php-fpm

I have updated Trellis to rc2 and am trying to deploy one of my sites to our staging server. Every time, it hangs for a few minutes and then fails on TASK [deploy : Reload php-fpm].

I have included the entire verbose output below. To address a few questions ahead of time: I cannot provision the remote server, as our IT department owns and maintains it. The Trellis rc2 installation I am using runs PHP 7.1.

The remote server is running:
[STAGING][~] php -v
PHP 7.0.10 (cli) (built: Nov  2 2016 14:15:54) ( NTS )
Copyright (c) 1997-2016 The PHP Group
Zend Engine v3.0.0, Copyright (c) 1998-2016 Zend Technologies
[STAGING][~] php-fpm -v
PHP 7.0.10 (fpm-fcgi) (built: Nov  2 2016 14:16:45)
Copyright (c) 1997-2016 The PHP Group
Zend Engine v3.0.0, Copyright (c) 1998-2016 Zend Technologies


Could that be causing the problem? Our setup uses our personal user accounts to SSH into the server through Ansible, and then becomes a service account that we use to prevent user:group permissions issues, so no one person has dedicated permissions but the service account does.



Error output from deploy...


TASK [deploy : Reload php-fpm] *****************************************************************************************************************************************************************************
task path: /home/tackettz/environments/trellis-rc2/roles/deploy/hooks/finalize-after.yml:36
Using module file /usr/local/lib/python2.7/dist-packages/ansible/modules/commands/command.py
<www-test> ESTABLISH SSH CONNECTION FOR USER: tackettz
<www-test> SSH: EXEC ssh -o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=tackettz -o ConnectTimeout=10 -o ControlPath=/home/tackettz/.ansible/cp/3822b686ed www-test '/bin/sh -c '"'"'echo ~ && sleep 0'"'"''
<www-test> (0, '/home/tackettz\n', '')
<www-test> ESTABLISH SSH CONNECTION FOR USER: tackettz
<www-test> SSH: EXEC ssh -o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=tackettz -o ConnectTimeout=10 -o ControlPath=/home/tackettz/.ansible/cp/3822b686ed www-test '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo /home/tackettz/.ansible/tmp/ansible-tmp-1518790210.36-38611988408079 `" && echo ansible-tmp-1518790210.36-38611988408079="` echo /home/tackettz/.ansible/tmp/ansible-tmp-1518790210.36-38611988408079 `" ) && sleep 0'"'"''
<www-test> (0, 'ansible-tmp-1518790210.36-38611988408079=/home/tackettz/.ansible/tmp/ansible-tmp-1518790210.36-38611988408079\n', '')
<www-test> PUT /tmp/tmp8AxwRC TO /home/tackettz/.ansible/tmp/ansible-tmp-1518790210.36-38611988408079/command.py
<www-test> SSH: EXEC sftp -b - -o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=tackettz -o ConnectTimeout=10 -o ControlPath=/home/tackettz/.ansible/cp/3822b686ed '[www-test]'
<www-test> (0, 'sftp> put /tmp/tmp8AxwRC /home/tackettz/.ansible/tmp/ansible-tmp-1518790210.36-38611988408079/command.py\n', '')
<www-test> ESTABLISH SSH CONNECTION FOR USER: tackettz
<www-test> SSH: EXEC ssh -o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=tackettz -o ConnectTimeout=10 -o ControlPath=/home/tackettz/.ansible/cp/3822b686ed www-test '/bin/sh -c '"'"'chmod u+x /home/tackettz/.ansible/tmp/ansible-tmp-1518790210.36-38611988408079/ /home/tackettz/.ansible/tmp/ansible-tmp-1518790210.36-38611988408079/command.py && sleep 0'"'"''
<www-test> (0, '', '')
<www-test> ESTABLISH SSH CONNECTION FOR USER: tackettz
<www-test> SSH: EXEC ssh -o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=tackettz -o ConnectTimeout=10 -o ControlPath=/home/tackettz/.ansible/cp/3822b686ed -tt www-test '/bin/sh -c '"'"'/usr/bin/python /home/tackettz/.ansible/tmp/ansible-tmp-1518790210.36-38611988408079/command.py; rm -rf "/home/tackettz/.ansible/tmp/ansible-tmp-1518790210.36-38611988408079/" > /dev/null 2>&1 && sleep 0'"'"''
<www-test> (0, '[sudo] password for tackettz: \r\n\r\n{"changed": true, "end": "2018-02-16 09:15:11.306778", "stdout": "", "cmd": "sudo service php7.1-fpm reload", "failed": true, "delta": "0:05:00.306360", "stderr": "", "rc": 1, "invocation": {"module_args": {"warn": false, "executable": null, "_uses_shell": true, "_raw_params": "sudo service php7.1-fpm reload", "removes": null, "creates": null, "chdir": null, "stdin": null}}, "start": "2018-02-16 09:10:11.000418", "msg": "non-zero return code"}\r\n', 'Shared connection to lib-web-lp001.mgt.private closed.\r\n')
System info:
  Ansible 2.4.0.0; Linux
  Trellis at "Fix `failed_when` in `template_root` check with wp-cli 1.5.0"
---------------------------------------------------
MODULE FAILURE
Shared connection to lib-web-lp001.mgt.private closed.

[sudo] password for tackettz:

{"changed": true, "end": "2018-02-16 09:15:11.306778", "stdout": "", "cmd":
"sudo service php7.1-fpm reload", "failed": true, "delta": "0:05:00.306360",
"stderr": "", "rc": 1, "invocation": {"module_args": {"warn": false,
"executable": null, "_uses_shell": true, "_raw_params": "sudo service
php7.1-fpm reload", "removes": null, "creates": null, "chdir": null, "stdin":
null}}, "start": "2018-02-16 09:10:11.000418", "msg": "non-zero return code"}

fatal: [www-test]: FAILED! => {
    "changed": false, 
    "failed": true, 
    "rc": 0
}
	to retry, use: --limit @/home/tackettz/environments/trellis-rc2/deploy.retry

PLAY RECAP *************************************************************************************************************************************************************************************************
localhost                  : ok=0    changed=0    unreachable=0    failed=0   
www-test                   : ok=37   changed=20   unreachable=0    failed=1

Hi,
I’ve seen this happen when trying to deploy to a server I’ve provisioned with an old(er) version of Trellis. It may work to reprovision the server (ansible-playbook server.yml -e env=production) using your current version of Trellis.

If that doesn’t work, you might need to rebuild and reprovision from scratch (back up your database and uploads first!!).


I can’t provision the server though. I don’t have the ability or access to do that because it is owned and maintained by our IT department. However, they are setting up a schedule to update the server to RHEL 7 and I’m going to see about updating PHP and PHP-FPM on the servers. But that isn’t for a few more months.

Is there any other possible fix that you may know of?

Are you saying that this server is updated and maintained by your IT team rather than by Trellis? Trellis assumes it’s in charge of updates and provisions; unless you’re provisioning your servers with Trellis I don’t recommend deploying with it; you’re just asking for problems at that point.

Yes, the remote server is maintained and updated by our IT department. We just use it for hosting our WordPress sites. We had not experienced any problems like this until the most recent Trellis update, and I don’t see how that setup would cause any major problems.

@fullyint is worlds more qualified to assist with this kind of setup, but I think you might just have been getting lucky so far. Trellis is intended to be both your provisioning and deployment tool; if it can’t do both things, I think you’re better off using a different deployment method.

Easiest option, probably. Here’s a workaround, given your slightly unusual usage of Trellis (i.e., not provisioning with Trellis; deploying only).

  1. Adjust the Reload php-fpm task (in roles/deploy/hooks/finalize-after.yml) so that it knows it needs to provide your password. The indentation of the become parameter matters; it must be the same as that of name, shell, and args.
  - name: Reload php-fpm
    shell: sudo service php7.1-fpm reload
    args:
      warn: false
+   become: yes
  2. Provide the user’s sudo password when deploying
    (or use -K as the short version of --ask-become-pass):
ansible-playbook deploy.yml -e env=staging -e site=example.com --ask-become-pass

It will prompt you to enter your user’s sudo password.

Assumptions. The above assumes the following:

  • service --status-all | grep php (on server) outputs only php7.1-fpm
  • you are able to ssh as web_user (as tackettz I think) and sudo service php7.1-fpm reload successfully, by providing a password
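
The first assumption can be checked by filtering the output of service --status-all for php-fpm service names. The following is only a rough sketch: the helper names list_fpm_services and only_expected_fpm are hypothetical, and it assumes the service names look like php7.1-fpm or php-fpm.

```shell
# Hypothetical helper: list the distinct php-fpm service names found in
# `service --status-all` output, which is read from stdin.
list_fpm_services() {
    grep -Eo 'php[0-9.]*-fpm' | sort -u
}

# Hypothetical helper: succeed only if the sole php-fpm service found
# on stdin is the expected one (passed as the first argument).
only_expected_fpm() {
    expected="$1"
    found="$(list_fpm_services)"
    [ "$found" = "$expected" ]
}

# On the server, usage might look like:
#   service --status-all 2>/dev/null | only_expected_fpm php7.1-fpm && echo "ok"
```

If a second version such as php7.0-fpm is also listed, the check fails, which suggests the task's hard-coded service name may not match what the server actually runs.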

Passwordless sudo. If you choose to try to fix the problem instead of using the workaround above, there are some helpful notes on the related thread you already saw, especially this post.

You will probably have to coordinate with your IT department because it sounds like they have a specific user management strategy for this server. As a simple example, I expect your server would need an /etc/sudoers file where the last line is

#includedir /etc/sudoers.d

The server would also need a file such as /etc/sudoers.d/tackettz-services with the following:

tackettz ALL=(root) NOPASSWD: /usr/sbin/service php7.1-fpm *

where the permissions are like this

$ ls -alh /etc/sudoers.d
-r--r-----   1 root root   75 Feb  1 02:25 tackettz-services
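
The listing above corresponds to mode 0440. As a rough sketch, a check like the following could confirm the mode before relying on the file. The helper name has_sudoers_mode is hypothetical, and it assumes GNU stat (stat -c), as found on most Linux servers.

```shell
# Hypothetical helper: succeed only if the given file has mode 0440
# (shown by ls as -r--r-----). Assumes GNU coreutils `stat`.
has_sudoers_mode() {
    [ "$(stat -c '%a' "$1")" = "440" ]
}

# On the server, usage might look like this (path taken from the example
# above; reading /etc/sudoers.d may itself require elevated privileges):
#   has_sudoers_mode /etc/sudoers.d/tackettz-services || echo "unexpected mode"
#   sudo visudo -cf /etc/sudoers.d/tackettz-services   # syntax check only
```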

PHP versions.

The php that runs on the cli is completely separate, as far as I know, and is not relevant.
