start-stop-daemon: --exec vs --startas

start-stop-daemon is the classic tool on Debian and derived distributions for managing system background processes. A typical invocation from an initscript is as follows:

start-stop-daemon \
    --quiet \
    --oknodo \
    --start \
    --pidfile /var/run/daemon.pid \
    --exec /usr/sbin/daemon \
    -- -c /etc/daemon.cfg -p /var/run/daemon.pid

The basic operation is that it first checks whether /usr/sbin/daemon is already running and, if it is not, executes /usr/sbin/daemon -c /etc/daemon.cfg -p /var/run/daemon.pid. This process is then responsible for daemonising itself and writing the resulting process ID to /var/run/daemon.pid.

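Note that start-stop-daemon does not itself wait for /var/run/daemon.pid to appear: unless --background is given, it simply replaces itself with the daemon program, so the exit status the initscript sees is that of the daemon's foreground process. A well-behaved daemon should therefore write its pidfile before that foreground process exits, and exit non-zero if startup fails.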

(In practice, the locations of all these files are parameterised to prevent DRY violations.)
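As a rough sketch of that parameterisation, the call might be wrapped as below. The variable names and the do_start function are illustrative assumptions rather than anything start-stop-daemon requires, and SSD is a hypothetical override that exists only so the function can be dry-run.

```shell
# Hypothetical names; adjust them to the daemon in question.
DAEMON=/usr/sbin/daemon
CONFIG=/etc/daemon.cfg
PIDFILE=/var/run/daemon.pid

# Overridable so that the command line can be inspected, e.g. with SSD=echo.
SSD=${SSD:-start-stop-daemon}

do_start() {
    "$SSD" \
        --quiet \
        --oknodo \
        --start \
        --pidfile "$PIDFILE" \
        --exec "$DAEMON" \
        -- -c "$CONFIG" -p "$PIDFILE"
}
```

Each path now appears exactly once, so the pidfile that start-stop-daemon checks and the one the daemon writes cannot drift apart.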

Idempotency

By idempotency we mostly mean that repeated calls to /etc/init.d/daemon start must not start multiple instances of our daemon.

This might not seem to be a particularly big issue at first, but the increased adoption of stateless configuration management tools such as Ansible (which should be completely free to call start to ensure a started state) means that one should be particularly careful of this apparent corner case.

In its usual operation, start-stop-daemon uses the --exec parameter to ensure that only one instance of the daemon is running: if the specified pidfile exists and the PID it refers to is an "instance" of that executable, the daemon is assumed to be already running and another copy is not started. This is handled in the pid_is_exec function: the /proc/$PID/exe symlink is resolved and checked against the value of --exec.
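The check can be approximated in shell as follows. This is only a sketch of the idea under the assumption of a Linux /proc filesystem; it borrows the name pid_is_exec for readability but is not the C implementation, which may handle further cases.

```shell
# Succeeds if the executable of process $1 is exactly the path $2.
# Fails if the process does not exist or its exe link cannot be read.
pid_is_exec() {
    exe=$(readlink "/proc/$1/exe" 2>/dev/null) || return 1
    [ "$exe" = "$2" ]
}
```

An initscript could then ask whether the PID recorded in the pidfile still belongs to /usr/sbin/daemon by calling pid_is_exec with that PID and that path.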

Interpreted scripts

However, one case where this doesn't work is interpreted scripts. Let's look at what happens if /usr/sbin/daemon is such a script, e.g. a file that starts:

#!/usr/bin/env python
# [..]

The problem this introduces is that /proc/$PID/exe now points to the interpreter instead, often with an essentially non-deterministic version suffix:

$ ls -l /proc/14494/exe
lrwxrwxrwx 1 www-data www-data 0 Jul 25 15:18 /proc/14494/exe -> /usr/bin/python2.7

When this process is examined using the --exec mechanism outlined above, it will not be recognised as an instance of /usr/sbin/daemon, and another instance of the daemon will therefore be started incorrectly.
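The mismatch is easy to reproduce without a real daemon. The sketch below assumes a Linux /proc filesystem and uses a throwaway /bin/sh script in place of the Python one, purely to stay self-contained; the file it creates is temporary and hypothetical.

```shell
# Create a throwaway interpreted "daemon" (for illustration only).
script=$(mktemp)
printf '#!/bin/sh\nwhile :; do sleep 1; done\n' > "$script"
chmod +x "$script"

"$script" &
pid=$!
sleep 1    # give the kernel time to exec the interpreter

# Resolve the exe link the same way the --exec check would see it.
exe=$(readlink "/proc/$pid/exe" 2>/dev/null || true)
echo "started:  $script"
echo "exe link: $exe"

if [ "$exe" = "$script" ]; then
    echo "match: an --exec check would recognise the process"
else
    echo "mismatch: an --exec check would not recognise the process"
fi

# Clean up the background process and the temporary script.
kill "$pid" 2>/dev/null || true
rm -f "$script"
```

The exe link names the interpreter binary rather than the script that was started, which is exactly the comparison that --exec relies on.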

--startas

The solution is to use the --startas parameter instead. This omits the /proc/$PID/exe check and merely tests whether a PID with that number is running:

start-stop-daemon \
    --quiet \
    --oknodo \
    --start \
    --pidfile /var/run/daemon.pid \
    --startas /usr/sbin/daemon \
    -- -c /etc/daemon.cfg -p /var/run/daemon.pid
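With only --pidfile to go on, the test reduces to roughly the following. This is a sketch assuming a Linux /proc filesystem; pidfile_is_running is a hypothetical helper name, not part of start-stop-daemon.

```shell
# Succeeds if the file $1 holds a numeric PID that currently exists.
# Nothing is checked about what program that PID is actually running.
pidfile_is_running() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    case $pid in
        ''|*[!0-9]*) return 1 ;;    # empty or non-numeric content
    esac
    [ -d "/proc/$pid" ]
}
```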

Whilst this is less reliable (in that the PID found in the pidfile could actually belong to an entirely different process), it's probably an acceptable trade-off against the risk of running multiple instances of the daemon.

This danger can be ameliorated by using some of start-stop-daemon's other matching tests, such as --user or even --name.
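For instance, adding --user makes start-stop-daemon match only processes owned by the given account. The sketch below assumes the service runs as www-data, as in the ls output earlier; the variable names, the do_start function, and the SSD override are illustrative assumptions.

```shell
# Hypothetical names; adjust them to the daemon in question.
DAEMON=/usr/sbin/daemon
CONFIG=/etc/daemon.cfg
PIDFILE=/var/run/daemon.pid
DAEMON_USER=www-data    # assumed owner of the running daemon process

# Overridable so that the command line can be inspected, e.g. with SSD=echo.
SSD=${SSD:-start-stop-daemon}

do_start() {
    # --user only narrows the match; it does not change the user
    # that the daemon is started as.
    "$SSD" \
        --quiet \
        --oknodo \
        --start \
        --pidfile "$PIDFILE" \
        --user "$DAEMON_USER" \
        --startas "$DAEMON" \
        -- -c "$CONFIG" -p "$PIDFILE"
}
```

A --name test could be added in the same way, although the right value depends on how the script is launched, so it is worth checking what name the running process actually reports first.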

Comments (3)

Even with its own limitations, --name is precisely the current best option to use in such cases instead of --exec. I've got a patch that adds native support for interpreted scripts, which I should dust off and push, so that calling code would not need to be aware of whether the program is interpreted or compiled.

July 28, 2014, 11:35 p.m.
Anonymous

You probably forgot the most important sentence:
"systemd does not suffer from these problems, and it's coming to you soon."

July 29, 2014, 6:28 p.m.
Charles Lindsey

Your example shows the use of --startas without any --exec. But current versions of start-stop-daemon (mine comes with Ubuntu 14.04) insist that --exec must be used as well. So I have written
start-stop-daemon --start $QUIET --pidfile $PIDFILE -b --exec $PERL --startas $DAEMON
following the advice in http://www.per… (which, together with yours, comes near the top of Google ranking in all --startas searches). This seems to work and I suggest you include a similar example.
But there is another problem. All this is incorporated in the customary skeleton for init.d files:
start-stop-daemon --start $QUIET --pidfile $PIDFILE -b --exec $PERL --startas $DAEMON \
|| return 1
start-stop-daemon --start $QUIET --pidfile $PIDFILE -b \
--exec $PERL --startas $DAEMON --$DAEMON_ARGS \
|| return 2
which I still do not fully understand. The second start-stop-daemon should fail (daemon already running), but it doesn't. Maybe my daemon is slow in starting up, because I find that a sleep between the two calls solves the problem, though it still gives that return 2, which is nonsensical.
And whatever you do, do not try to use --startas in the stop section of your init.d script, because that will kill any other instance of $PERL that may happen to be running at the time.

July 6, 2015, 3:17 p.m.
You should never need a "sleep" call; something is broken, somewhere.