start-stop-daemon is the classic tool on Debian and derived distributions for managing system background processes. A typical invocation from an initscript is as follows:
    start-stop-daemon \
      --quiet \
      --oknodo \
      --start \
      --pidfile /var/run/daemon.pid \
      --exec /usr/sbin/daemon \
      -- -c /etc/daemon.cfg -p /var/run/daemon.pid
The basic operation is that it first checks whether /usr/sbin/daemon is already running and, if it is not, executes /usr/sbin/daemon -c /etc/daemon.cfg -p /var/run/daemon.pid. This process then has the responsibility to daemonise itself and write the resulting process ID to /var/run/daemon.pid.
start-stop-daemon then waits until /var/run/daemon.pid has been created as the test of whether the service has actually started, raising an error if that doesn't happen.
(In practice, the locations of all these files are parameterised to prevent DRY violations.)
By idempotence we mostly mean that repeated calls to /etc/init.d/daemon start must not start multiple instances of our daemon.
This might not seem to be a particularly big issue at first, but the increased adoption of stateless configuration management tools such as Ansible (which should be completely free to call start to ensure a started state) means that one should be particularly careful of this apparent corner case.
In its usual operation, start-stop-daemon uses the --exec parameter to ensure that only one instance of the daemon is running: if the specified pidfile exists and the PID it refers to is an "instance" of that executable, it assumes the daemon is already running and does not start another copy. This is handled in the pid_is_exec function (source): the /proc/$PID/exe symlink is resolved and checked against the value of --exec.
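To make that check concrete, here is a rough shell approximation of it. This is only a sketch of the idea, not the actual C implementation: the function name pidfile_matches_exec is hypothetical, and it assumes a Linux /proc filesystem and ignores edge cases such as a deleted executable.

```shell
#!/bin/sh
# Hypothetical helper (not part of start-stop-daemon): succeeds only if the
# PID recorded in the pidfile is alive and its /proc/PID/exe symlink points
# at the expected executable.
pidfile_matches_exec() {
    pidfile=$1
    expected=$2

    # No readable pidfile means nothing is recorded as running.
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")

    # Reject empty or non-numeric pidfile contents.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # readlink fails if the process has gone away or cannot be inspected.
    actual=$(readlink "/proc/$pid/exe") || return 1

    [ "$actual" = "$expected" ]
}
```

For example, pidfile_matches_exec /var/run/daemon.pid /usr/sbin/daemon would succeed only while a process started from that exact binary is still running under the recorded PID.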
However, one case where this doesn't work is interpreted scripts. Let's look at what happens if /usr/sbin/daemon is such a script, e.g. a file that starts:
    #!/usr/bin/env python
    # [..]
The problem this introduces is that /proc/$PID/exe now points to the interpreter instead, often with an essentially non-deterministic version suffix:
    $ ls -l /proc/14494/exe
    lrwxrwxrwx 1 www-data www-data 0 Jul 25 15:18 /proc/14494/exe -> /usr/bin/python2.7
When this process is examined using the --exec mechanism outlined above it will be rejected as an instance of /usr/sbin/daemon and therefore another instance of that daemon will be incorrectly started.
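The mismatch is easy to reproduce without a real daemon. The sketch below uses a throwaway /bin/sh script as a stand-in for /usr/sbin/daemon (an assumption made purely so the example is self-contained; the same applies to a Python script) and compares its path with what /proc reports. It assumes Linux and a temporary directory that allows executing files.

```shell
#!/bin/sh
# Create a throwaway interpreted "daemon" (a stand-in for /usr/sbin/daemon).
script=$(mktemp)
printf '#!/bin/sh\nsleep 5\n' > "$script"
chmod +x "$script"

# Run it in the background and give the kernel a moment to exec the interpreter.
"$script" &
pid=$!
sleep 1

# This is the value that an --exec style check would compare against.
exe=$(readlink "/proc/$pid/exe")
echo "script path: $script"
echo "exe symlink: $exe"

if [ "$exe" = "$script" ]; then
    echo "match: an --exec check would recognise the running script"
else
    echo "mismatch: an --exec check would not recognise the running script"
fi

# Clean up the stand-in process and file.
kill "$pid" 2>/dev/null
rm -f "$script"
```

The symlink should resolve to whatever binary /bin/sh is on the system rather than to the script itself, which is why the comparison fails.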
The solution is to use the --startas parameter instead. This omits the /proc/$PID/exe check and merely tests whether a PID with that number is running:
    start-stop-daemon \
      --quiet \
      --oknodo \
      --start \
      --pidfile /var/run/daemon.pid \
      --startas /usr/sbin/daemon \
      -- -c /etc/daemon.cfg -p /var/run/daemon.pid
Whilst this is therefore less reliable (the PID found in the pidfile could actually belong to an entirely different process), it's probably an acceptable trade-off against the risk of running multiple instances of the daemon.
This danger can be ameliorated by using some of start-stop-daemon's other matching tests, such as --user or even --name.
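As an illustration, an initscript fragment combining --startas with --user might look something like the following. The variable names and the start_daemon function are hypothetical, and www-data is only an assumed account name borrowed from the ls output earlier; adjust them to whatever your daemon actually uses.

```shell
#!/bin/sh
# Paths are taken from the examples above; DAEMON_USER is an assumed account.
DAEMON=/usr/sbin/daemon
PIDFILE=/var/run/daemon.pid
CONFIG=/etc/daemon.cfg
DAEMON_USER=www-data

# Hypothetical wrapper, as might be called from the "start" branch of an initscript.
start_daemon() {
    # --startas skips the /proc/$PID/exe comparison; --user additionally
    # requires the process named in the pidfile to be owned by $DAEMON_USER
    # before it is treated as "already running".
    start-stop-daemon \
        --quiet \
        --oknodo \
        --start \
        --pidfile "$PIDFILE" \
        --user "$DAEMON_USER" \
        --startas "$DAEMON" \
        -- -c "$CONFIG" -p "$PIDFILE"
}
```

Note that --user here only narrows the matching; if the daemon does not drop privileges itself, something like --chuid would also be needed so that it really does run as that user.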