2012-11-15

How to start and kill a Unix process tree

This blog post explains how to start a (sub)process on Unix and later kill it together with all its children and other descendants. The sample implementation is written in Python 2.x (≥ 2.4), but the technique works equally well in other programming languages. It has been tested on Linux, but it should also work on other Unix systems such as *BSD and Mac OS X.

For example, the following doesn't work as expected:

import os, signal, subprocess, time
p = subprocess.Popen('cat /dev/zero | wc', shell=True)
time.sleep(.3)
os.kill(p.pid, signal.SIGTERM)

The intention here is to kill the long-running command after 0.3 seconds, but this kills only the shell: both cat /dev/zero (the long-running pipe source) and wc continue running, consuming 100% CPU until killed manually. The trick for killing the shell and all descendant processes is to put them into a process group first, and then to kill all processes in the process group in a single step. Doing so is easy:

import os, signal, subprocess, time
p = subprocess.Popen('sleep  53 & sleep  54 | sleep  56',
                     shell=True, preexec_fn=os.setpgrp)
time.sleep(.3)
print 'killing'
os.kill(-p.pid, signal.SIGTERM)
p.wait()

By specifying preexec_fn=os.setpgrp we ask the child to call the setpgrp() system call (after fork() but before exec...(...)ing the shell), which creates a new process group and puts the child inside it. The ID of the process group (PGID) is the same as the process ID (PID) of the child. All descendant processes of the child inherit this process group, unless they explicitly move to another one. By passing a negative PID to os.kill (i.e. the kill(...) system call), we ask for the signal to be sent to all process group members in a single step. This effectively kills the shell and all 3 sleep processes.
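To make the PID and PGID relationship concrete, here is a minimal sketch written for Python 3 (unlike the Python 2 samples in this post). The helper name start_in_new_group is made up for this illustration. It uses os.killpg, which sends the signal to a process group just like os.kill with a negative PID.

```python
import os
import signal
import subprocess
import time


def start_in_new_group(command):
    """Start a shell command as the leader of a new process group.

    Hypothetical helper for this illustration.
    """
    # os.setpgrp runs in the child after fork() and before exec().
    return subprocess.Popen(command, shell=True, preexec_fn=os.setpgrp)


p = start_in_new_group('sleep 53 & sleep 54 | sleep 56')
time.sleep(.3)  # Give the shell time to start its children.
pgid = os.getpgid(p.pid)  # PGID of the child; expected to equal p.pid.
own_pgid = os.getpgrp()   # Our own group, which must not be signaled.
os.killpg(pgid, signal.SIGTERM)  # Same effect as os.kill(-pgid, ...).
returncode = p.wait()  # Negative signal number if killed by a signal.
```

Since the child leads its own group, the script itself is not a member and survives the signal.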

Of course, if some of the descendant processes ignore or handle the TERM signal, they won't get killed. We could send the KILL signal instead to forcibly kill all process group members. Sending the INT signal is similar to sending the TERM signal, except that Bash explicitly sets the INT signal to be ignored for processes started in the background (sleep 53 in the example above), so those processes won't get killed unless they restore the default signal disposition (by calling signal(SIGINT, SIG_DFL)). sleep doesn't do that, so sleep 53 wouldn't get killed if the INT signal were sent.
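One possible way to combine the two signals is to send TERM first and fall back to KILL after a grace period. The following is only a sketch for Python 3; the function name kill_group and the grace parameter are made up for this illustration, and it assumes that the process was started with preexec_fn=os.setpgrp.

```python
import os
import signal
import subprocess
import time


def kill_group(p, grace=1.0):
    """Send TERM to p's process group, then KILL after a grace period.

    Assumes p was started with preexec_fn=os.setpgrp, so p.pid is also
    the PGID. Only the group leader is watched during the grace period.
    """
    os.killpg(p.pid, signal.SIGTERM)
    deadline = time.time() + grace
    while p.poll() is None and time.time() < deadline:
        time.sleep(.05)
    try:
        # KILL cannot be caught or ignored, so it also removes members
        # which ignored TERM.
        os.killpg(p.pid, signal.SIGKILL)
    except OSError:
        pass  # No group member was left to signal.
    return p.wait()


# The shell ignores TERM, and sleep inherits that setting across exec.
p = subprocess.Popen('trap "" TERM; echo ready; sleep 60', shell=True,
                     stdout=subprocess.PIPE, preexec_fn=os.setpgrp)
p.stdout.readline()  # Wait until the trap is installed.
status = kill_group(p, grace=.5)
p.stdout.close()
```

The KILL signal is sent even if the group leader has already exited, so that members which ignored TERM don't linger.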

Using shell=True above is not essential: the idea of setting up a process group and killing all process group members in a single step works with shell=False as well.
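As an illustration of the shell=False case, here is a sketch for Python 3 in which the child is a small Python program (made up for this example) that starts a grandchild. The pipe is used only for verification: reading from it returns once every process which inherited it has exited.

```python
import os
import signal
import subprocess
import sys
import time

# Hypothetical child program: starts a grandchild and waits for it.
CHILD_CODE = 'import subprocess; subprocess.call(["sleep", "60"])'

p = subprocess.Popen([sys.executable, '-c', CHILD_CODE],  # shell=False
                     stdout=subprocess.PIPE, preexec_fn=os.setpgrp)
time.sleep(.3)
os.killpg(p.pid, signal.SIGTERM)
# read() returns only at EOF, that is, once every process which
# inherited the pipe (child and grandchild) has exited.
leftover = p.stdout.read()
p.stdout.close()
returncode = p.wait()
```

If the grandchild had survived, the read() call would have blocked until sleep finished.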

See The Linux kernel: Processes for a ground-up lesson about processes, process groups, sessions and controlling terminals on Linux. This is a must-read to gain basic understanding, because the relevant system call man pages (e.g. man 2 setpgrp) don't define or describe these basic concepts.

The same idea works through SSH, with a few modifications. There is no need for preexec_fn=os.setpgrp, because sshd sets up a new process group for each new connection. We need to do extra work though to get the remote PID, because p.pid is just the local PID of the short-lived first ssh process. Here is a working example:

import subprocess, time
ssh_target = '127.0.0.1'
command = 'sleep  53 & sleep  54 | sleep  56'
p = subprocess.Popen(('ssh', '-T', '--', ssh_target,
                      'echo $$; exec >/dev/null; (%s\n)&' % command),
                     stdout=subprocess.PIPE)
pid = p.communicate()[0]  # Read stdout (containing the remote PID).
assert not p.wait()
pid = int(pid)
time.sleep(.3)
print 'killing'
assert not subprocess.call(('ssh', '-T', '--', ssh_target,
                            'kill -TERM %d' % -pid))

In the SSH example above, kill is usually the built-in kill command of Bash, but it works equally well if it's the /bin/kill external program. The remote PID (same as the remote PGID) is written by echo $$.

The exec >/dev/null command is needed to redirect stdout. Without it the ssh client would wait for an EOF on stdout before closing the connection. This can be useful in some cases, but in our design p.wait() is called early, and the first SSH connection is expected to close early, not waiting for the command to finish.
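The effect of the stdout redirection can be tried without an SSH server. The sketch below (Python 3) runs the same wrapper command in a local sh. This is only a stand-in for the remote side: preexec_fn=os.setpgrp plays the role of sshd setting up the process group, and the pipe plays the role of the SSH connection.

```python
import os
import signal
import subprocess
import time

command = 'sleep 53 & sleep 54 | sleep 56'
# Same wrapper as in the SSH example, but run by a local sh.
wrapper = 'echo $$; exec >/dev/null; (%s\n)&' % command
started = time.time()
p = subprocess.Popen(('sh', '-c', wrapper),
                     stdout=subprocess.PIPE, preexec_fn=os.setpgrp)
# EOF arrives as soon as the wrapper shell exits, because the
# background processes write to /dev/null rather than to the pipe.
pid = int(p.communicate()[0])
elapsed = time.time() - started
os.killpg(pid, signal.SIGTERM)  # Kill the background processes.
```

Without exec >/dev/null the background processes would keep the pipe open, and communicate() would block until the last sleep exited.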
