2012-11-15

How to start and kill a Unix process tree

This blog post explains how to start a (sub)process on Unix, and later kill it so that all its children and other descendants are also killed. The sample implementation is written in Python 2.x (≥ 2.4), but the technique works equally well in other programming languages. It has been tested on Linux, but it should also work on other Unix systems such as *BSD and Mac OS X.

For example, the following doesn't work as expected:

import os, signal, subprocess, time
p = subprocess.Popen('cat /dev/zero | wc', shell=True)
time.sleep(.3)
os.kill(p.pid, signal.SIGTERM)

The intention here is to kill the long-running command after 0.3 seconds, but this kills only the shell: both cat /dev/zero (the long-running pipe source) and wc continue running, both of them consuming 100% CPU until killed manually. The trick for killing the shell and all descendant processes is to put them into a process group first, and to kill all processes in the process group in a single step. Doing so is very easy:

import os, signal, subprocess, time
p = subprocess.Popen('sleep  53 & sleep  54 | sleep  56',
                     shell=True, preexec_fn=os.setpgrp)
time.sleep(.3)
print 'killing'
os.kill(-p.pid, signal.SIGTERM)
p.wait()

By specifying preexec_fn=os.setpgrp we ask the child to call the setpgrp() system call (after fork() but before the exec*() call that starts the shell), which will create a new process group and put the child inside it. The ID of the process group (PGID) is the same as the process ID (PID) of the child. All descendant processes of the child inherit this process group, unless they explicitly move themselves to another one. By specifying a negative PID to os.kill (i.e. the kill(...) system call), we ask for the signal to be sent to all process group members in a single step. This will effectively kill all 3 sleep processes.
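For readers on Python 3, here is a minimal sketch of the same idea. The helper names start_in_new_group and kill_group are hypothetical (they are not from the sample above), and os.killpg(pgid, sig) is used as the standard-library equivalent of os.kill(-pgid, sig).

```python
import os
import signal
import subprocess


def start_in_new_group(command):
    """Start a shell command as the leader of a new process group."""
    # preexec_fn runs in the child after fork() and before exec().
    return subprocess.Popen(command, shell=True, preexec_fn=os.setpgrp)


def kill_group(p, sig=signal.SIGTERM):
    """Signal every member of p's process group, then reap the leader."""
    # The PGID equals the leader's PID, so p.pid names the whole group.
    os.killpg(p.pid, sig)
    return p.wait()
```

Right after start_in_new_group(...) returns, os.getpgid(p.pid) == p.pid holds, which is a quick way to check that the new group was really created.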

Of course, if some of the descendant processes ignore or handle the TERM signal, they won't get killed. We could have sent the KILL signal to forcibly kill all process group members. Sending the INT signal is similar to the TERM signal, except that Bash explicitly sets up ignoring the INT signal for processes started in the background (sleep 53 in the example above), so those processes won't get killed unless they restore the signal handler to the default (by calling the system call signal(SIGINT, SIG_DFL)). sleep doesn't do that, so sleep 53 wouldn't get killed if the INT signal were sent.
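One common way to combine the two signals is to send TERM first and fall back to KILL after a grace period. The following is a sketch under assumptions, not part of the original sample: it uses Python 3 (for the timeout argument of wait), and the function name terminate_group and its grace parameter are made up for illustration.

```python
import os
import signal
import subprocess


def terminate_group(p, grace=2.0):
    """Send TERM to p's process group, then KILL whatever is left."""
    pgid = p.pid  # valid because p was started with preexec_fn=os.setpgrp
    os.killpg(pgid, signal.SIGTERM)
    try:
        # Give the group leader up to `grace` seconds to exit.
        p.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        pass
    try:
        # Members that ignore or handle TERM cannot ignore KILL.
        os.killpg(pgid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # The group has no members left.
    return p.wait()
```

The grace period is measured on the group leader only. Descendants that outlive the leader still receive the KILL signal, because the process group ID stays valid as long as the group has at least one member.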

Using shell=True above is not essential: the idea of setting up a process group and killing all process group members in a single step works with shell=False as well.
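As an illustration of the shell=False case, here is a small Python 3 sketch. It is a variant rather than the exact method above: instead of preexec_fn=os.setpgrp it passes start_new_session=True, which makes the child call setsid() and thereby become the leader of a new session as well as of a new process group. The function names and the worker one-liner are hypothetical.

```python
import os
import signal
import subprocess
import sys
import time


def start_without_shell(argv):
    # start_new_session=True makes the child call setsid() before exec(),
    # so the child leads a new session and a new process group.
    return subprocess.Popen(argv, start_new_session=True)


def demo():
    # Hypothetical worker: a Python child that spawns a grandchild itself.
    worker = 'import subprocess; subprocess.call(["sleep", "30"])'
    p = start_without_shell([sys.executable, '-c', worker])
    time.sleep(0.5)  # Let the worker start its grandchild.
    os.killpg(p.pid, signal.SIGTERM)  # Signals the worker and the sleep.
    return p.wait()
```

A new session also detaches the child from the controlling terminal, which may or may not be desirable; os.setpgrp keeps the child in the current session.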

See The Linux kernel: Processes for a ground-up lesson about processes, process groups, sessions and controlling terminals on Linux. This is a must-read to gain basic understanding, because the relevant system call man pages (e.g. man 2 setpgrp) don't define or describe these basic concepts.

The same idea works through SSH, with a few modifications. There is no need for preexec_fn=os.setpgrp, because sshd sets up a new process group for each new connection. We need to do extra work though to get the remote PID, because p.pid would just return the local PID of the short-lived first ssh process. Here is a working example:

import subprocess, time
ssh_target = '127.0.0.1'
command = 'sleep  53 & sleep  54 | sleep  56'
p = subprocess.Popen(('ssh', '-T', '--', ssh_target,
                      'echo $$; exec >/dev/null; (%s\n)&' % command),
                     stdout=subprocess.PIPE)
pid = p.communicate()[0]  # Read stdout (containing the remote PID).
assert not p.wait()
pid = int(pid)
time.sleep(.3)
print 'killing'
assert not subprocess.call(('ssh', '-T', '--', ssh_target,
                            'kill -TERM %d' % -pid))

In the SSH example above, kill is usually the built-in kill command of Bash, but it works equally well if it's the /bin/kill external program. The remote PID (same as the remote PGID) is written by echo $$.

The exec >/dev/null command is needed to redirect stdout. Without it the ssh client would wait for an EOF on stdout before closing the connection. This can be useful in some cases, but in our design p.wait() is called early, and the first SSH connection is expected to close early, not waiting for the command to finish.
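The two SSH invocations above can be factored into small helpers, which makes the remote command strings easier to inspect without a server at hand. This is a Python 3 sketch; the helper names are hypothetical, and the command strings are the same ones as in the example above.

```python
import subprocess


def remote_start_argv(ssh_target, command):
    """Build the ssh command line that starts `command` in the background."""
    # echo $$ prints the remote shell's PID (same as the remote PGID);
    # exec >/dev/null lets the ssh client close the connection early.
    remote = 'echo $$; exec >/dev/null; (%s\n)&' % command
    return ['ssh', '-T', '--', ssh_target, remote]


def remote_kill_argv(ssh_target, pgid, signame='TERM'):
    """Build the ssh command line that signals the remote process group."""
    return ['ssh', '-T', '--', ssh_target,
            'kill -%s %d' % (signame, -pgid)]


def start_remote(ssh_target, command):
    """Start `command` remotely and return the remote PGID as an int."""
    out = subprocess.check_output(remote_start_argv(ssh_target, command))
    return int(out)  # int() accepts bytes such as b'12345\n'.


def kill_remote(ssh_target, pgid, signame='TERM'):
    """Signal the remote process group; returns the exit status of ssh."""
    return subprocess.call(remote_kill_argv(ssh_target, pgid, signame))
```

For example, remote_kill_argv('127.0.0.1', 1234) ends with the remote command 'kill -TERM -1234'.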

2012-11-10

How to prevent GNU Screen from switching screens upon termination and detach

This blog post explains how to prevent GNU Screen from restoring the original terminal window contents upon a termination or a detach. The original terminal window contents are what the terminal window looked like before attaching. This is useful if a short-running command is running in Screen and you are interested in its final output.

Run this once:

(echo; echo 'termcapinfo * ti:te') >>~/.screenrc

Screens attached after running this won't restore the original terminal window contents.
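The shell command above appends the directive every time it is run. If the setup is scripted, an idempotent variant may be preferable. Here is a minimal Python 3 sketch; the function name ensure_line is made up for illustration.

```python
import os


def ensure_line(path, line):
    """Append `line` to the file at `path` unless it is already there.

    Returns True if the file was modified, False otherwise.
    """
    try:
        with open(path) as f:
            content = f.read()
    except FileNotFoundError:
        content = ''
    if line in content.splitlines():
        return False
    with open(path, 'a') as f:
        # Start on a fresh line if the file lacks a trailing newline.
        if content and not content.endswith('\n'):
            f.write('\n')
        f.write(line + '\n')
    return True


# Usage (modifies the real configuration file, so it is left commented out):
# ensure_line(os.path.expanduser('~/.screenrc'), 'termcapinfo * ti:te')
```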

Explanation: ti is the termcap equivalent of the terminfo entry smcup, and te is the termcap equivalent of the terminfo entry rmcup. They describe how to switch to the alternate screen and back. Setting them to empty will prevent Screen from switching to the alternate screen upon attach, and switching back to the normal screen upon termination and detach. Read more about these directives on http://chenyufei.info/blog/2011-12-15/prevent-vim-less-clear-screen-on-exit/.

Unfortunately, Screen still moves the cursor to the last line, and writes the [screen is terminating] message by scrolling down to the next line. This can't be disabled: it's hardwired to the GotoPos(0, D_height - 1); call in the FinitTerm() function in display.c. Printing many newlines when starting the Screen session improves the situation a bit, because it moves the cursor down.

Bitwise tricks in Java

This blog post shows some tricky uses of bitwise operators in Java, mostly to compute some functions quickly, processing many bits at a time in a parallel way.

See also Bitwise tricks in C for the same functions in C and C++.

See more tricks like this in the GNU Classpath source of java.lang.Integer and java.lang.Long. Check the methods bitCount, rotateLeft, rotateRight, highestOneBit, numberOfLeadingZeros, lowestOneBit, numberOfTrailingZeros, reverse.

See even better tricks in Chapter 2 of Hacker's Delight, in Bit Twiddling Hacks and in HAKMEM.

public class BitTricks {
  public static boolean is_power_of_two(int x) {
    return (x & (x - 1)) == 0;
  }

  public static boolean is_power_of_two(long x) {
    return (x & (x - 1)) == 0;
  }

  public static int bitcount(int x) {
    x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F);
    x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF);
    x = (x & 0x0000FFFF) + (x >>> 16);
    return x;
  }

  public static int bitcount(long x) {
    x = (x & 0x5555555555555555L) + ((x >>  1) & 0x5555555555555555L);
    x = (x & 0x3333333333333333L) + ((x >>  2) & 0x3333333333333333L);
    x = (x & 0x0F0F0F0F0F0F0F0FL) + ((x >>  4) & 0x0F0F0F0F0F0F0F0FL);
    x = (x & 0x00FF00FF00FF00FFL) + ((x >>  8) & 0x00FF00FF00FF00FFL);
    x = (x & 0x0000FFFF0000FFFFL) + ((x >> 16) & 0x0000FFFF0000FFFFL);
    return (int)x + (int)(x >>> 32);
  }

  public static int reverse(int x) {
    x = (x & 0x55555555) <<  1 | ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) <<  2 | ((x >> 2) & 0x33333333);
    x = (x & 0x0F0F0F0F) <<  4 | ((x >> 4) & 0x0F0F0F0F);
    x = (x & 0x00FF00FF) <<  8 | ((x >> 8) & 0x00FF00FF);
    x =                x << 16 | x >>> 16;
    return x;
  }

  public static long reverse(long x) {
    x = (x & 0x5555555555555555L) <<  1 | ((x >>  1) & 0x5555555555555555L);
    x = (x & 0x3333333333333333L) <<  2 | ((x >>  2) & 0x3333333333333333L);
    x = (x & 0x0F0F0F0F0F0F0F0FL) <<  4 | ((x >>  4) & 0x0F0F0F0F0F0F0F0FL);
    x = (x & 0x00FF00FF00FF00FFL) <<  8 | ((x >>  8) & 0x00FF00FF00FF00FFL);
    x = (x & 0x0000FFFF0000FFFFL) << 16 | ((x >> 16) & 0x0000FFFF0000FFFFL);
    x =                x << 32 | x >>> 32;
    return x;
  }

  public static int floor_log2(int x) {
    x |= x >>> 1;
    x |= x >>> 2;
    x |= x >>> 4;
    x |= x >>> 8;
    x |= x >>> 16;
    return bitcount(x) - 1;
  }

  public static int floor_log2(long x) {
    x |= x >>> 1;
    x |= x >>> 2;
    x |= x >>> 4;
    x |= x >>> 8;
    x |= x >>> 16;
    x |= x >>> 32;
    return bitcount(x) - 1;
  }
}
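To see why the masks work, it helps to cross-check the parallel versions against slow, obviously correct ones. The sketch below does that in Python for the 32-bit bitcount and reverse. Python integers are unbounded, so the value is masked to 32 bits first; the function names and the naive reference implementations are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def bitcount32(x):
    x &= MASK32
    x = (x & 0x55555555) + ((x >> 1) & 0x55555555)  # 16 sums of 2 bits
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)  # 8 sums of 4 bits
    x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F)  # 4 sums of 8 bits
    x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF)  # 2 sums of 16 bits
    return (x & 0x0000FFFF) + (x >> 16)             # 1 sum of 32 bits


def reverse32(x):
    x &= MASK32
    x = ((x & 0x55555555) << 1) | ((x >> 1) & 0x55555555)  # swap bits
    x = ((x & 0x33333333) << 2) | ((x >> 2) & 0x33333333)  # swap pairs
    x = ((x & 0x0F0F0F0F) << 4) | ((x >> 4) & 0x0F0F0F0F)  # swap nibbles
    x = ((x & 0x00FF00FF) << 8) | ((x >> 8) & 0x00FF00FF)  # swap bytes
    return ((x << 16) | (x >> 16)) & MASK32                # swap halves


def naive_bitcount32(x):
    # Count the '1' characters in the binary representation.
    return bin(x & MASK32).count('1')


def naive_reverse32(x):
    # Reverse the 32-character binary string.
    return int(format(x & MASK32, '032b')[::-1], 2)
```

Each line of bitcount32 adds neighboring fields of half the final width, so 5 steps are enough for 32 bits; reverse32 swaps neighboring fields in the same pattern.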

Bitwise tricks in C

This blog post shows some tricky uses of bitwise operators in C, mostly to compute some functions quickly, processing many bits at a time in a parallel way.

See also Bitwise tricks in Java for the same functions in Java.

See more tricks like this in the GNU Classpath source of java.lang.Integer and java.lang.Long. Check the methods bitCount, rotateLeft, rotateRight, highestOneBit, numberOfLeadingZeros, lowestOneBit, numberOfTrailingZeros, reverse.

See even better tricks in Chapter 2 of Hacker's Delight, in Bit Twiddling Hacks and in HAKMEM.

#include <stdint.h>

char is_power_of_two_32(uint32_t x) {
  return !(x & (x - 1));
}

char is_power_of_two_64(uint64_t x) {
  return !(x & (x - 1));
}

int bitcount_32(uint32_t x) {
  x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
  x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F);
  x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF);
  x = (x & 0x0000FFFF) + (x >> 16);
  return x;
}

int bitcount_64(uint64_t x) {
  x = (x & 0x5555555555555555LL) + ((x >>  1) & 0x5555555555555555LL);
  x = (x & 0x3333333333333333LL) + ((x >>  2) & 0x3333333333333333LL);
  x = (x & 0x0F0F0F0F0F0F0F0FLL) + ((x >>  4) & 0x0F0F0F0F0F0F0F0FLL);
  x = (x & 0x00FF00FF00FF00FFLL) + ((x >>  8) & 0x00FF00FF00FF00FFLL);
  x = (x & 0x0000FFFF0000FFFFLL) + ((x >> 16) & 0x0000FFFF0000FFFFLL);
  x = (x & 0x00000000FFFFFFFFLL) + (x >> 32);
  return x;
}

uint32_t reverse_32(uint32_t x) {
  x = (x & 0x55555555) <<  1 | ((x >> 1) & 0x55555555);
  x = (x & 0x33333333) <<  2 | ((x >> 2) & 0x33333333);
  x = (x & 0x0F0F0F0F) <<  4 | ((x >> 4) & 0x0F0F0F0F);
  x = (x & 0x00FF00FF) <<  8 | ((x >> 8) & 0x00FF00FF);
  x =                x << 16 | x >> 16;
  return x;
}

uint64_t reverse_64(uint64_t x) {
  x = (x & 0x5555555555555555LL) <<  1 | ((x >>  1) & 0x5555555555555555LL);
  x = (x & 0x3333333333333333LL) <<  2 | ((x >>  2) & 0x3333333333333333LL);
  x = (x & 0x0F0F0F0F0F0F0F0FLL) <<  4 | ((x >>  4) & 0x0F0F0F0F0F0F0F0FLL);
  x = (x & 0x00FF00FF00FF00FFLL) <<  8 | ((x >>  8) & 0x00FF00FF00FF00FFLL);
  x = (x & 0x0000FFFF0000FFFFLL) << 16 | ((x >> 16) & 0x0000FFFF0000FFFFLL);
  x =                x << 32 | x >> 32;
  return x;
}

int floor_log2_32(uint32_t x) {
  x |= x >> 1;
  x |= x >> 2;
  x |= x >> 4;
  x |= x >> 8;
  x |= x >> 16;
  return bitcount_32(x) - 1;
}

int floor_log2_64(uint64_t x) {
  x |= x >> 1;
  x |= x >> 2;
  x |= x >> 4;
  x |= x >> 8;
  x |= x >> 16;
  x |= x >> 32;
  return bitcount_64(x) - 1;
}
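Two of these functions deserve a closer look. x & (x - 1) clears the lowest set bit of x, so the result is zero exactly when x has at most one bit set, which means the test as posted also accepts 0. The floor_log2 functions first copy the highest set bit into every lower position, then count the bits. The Python sketch below illustrates both points for values in the range 0 ≤ x < 2³²; the function names, including the stricter variant that rejects 0, are assumptions and not part of the code above.

```python
def is_power_of_two_as_posted(x):
    # x & (x - 1) clears the lowest set bit of x.
    return (x & (x - 1)) == 0


def is_power_of_two_strict(x):
    # Hypothetical variant: same test, but 0 is rejected explicitly.
    return x != 0 and (x & (x - 1)) == 0


def smear_right_32(x):
    # After these steps every bit below the highest set bit is also set.
    x |= x >> 1
    x |= x >> 2
    x |= x >> 4
    x |= x >> 8
    x |= x >> 16
    return x


def floor_log2_32(x):
    # A smeared value has exactly floor(log2(x)) + 1 bits set.
    return bin(smear_right_32(x)).count('1') - 1
```

For x = 0 the floor_log2 functions return -1, which callers can use as a marker for "no bit set".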

2012-11-04

How to use Unicode accented characters in PuTTY in UTF-8 mode

This blog post explains how to set up PuTTY to connect to a Linux server in UTF-8 mode, so that all Unicode characters (including symbols and accented characters) will be transferred and interpreted correctly.

Follow these steps:

  • Download and install the newest PuTTY (0.62 or later).
  • Before connecting, configure PuTTY like this:
    • Window → Translation → Remote character set: UTF-8
    • Connection → Data → Environment variables: add Variable: LC_CTYPE, Value: en_US.UTF-8 , and click on Add.
    • Save these settings in Session (e.g. in Session → Default Settings → Save).
  • Connect to the SSH server with PuTTY.
  • If everything (including typing, reading and copy-pasting non-ASCII Unicode characters) already works, stop here.
  • Run the following commands without the leading $ (you will need the root password). These commands set up the UTF-8 en_US locale on the server. The commands have been verified and found working on Debian and Ubuntu servers.
    $ if test -f /etc/locale.gen; then sudo perl -pi -e 's@^# *(en_US.UTF-8 UTF-8)$@$1@' /etc/locale.gen; grep -qxF 'en_US.UTF-8 UTF-8' /etc/locale.gen || (echo; echo 'en_US.UTF-8 UTF-8') | sudo tee -a /etc/locale.gen >/dev/null; fi
    $ if test -d /var/lib/locales/supported.d; then grep -qxF 'en_US.UTF-8 UTF-8' /var/lib/locales/supported.d/en 2>/dev/null || (echo; echo 'en_US.UTF-8 UTF-8') | sudo tee -a /var/lib/locales/supported.d/en >/dev/null; fi
    $ sudo perl -pi -e 's@^(?=LC_CTYPE=|LC_ALL=)@#@' /etc/environment
    $ sudo /usr/sbin/locale-gen
    $ sudo /usr/sbin/update-locale LC_CTYPE LC_ALL
    
  • Close the SSH connection window and open a new connection.
  • Verify that everything (including typing, reading and copy-pasting non-ASCII Unicode characters) works.

Contrary to the information found elsewhere on the net, just setting Window → Translation → Received data assumed to be in which character set or Window → Translation → Remote character set to UTF-8 is not always enough. Setting the server-side environment variables (e.g. setting LC_CTYPE to en_US.UTF-8 above) properly is also required (unless they are already correct). Generating the UTF-8 locale definitions (using locale-gen) on the server is also required (unless they are already generated).
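To check which locale a server-side program will actually use for character handling, recall the POSIX precedence of the environment variables: LC_ALL overrides LC_CTYPE, which in turn overrides LANG. This is also why a leftover LC_ALL on the server can defeat the LC_CTYPE value sent by PuTTY. The following Python 3 sketch applies that rule to a dictionary of environment variables; the function names are hypothetical, and falling back to 'C' when nothing is set is an assumption that matches typical Linux systems.

```python
import os


def effective_ctype(environ):
    """Return the locale name that governs LC_CTYPE for `environ`."""
    # POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
    # An empty value counts as unset.
    for name in ('LC_ALL', 'LC_CTYPE', 'LANG'):
        value = environ.get(name)
        if value:
            return value
    return 'C'  # Assumed default when none of the variables is set.


def is_utf8_locale(locale_name):
    """Tell whether a locale name such as 'en_US.UTF-8' selects UTF-8."""
    # 'en_US.UTF-8@euro' -> codeset 'UTF-8'
    codeset = locale_name.partition('.')[2].partition('@')[0]
    return codeset.lower().replace('-', '') == 'utf8'
```

Running is_utf8_locale(effective_ctype(os.environ)) in the SSH session gives a quick first check; it looks only at the variables and does not verify that the locale definitions have been generated.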