2010-11-28

Announcing uevalrun: self-contained computation sandbox for Linux

This blog post is to announce uevalrun.

uevalrun is a self-contained computation sandbox for Linux, using User-mode Linux for both compilation and execution of the program to be sandboxed. The program can be written in C, C++, Python, Ruby, Perl or PHP. uevanrun enforces memory limits, timeouts and output size limits in the sandbox. The primary use case for uevalrun is evaluation of solution programs submitted by contestants of programming contests: uevalrun compiles the solution, runs it with the test input, compares its output against the expected output, and writes a status report.

For your convenience, here is a (non-updated) copy of the documenation of uevalrun. See project home page for the most up-to-date version.

Installation

A 32-bit (x86, i386) or 64-bit (x86_64, amd64) Linux system is required with enough RAM for both compilation and execution, plus 6 MB memory overhead. The Linux distribution doesn't matter, because uevalrun uses statically linked binaries, and it's self-contained: it contains all the tools it needs, including the C and C++ compilers (from the uClibc gcc-4.1.2 toolchain).

uevalrun doesn't need root privileges: it runs as a simple user.

uevalrun needs about 160MB of disk space, most of which is consumed by the C compiler (31MB extracted + 17MB compressed), the scripting language interpreters (9MB compressed), and the virtual disk images (77MB, uncompressed). After removing all temporary files, 80MB will be enough.

Download and compile:

$ svn co http://pts-mini-gpl.googlecode.com/svn/trunk/uevalrun uevalrun
$ cd uevalrun
$ ./make  # This downloads some more files during compilation.

Try it (the italic parts are displayed by the program):

$ (echo '#! perl'; echo 'print "Hello"x2, "\n"') >foo.pl
$ echo HelloHello >expected.txt
$ ./uevalrun -M 10 -T 9 -E 9 -s foo.pl -t /dev/null -e expected.txt
...
@ result: pass
$ echo '/**/ int main() { return!printf("HelloHello\n"); }' >foo.c
$ ./uevalrun -M 10 -T 9 -E 9 -U 19 -N 32 -C 9 \
  -s foo.c -t /dev/null -e expected.txt
...
@ result: pass

How to use

This section has not been written yet. If you have questions, don't hesitate to ask!

Requirements

Security is the most important requirement of uevalrun, followed by high performance and usability.

The sandboxing must be secure, more specifically:

  • Programs inside the sandbox must not be able to communicate with the outside world, except for their stdin and stdout. So they don't have access to the filesystem (except for some read-only access to some whitelisted system libraries and binaries), and they can't do network I/O. If uevalrun is used for programming contest submission evaluation, this restriction prevents the program from finding and reading the file containing the expected output.
  • Sandboxed programs must not be able to use more system resources (memory, disk, CPU) than what was allocated for them.
  • Sandboxed programs must not be running for a longer period of time than what was allocated.
  • Even the compilation of programs to be run (executed) inside the sandbox must be sandboxed (possibly with different system resource allocation), to prevent the attacker from submitting a program source code which exploits a bug in the compiler.
  • The sandbox must be reverted to its initial state before each compilation and execution, so sandboxed programs won't be able to gather information about previous compilations and executions. For programming contests, this restriction prevents the program from reading other contestants' submissions or their output.
  • Timeouts (both real time and user time) must be enforced outside the sandbox, so the program will be reliably killed if it runs for too long time, even if it tampers with time measurement inside the sandbox.
  • Sanboxed programs must not be able to lie about their success, their performance or the correctness of their output by writing special marker characters to their stdout or stderr.

The sandbox must be fast and waste few resources, more specifically:

  • Reverting the sandbox to its initial, empty state must be very fast (preferably faster than 1 second).
  • Starting a program inside the sandbox must be very fast (preferably faster than 1 second).
  • Running a program inside the sandbox must not be much slower than running the same program outside the sandbox. If the program is CPU-intensive, it may run 1.1 times slower (i.e. 10% slower) inside than outside. If the program is I/O-intensive, it may run 10 times slower inside than outside. This requirement is intentionally quite loose on I/O virtualization performance.
  • Running a program inside the sandbox must not need more than 10MB more memory than running the same program outside.
  • Running a program inside the sandbox must not need any disk space, except for the disk space needed by the program binary, the test input and the expected output, all stored as files outside the sandbox.
  • Compilation inside the sandbox must require only a reasonable amount of temporary disk space (for the file to be compiled, the temporary files, and the output binary).

The sandbox must be easy to use and versatile, more specifically:

  • Sandboxed programs must be able to accept any 8-bit binary input (stdin).
  • Sandboxed programs must be able to write any 8-bit binary output to their stdout.
  • Multiple sandboxed programs must be able to run at the same time on the same host system, without affecting each other.
  • If a sandboxed program fails (e.g. because it writes to its stdout different from what was expected, or it does a memory access violation, it reaches a timeout, it exceeds its allocated memory etc.), the proper failure reason has to be reported (e.g. ``wrong answer'' must not be reported if the program times out or vice versa).
  • The sandboxing software must have as little few system dependencies as possible. It must be able to run in a restricted (e.g. chroot) environment, it must not depend on system libraries or configuration. It must work on 32-bit (x86, i386) and 64-bit (x86_64, amd64) Linux systems, on any Linux distirubution.
  • Sandboxed programs can be written in C, C++, Python, Ruby, Perl and PHP, or in any language whose compiler can produce a 32-bit (i386) Linux statically linked executable binary.

Design

To fullfill the requirements, the following fundamental building blocks are used.

User-mode Linux (UML) is used for virutalization: both compilation and execution is performed in a UML guest, reading data from the host system using virtual block devices (ubd), and writing its output to its /dev/tty1 (con0), which can be read by a process on the host. A UML guest kernel (currently 2.6.31.1) tailered to the sandboxing requirements (security and performance) is configured and compiled. Some kernel patches are applied to increase security, reliability and performance. (Please note that these patches apply to the UML guest kernel only, so the host remains unpatched, and rebooting is not needed either.) Networking and the virtual file system are disabled in the UML guest kernel for increased security. All unnecessary drivers and features are removed from the UML guest kernel to get fast boot times and to reduce the overhead.

The underlying virtualization technology used by UML is ptrace(), which doesn't need root privileges or a kernel module or kernel modifications in the host. As an alternative to UML, Seccomp could be used for sendboxing, but that's quite cumbersome, because the sandboxed process cannot allocate memory for itself (see the Google Chrome Seccomp sandbox for a generic solution), but it's still prohibitively cumbersome to sandbox GCC this way (with its requirements for forking and creating temporary files). Most sandboxing approaches on Linux require root privileges for a short amount of time (for chroot, PID namespaces (see clone(2)) and other namespaces). Most virtualization approaches (like KVM, Xen, VirtualBox, except possibly for QEMU) need loading of kernel modules or patching the kernel, and support only slow guest boot times.

With UML, it's possibly to boot the UML guest, run a hello-world binary, and halt the UML guest in less than 0.02 second (sustained). For that speed, one needs a (guest) kernel patch (included and automatically applied in uevalrun) which skips the Calibrating delay loop kernel startup step, which usually takes 0.33 second. The memory overhead of UML is also quite low (6 MB with a stripped-down kernel).

All software running inside and outside UML is written in C (for high performance) and compiled with uClibc and linked statically to avoid depending on the libc on the host, and to avoid the need for installing a libc in the guest. (The total size of these custom-built binaries is less than 100kB.) There is no Linux distribution installed the guest – only a few custom binaries are copied, like a custom /sbin/init, which mounts /proc, sets up file descriptors and resource limits, and starts the sandboxed program, and then halts the guest. The /lib directory in the guest is empty, except for compilation, where /lib contains libc.a, libstdc++.a etc.

BusyBox (statically linked with uClibc) is used as a Swiss Army Knife tool for scripting and automation (of the build and the setup process), both inside and outside UML. Please note, however, that once all the binaries are built and the filesystem images are created, BusyBox is not used anymore for running sandboxed programs and evaluating their output (except for temporary filesystem creation) because of performance reasons, but work is done by the small and fast custom C tools just compiled.

Sandboxed program sources can be written in C (using a large subset of uClibc as the system library), C++ (using uClibc and libstdc++), Python (using all the built-in Python modules), Ruby (using just the built-in Ruby modules and classes which are implemented in C), Perl (using only the few dozen most common modules), or PHP (using just the built-in PHP functions which are implemented in C).

For C and C++ compilation, a GCC 4.1.2 toolchain is used which is based on uClibc and produces statically linked executables.

Interpreters for Python, Ruby, Perl and PHP are provided as precompiled, self-contained, single, statically linked, 32-bit (i386) Linux executables. It's easy to copy them to the /bin directory of the UML guest.

All tools used for building the software are either shipped with the software, or the software downloads them during compilation. All binary tools are statically linked, 32-bit (i386) Linux executables. This provides maximum host system compatibility and reproducible behavior across systems.

In our speed measurements, CPU-intensive programs running inside UML run 1% slower than outside UML, but I/O-intensive programs (such as C++ compilation with GCC) can be about 6 times slower.

The Minix filesystem is used in the UML guests for both read-only filesystems (e.g. the root filesystem with the scripting language interpreters) and read-write filesystems (such as /tmp for the temporary assembly files created by GCC). The Minix filesystem was chosen instead of ext2 and other usual Linux filesystem, because it has less size overhead, it's codebase is smaller, and it has equally good performance for the small and mostly read-only filesystems it is used for.

License

uevanrun has been released under the GNU GPL v2.

Bugs? Problems? Contact the author. Send feedback!

Please add your feedback to the issue tracker. All feedbacks are welcome.

2010-11-19

How to download Skype without having to register

Recently Skype has introduced the requirement of registration or logging in on its web page before downloading the Windows version of their Skype software. This blog post explains how to download Skype without registration or logging in.

Follow the link relevant to your operating system:

2010-11-15

Remote desktop sharing for Linux without software installation and firewall configuration

This blog post explains how to share two desktops (screen, keyboard and mouse) without software installation (on Linux) and firewall configuration. The instructions given have been verified on Ubuntu Lucid running on 32-bit (i386) and 64-bit (amd64) architecture -- but they should work on any modern Linux desktop.

TeamViewer is an excellent cross-platform application which provides desktop sharing, file transfer, chat and video conferencing, and it's free for personal use. Its use is straightforward. Installers for Linux, Mac OS X, Windows and iPhone are downloadable here.

If you don't want to install anything, you can try the QuickSupport edition of TeamViewer on Windows and the Mac OS X. (Get it from the same download page.) The QuickSupport edition doesn't let the user initiate a connection (to become a client), but it listens as a server waiting for connections.

As of now, there is no official Linux QuickSupport edition of TeamViewer, so I've created one; it downlads and starts the regular TeamViewer for Linux application, without installing it. To use it on a vanilla Ubuntu Lucid box, just press Alt-, and type (or copy-paste) the following command, and press Enter:

bash -c 'wget -O- goo.gl/k9imm | bash'

Please note that the O between the dashes is a capital O. All other characters are lowercase.

Remote desktop sharing for Linux without software installation and firewall configuration

This blog post explains how to share two desktops (screen, keyboard and mouse) without software installation (on Linux) and firewall configuration. The instructions given have been verified on Ubuntu Lucid running on 32-bit (i386) and 64-bit (amd64) architecture -- but they should work on any modern Linux desktop.

TeamViewer is an excellent cross-platform application which provides desktop sharing, file transfer, chat and video conferencing, and it's free for personal use. Its use is straightforward. Installers for Linux, Mac OS X, Windows and iPhone are downloadable here.

If you don't want to install anything, you can try the QuickSupport edition of TeamViewer on Windows and the Mac OS X. (Get it from the same download page.) The QuickSupport edition doesn't let the user initiate a connection (to become a client), but it listens as a server waiting for connections.

As of now, there is no official Linux QuickSupport edition of TeamViewer, so I've created one; it downlads and starts the regular TeamViewer for Linux application, without installing it. To use it on a vanilla Ubuntu Lucid box, just press Alt-, and type (or copy-paste) the following command, and press Enter:

bash -c 'wget -O- goo.gl/k9imm | bash'

Please note that the O between the dashes is a capital O. All other characters are lowercase.

2010-11-10

How to disable PulseAudio on Ubuntu Lucid and Oneiric without uninstalling it

This blog post explains how to disable PulseAudio on Ubuntu Lucid without uninstalling it. The solution described here may work on other Linux distributions as well. Please note that you may lose volume controls integrated to GNOME (e.g. the volume control hotkeys), and you'll have to adjust the volume with alsamixer or it's GNOME equivalent, gnome-alsamixer.

Install the volume control programs:

$ sudo apt-get install alsa-utils gnome-alsamixer
$ sudo apt-get remove gamix  # Applications / Sound & Video / ALSA Mixer

Stop the PulseAudio server (pulseaudio) and remove temporary files:

$ pkill -STOP '^pulseaudio$'
$ rm -rf ~/.pulse

Make sure all application attempts to connect to the PulseAudio server and/or to start a new PulseAudio server will fail:

$ mkdir -p ~/.pulse
$ (if grep -q \=10 /etc/lsb-release; then
     echo 'default-server = 0.0.0.1'
   else
     echo 'daemon-binary = /dev/null/no-daemon'
   fi
   echo 'autospawn = no') >~/.pulse/client.conf

Kill the PulseAudio server:

$ pkill -KILL '^pulseaudio$'

Optionally, to disable PulseAudio for all users, add the following lines to /etc/pulse/client.conf for Ubuntu Lucid:

default-server = 0.0.0.1
autospawn = no
The corresponding lines for Ubuntu Oneiric are:
daemon-binary = /dev/null/no-daemon
autospawn = no

Restart your web browsers (for the Flash Player) and your Skype, and possibly other applications which use sound. To do so, it's best to log out and log in again.

2010-11-03

How to get rid of the GNOME panel menu delay on Ubuntu Lucid

This blog post gives instructions how to get rid of the GNOME panel menu delay, i.e. the small delay one experiences when moving the mouse pointer over the System menu, and the submenus like Preferences and Administration don't open immediately. The solution shown has been tested on Ubuntu Lucid, but it might work on other systems as well.

Run the following command to configure newly started applications to omit the delay:

$ echo "gtk-menu-popup-delay = 0" >> ~/.gtkrc-2.0

Run this command to restart the GNOME panel currently running:

$ killall gnome-panel
If you are interested speeding up the loading of icons, see more information on http://www.ubuntugeek.com/how-to-make-gnome-menus-faster-in-ubuntu.html.