pts.blog: How to make smaller C and C++ binaries

This blog post presents several techniques to make the binaries resulting from C or C++ compilation smaller with GCC (or Clang). Please note that almost all techniques are tradeoffs, i.e. a smaller binary can be slower and harder to debug. So don't use the techniques blindly before understanding the tradeoffs.

The recommended GCC (and Clang) flags:

Use -s to strip the symbol table and debug info from the binary (and don't use -g).
Use -Os to optimize for output file size. (This can make the code run slower than with -O2 or -O3.)
Use -m32 to compile a 32-bit binary. 32-bit binaries are smaller than 64-bit binaries because pointers are shorter.
In C++, use -fno-exceptions if your code doesn't use exceptions.
In C++, use -fno-rtti if your code doesn't use RTTI (run-time type identification) or dynamic_cast.
In C++, use -fvtable-gc to let the linker know about and remove unused virtual method tables. (Only old GCC releases accept this flag; recent ones reject it as unrecognized.)
Use -fno-stack-protector .
Use -fomit-frame-pointer (this may make the code larger on amd64).
Use -ffunction-sections -fdata-sections -Wl,--gc-sections . Without this all code from each needed .o file will be included. With this only the needed code will be included.
For i386, use -mpreferred-stack-boundary=2 .
For i386, use -falign-functions=1 -falign-jumps=1 -falign-loops=1 .
In C, use -fno-unwind-tables -fno-asynchronous-unwind-tables . Out of these, -fno-asynchronous-unwind-tables makes the larger difference (can be several kilobytes).
Use -fno-math-errno, and don't check the errno after calling math functions.
Try -fno-unroll-loops, sometimes it makes the file smaller.
Use -fmerge-all-constants.
Use -fno-ident, this will prevent the generation of the .ident assembler directive, which adds an identification of the compiler to the binary.
Use -mfpmath=387 -mfancy-math-387 to make floating point computations shorter.
If you don't need double precision, and float precision is enough, use -fshort-double -fsingle-precision-constant .
If you don't need IEEE-conformant floating point calculations, use -ffast-math .
Use -Wl,-z,norelro for linking, which is equivalent to ld -z norelro .
Use -Wl,--hash-style=gnu for linking, which is equivalent to ld --hash-style=gnu . You may also try =sysv instead of =gnu, sometimes it's smaller by a couple of bytes. The goal here is to avoid =both, which is the default on some systems.
Use -Wl,--build-id=none for linking, which is equivalent to ld --build-id=none .
Get more flags from the Os list in diet.c of diet libc, for about 15 architectures.
Don't use these flags: -pie, -fpie, -fPIE, -fpic, -fPIC. Some of these are useful in shared libraries, so enable them only when compiling shared libraries.

Other ways to reduce the binary size:

Run strip -S --strip-unneeded --remove-section=.note.gnu.gold-version --remove-section=.comment --remove-section=.note --remove-section=.note.gnu.build-id --remove-section=.note.ABI-tag on the resulting binary to strip even more unneeded parts. This replaces the gcc -s flag with even more aggressive stripping.
If you are using uClibc or diet libc, then additionally run strip --remove-section=.jcr --remove-section=.got.plt on the resulting binary.
If you are using uClibc or diet libc with C or C++ with -fno-exceptions, then additionally run strip --remove-section=.eh_frame --remove-section=.eh_frame_hdr on the resulting binary.
After running strip ... above, also run sstrip on the binary. Download sstrip from ELF Kickers, and compile it for yourself. Or get the 3.0a binary from here.
In C++, avoid STL. Use C library functions instead.
In C++, use as few template types as possible (e.g. code with vector<int> and vector<unsigned> is about twice as long as the code with vector<int> only).
In C++, give each of your non-POD (plain old data) classes an explicit constructor, destructor, copy-constructor and assignment operator, and implement them outside the class, in the .c file.
In C++, move constructor, destructor and method bodies outside the class, in the .c file.
In C++, use fewer virtual methods.
Compress the binary using UPX. For small binaries, use upx --brute or upx --ultra-brute . For large binaries, use upx --lzma . If you have large initialized arrays in your code, make sure you declare them const, otherwise UPX won't compress them.
Compress the used libraries using UPX.
If you use static linking (e.g. gcc -static), use uClibc (most convenient way: pts-xstatic) or diet libc (most convenient way: the included diet tool) or musl (most convenient way: the included musl-gcc tool) instead of glibc (GNU C library).
Make every function static, create a .c file which includes all other .c files, and compile that with gcc -W -Wall. Remove all code which the compiler reports as unused. Last time this saved about 9.2 bytes per function for me.
Don't use __attribute__((regparm(3))) on functions, it tends to make the code larger.
If you have several binaries and shared libraries, consider unifying the binaries into a single one (using symlinks and distinguishing in main with argv[0]), and moving the library code to the binary. This is useful, because the shared libraries use position-independent code (PIC), which is larger.
If it's feasible, rewrite your C++ code as C. Once it's C, it doesn't matter if you compile it with gcc or g++.
If your binary is already less than 10 kilobytes, consider rewriting it in assembly and generating the ELF headers manually. See the tiny ELF page for inspiration.
If your binary is already less than 10 kilobytes, and you don't use any libc functions, use a linker script to generate tiny ELF headers. See the tarball with the linker script.
Drop the --hash-style=... flag passed to ld by gcc. To do so, pass the -Bmydir flag to gcc, and create the executable mydir/ld, which drops these flags and calls the real ld.
See more flags and ideas in this answer.

2013-12-25
