## 2014-08-16

### On generating integer-sided triangles with an integer area/perimeter ratio

This blog post contains come comments about finding all triangles whose sizes are integers and the area/perimeter ratio is the integer r. This is related to Project Euler problem 283.

The blog post contains a nice analysis and some pseudocode for generating all possible (a, b, c) size triplets for the given ratio r. Unfortunately both the analysis and the pseudocode contains some mistakes.

Here is the corrected version:

```def find_triangles_correct(r):
for u in divisors of 2r:
for v in 1 to floor(sqrt(3) * u):
if gcd(u, v) == 1:
lhs = 4*r*r*(u*u+v*v)
for d1 in positive divisors of lhs:
if d1 <= lhs / d1:
d2 = lhs / d1
if not (u < v and d1*u < (2r(v*v-u*u))):
if ((d1+2ru) mod v) = 0 and ((d2+2ru) mod v) = 0:
a = (d1 + 2ru) / v + 2rv / u
b = (d2 + 2ru) / v + 2rv / u
c = (d1 + d2 + 4ru) / v
yield a, b, c```
Some mistakes and other inaccuracies in the original:
• The variable r were silently renamed to m.
• The formulas for a, b and c were not included.
• The analysis incorrectly states that v must be smaller than floor(sqrt(3)*u)). The correct statement is: v must be smaller than sqrt(3)*u. Before the correction some solutions such as (6, 8, 10) for r=1 were not generated.
• The condition d1 <= lhs / d1 had a >= instead.

Please note that the algorithm yields each triangle once, in an unspecified order of a, b and c within the triangle.

## 2014-08-12

### How the lifetime of temporaries is extended because of references in C++

This blog post summarizes what I learned about C++ today about how the lifetime of temporaries is extended because of references.

Normally a temporary is destroyed at the end of the statement, for example temporary string returned by `GetName()` lives until the end of the line defining `name`. (There is also the return value optimization which may eliminate the temporary.)

```#include <stdio.h>

#include <string>
using std::string;
string GetName() {
return string("foo") + "bar";
}
int main() {
string name = GetName();
printf("name=(%s)\n", name.c_str());
return 0;
}```

However, it's possible to extend the life of the temporary by assigning it to a reference-typed local variable. In that case, the temporary will live until the corresponding reference variable goes out of scope. For example, this will also work and it's equivalent to above:

```#include <stdio.h>
int main() {
const string &name = GetName();
printf("name=(%s)\n", name.c_str());
return 0;
}```

Please note that there is no lifetime extension if the function returns a reference, and also in some other cases related to function boundaries. See more details about reference initialization.

## 2014-07-29

### How to discard committer info from git history

This blog post explains how to overwrite the git committer name, git committer e-mail and git committer date in previous commits in a git repository. This is useful e.g. after a series of `git commit -C ...` calls, which copy the author name, e-mail and date from the specified commit, but use the committer name, e-mail and date from the environment of the command.

Please note that rewriting git history is potentially dangerous because it can lead to data loss and synchronization issues with others who pull from the repository. Read more about it in Git Tools – Rewriting History.

First make sure you don't have any uncommitted changes: `git status -s` shouldn't print anything.

Then checkout the relevant branch, and run this command on a Unix systems (or within Git Bash on Windows), without the leading `\$`:

```\$ git filter-branch -f --commit-filter '
GIT_COMMITTER_NAME="\$GIT_AUTHOR_NAME"
GIT_COMMITTER_EMAIL="\$GIT_AUTHOR_EMAIL"
GIT_COMMITTER_DATE="\$GIT_AUTHOR_DATE"

Repeat running this command in each local branch you care about. For each branch, it will rewrite the history of the entire branch from its roots.

If you are sure that you didn't have a data loss, then run:

`\$ rm -rf .git/refs/original`

## 2014-07-08

### Programming tips for students

This blog post gives you advice to make the best use of your time for learning how to program and enjoying programming in your pre-college years.

### General, technology-independent recommendations

The best situation is if you have an experienced programming mentor, or some friends or schoolmates who are a bit ahead of you, and you have the chance to observe what these people do, and can ask questions. If that's the case, you are probably better off observing and asking these people than to follow the advice below.

If you are an absolute beginner, read Message to the students first. It contains many links where you can start learning how to program.

If you or your school can afford buying books, see the list of recommended books. The home or school library may have outdated and moderately useful course material, it's worth getting and reading the best books instead settling with what's already available.

Learn English to the extent that you can read and understand everything related to programming, and you can ask technical questions clearly in written form. If you have several languages to start learning, always pick English as the first language. It's much more useful to improve your English skills to fluent (so you can give a presentation to an audience of 1000 people in English just as easily as in your mother tongue) than to improve your skills in any other language from basic to medium.

Learn typing quickly, and without looking at the keyboard. Z-Type is a nice way to practice.

Get a large screen (at least full HD resolution is strongly recommended) and a comfortable keyboard and mouse. Try many different keyboards and mice -- most of them are cheap. (Cheap, wired ones can be much better than expensive ones.) Pick one keyboard layout (US English or your local one) and stick to it.

Learn how to ask questions so you get the answer you need quickly. Start learning from How To Ask Questions the Smart Way.

There are many great websites which teach you programming for free, even if you have no prior experience. See a list of them in Message to the students.

To become a skilled and experienced programmer, practice, i.e. write lots of code. Start that early. A minute spent on writing code is better spent than an hour of reading a book or a web page about programming.

If you get stuck, ask a question on StackOverflow. It's free. You will get several answers in 5 minutes from experienced programmers, you can pick the best answer, or ask for clarification. To modify specific behavior of specific web pages, write JavaScript which modifies the DOM, and use Greasemonkey with Firefox to run it each time to visit the page. Google Chrome can also run Greasemonkey scripts.

If you need something, write a small program for (parts of) it. That's easy to start on Linux and shell scripts and Python.

Start using a version control system (e.g. Git, Subversion) as soon as you have written a program of at least 100 lines of code. (But definitely start after 1000 lines.) The rule of thumb is: if you (or others) worked more than 15 minutes on it in total, then add the code to version control system. Start early with using a version control system, don't procrastinate. Put every program file you write to a repository managed by a version control system. Commit often, about once per hour, but definitely at least once per day. Make backups regularly (at least once per week) of everything you write. The simplest way to do it is to copy everything to an external hard drive, and disconnect the hard drive after the copy has finished. Using online backup services and version control systems (e.g. Git, Subversion) also serves as an additional backup for some of the code you write — but also make a traditional backup.

Participate in programming contests. You can practice for Google Code Jam online. For less competitive, and easier to solve problems, see Project Euler. topcoder also has great problems and a great community.

Arrange your windows on screen so that they don't overlap at all. Cycling through overlapping windows contribute a lot to lost productivity. Use virtual desktops. On Linux, use a tiling window manager if necessary. Edit your program in one window, and see the results in another window (e.g. web browser, command-line, PDF viewer). Make both windows visible at the same time (without overlap). Once set up, don't cycle through them, don't resize them and don't move them: just see them both at the same time, and work in either one. If necessary, create more than 2 windows (maybe 2 * 2 or 3 * 2), without overlap.

### Recommended sites, products and technologies

Install Linux as (try to) use it as your main operating system. Linux is very much customizable and scriptable (all the command-line tools described in the book The Unix Programming Environment are available). Be careful, installing another operating system (such as Linux) may lead to data loss and may render your computer temporarily unusable (but an expert can fix that). So make a full backup first, or install Linux to an otherwise unused computer. If Linux is not an option, try to use a Mac instead, and install free software from MacPorts. The Mac OS X is a variant of Unix, with the tools described in the book The Unix Programming Environment available.

To get quick and high quality answers to your technical questions: StackOverflow and other StackExchange sites.

To host the source code of your open-source projects (including a version control system, a wiki, an issue tracker and a code review tool): github and Google Code project hosting. There are many other providers, so compare the features and the code ownership, publicity and copyright issues before choosing.

Free C++ IDE: Code::Blocks.

Free code editor for HTML and scripting languages (Python, PHP, Ruby, JavaScript, Perl, TCL, XML, HTML, CSS): Komodo Edit.

Use GNU Make or some other incremental build tool to make sure that the relevant parts of your program gets recompiled.

### Recommended books for programming students

This blog post lists the best books I can recommend to students who want to learn how to program. Please note that programming is more about doing than reading, see Message to the students for a more general picture.

Basics not tied to a specific programming language:

• Absolute beginners should follow the Message to the students guide first, and finish some online courses about programming.
• Rivest et al.: Introduction to Algorithms, 3rd edition.
• Donald E. Knuth: The Art of Computer Programming, Volume 1, Volume 2, Volume 3.
• Brian W. Kernighan and Rob Pike: The Unix Programming Environment, 1984.
• Steve McConnell: Code Complete, 2nd edition.

Basics for some important programming languages and technologies::

Gems specific to programming languages and technologies beyond the basics:

• C++: Scott Meyers: Effective C++, 3rd edition.
• Java: Joshua Bloch: Effective Java, 2nd edition.
• Java: Joshua Bloch and Neal Gafter: Java Puzzlers, 2005.

## 2014-06-20

### Message to the students

Dear Student,

If you are too busy reading through the rest, then try these web resources for learning programming:

To start with programming, play Lightbot first (it takes a few hours). There is also an iPhone and and Android app. Much of programming consists of sitting in front of a computer and typing the program. Learn touch typing (i.e. typing quickly without looking at the keyboard). Z-Type is a nice game in which you can practice typing English. Decide on a keyboard layout, and learn where the special symbols are (e.g. # and {). If typing is still slow and frustrating for you, don't give up: after a few months of practice, your fingers learn where the keys are. In the meantime, you can use Scratch to write some spectacular programs (even simple games) using mostly the mouse, with very little typing. Scratch can do 2D (2 dimensional) graphics only. Blocky is an alternative of Scratch. Try it if Scratch doesn't work well on your computer or it looks too complicated. If you are interested in 3D, use AgentCubes. Unfortunately it's not as user friendly as Scratch, it's more clunky, but it's still enjoyable.

Programming is a creative activity, mostly at the computer. The immediate output is the program code you type, and the final output is the program which runs, and makes people's life easier or more fun. (It's OK if it's fun only for you in the beginning.) The other aspect of it is computer science, which is done mostly at the whiteboard and reading books, mostly thinking about how a computation can be done can be done as quickly as possible for large inputs. (A typical computation: you have to visit 30 cities in a week (and do some activity in each of them), you know the distances between each 2 cities, and you want to make your trip as short as possible. In real-life problems there can be thousands of cities.) This is very similar to the non-geometry parts of math, so if you enjoy solving math puzzles, than probably you would also enjoy learning and doing computer science. You can start with the Principles of Computing lecture on Coursera, and then try the computer science section of Khan Academy. The best classic books in the topic are The Art of Computer Programming (start with the first 3 volumes) and Introduction to Algorithms. If you open these books, you'll see lots of math formulas and lots of graphs (with nodes, edges and labels) in them. That's normal, these are the everyday tools of a computer scientist.

If you like nice, eye-candy design with music, then Code Combat will teach you JavaScript in a fantasy (medieval) atmosphere: you have to rescue kidnapped town members from prisons guided by trolls etc. A similar website is Ruby Warrior, you can practice the Ruby language there, but it's more enjoyable if you have some prior programming experience, so don't start there. Both Code Combat and Ruby Warrior are quite resource-hungry, so you need a very modern and fast computer to play smoothly, and you have to close most other programs.

HTML (for text and structure), CSS (for visual layout) and JavaScript (for interactive elements such as animation and reaction to mouse clicks) are the 3 most important languages today to build websites: the code written in those languages are executed (interpreted) by the web browsers, and that code will determine how the web page will look and sound like and how it will behave. (There are many other programming languages for the server side, i.e. to make the website remember your data when you visit it from a different computer or browser.) As mentioned above, learn these 3 languages from Codecademy. You can also practice your CSS skills in a playful way on Flukeout (but you have to learn it first somewhere else). If you want to build a website, save it and show it off to your friends, Thimble is a nice and easy-to-use tool for that. It supports HTML, CSS and JavaScript. You have to register and log in first in order to be able to save your creation.

When real programmers work and write code, they usually work in a text-editor or an IDE (integrated development environment), and not in a web browser. To make it easy to start for you, most of the learning material mentioned above needs only a web browser with internet access. After you have finished one or two programming classes above, you should give it a try with the real text editor. There are step-by-step instructions for that in Python in the Programming with Python (do this first) and Data Processing with Python courses. Just follow the instructions, it tells you what to install and how to use it.

Once you get good at programming (i.e. you've finished many programming courses, and when you start the next one, you find it too easy), and enjoy it a lot, you should try your skills (and possibly win some prizes) in one of the programming contests. For example, there is Google Code Jam, you can start practicing any time with any Round 1 (or Round 1A). It works best if 3 or more of you work together, trying to solve one of the problems together, and then you go on separately for the 2nd problem, and at the end if anyone has a correct solution, they show the others how they did it. You can also ask your teacher if there are other programmings contest closeby where you live, and if there are some preparation classes you can attend. As soon as you know the basics of programming, the most you can learn from is solving contest problems and understanding other people's analyses and solutions. Project Euler is similarly useful, but it contains more math-oriented and theoretical problems. After solving the first few easy problems, you'll need your best math, programming and computer science skills to solve the rest.

Good luck and have fun!

### What's the difference between StaticPython and PyRun?

This blog post explains the difference between StaticPython and PyRun.

Both StaticPython and PyRun are free, portable binary distributions of Python 2.x and 3.x, for Linux, Mac OS X (and PyRun also works on FreeBSD without Linux emulation). That is, a user without root access can quickly download, extract and run either StaticPython or PyRun, and it won't conflict with the (possibly old) Python versions installed to the system (by root). Windows users should use Portable Python instead.

An important difference is that StaticPython doesn't need any external files to run on Linux, while PyRun needs lots of system libraries (libssl.so.1.0.0, libcrypto.so.1.0.0, libz.so.1, libsqlite3.so.0, libz2.so.1, libpthread.so.0, libdl.so.2, libutil.so.1, libm.so.6, libc.so.6), and will not work on systems which don't have all of the required libraries installed (with the required version), so PyRun is much less portable.

Another important difference is that with PyRun you can compile and install C extensions, and with StaticPython you can't (even ctypes doesn't work with StaticPython). However, many extensions are precompiled: Stackless + greenlet + libssl + Tokyo Cabinet + libevent2 + Concurrence + AES + Syncless + MessagePack + Gevent MySQL + PyCrypto.

Another important difference is that PyRun integrates nicely with packaging (setuptools, pip etc.), and is a nice, self-contained replacement for virtualenv. StaticPython doesn't help you with packaging, you have to do Python library installation and distribution manually.

There are many small differences, e.g. PyRun is available for both 32-bit and 64-bit, ans StaticPython is 32-bit only. PyRun needs at least 14 megabytes of disk space (10 megabytes for the binary executable and about 4 megabytes for the mandatory system library dependencies), and StaticPython needs between 6 and 9 megabytes depending on which feature set you need.

My recommendation: If PyRun works on your system (because you have the right version of system libraries installed, and you have enough free disk space), then use PyRun, otherwise use StaticPython.

Caveat: I am the author of StaticPython.

## 2014-04-26

### How to parse an integer in C++ (and C) with overflow detection

This blog post explains how to write C++ code which parses integers, in a type-safe way for any integral type, paying attention to overflows and underflow. The code with small modifications can be used in C as well, C++ language features are only used to allow arbitrary integral types.

It sounds like we could use a standard library function. But apparently there isn't any that would work:

• Please note that sscanf(3) etc. don't do overflow detection, so we can't use that.
• Please note that atoi(3) etc. can't indicate a parse error, so we can't use that.
• Please note that strtol(3) etc. doesn't exist so it can't do overflow detection for anything smaller than a long, so we can't use that.
• Please note that sscanf, atoi and strtol support NUL-terminated strings only, they can't parse a string (and report the expected parse error) which contains an `'\0'`.

So let's implement our own parser. Someone might expect that this is a trivial and boring task, but it will turn out that there are many pitfalls, and many tricky ideas to avoid all of them. Let's see.

The first attempt:

```template<class T>bool parse_dec(char const *p, T *out) {
T n = 0;
while (*p + 0U - '0' <= 9U) {  // A digit.
n = 10 * n + *p++ - '0';
}
*out = n;
return true;  // Success.
}```

There is only 1 trick in the implementation: how to detect whether `*p` is a digit (i.e. between `'0'` and `'9'`. For characters with too large ASCII code, the result of `*p + 0U - '0'` would be larger than 9, so the condition is false. For characters with too small ASCII code, the result is negative, but because of `0U` it would be interpreted as unsigned, so it would be larger than `9U`, so the condition is false. We don't use the standard library function isdigit(3), because on some systems it returns true for bytes other than `'0'` ... `'9'` (see the link above).

Problems:

1. It doesn't check that the end of the string is reached at the end of the number.
2. It parses the empty string as 0.
3. It doesn't support negative numbers.
4. It doesn't check for overflow or underflow.
It's easy to fix problems 1 and 2 by restructuring the loop and checking for `'\0'`:
```template<class T>bool parse_dec(char const *p, T *out) {
T n = 0;
do {
if (*p + 0U - '0' > 9U) return false;  // Not a digit.
n = 10 * n + *p++ - '0';
} while (*p != '\0');
*out = n;
return true;  // Success.
}```
To fix problem 3 (negative numbers), we need a way to detect whether the type T is signed, and then parse a leading `-`:
```template<class T>bool parse_dec(char const *p, T *out) {
const bool is_signed = (T)~(T)0 < (T)0;
bool is_negative = false;
if (is_signed && *p == '-') {
is_negative = true;
++p;
}
T n = 0;
do {
if (*p + 0U - '0' > 9U) return false;  // Not a digit.
n = 10 * n + *p++ - '0';
} while (*p != '\0');
*out = is_negative ? -n : n;
return true;  // Success.
}```

To understand the initialization of is_signed above, consider that in C and C++, the expression `~0` generates an int with all bits 1, thus it's -1, thus it's negative. The expression `(unsigned)~0` is positive, it's the largest value for unsigned int. So by checking the sign of `(T)~(T)0` we can determine whether T is a signed or unsigned integral type. The double cast is necessary to make it work even if `sizeof(T)` is smaller or larger than `sizeof(int)`.

Please note that the `~` operator works for integral types, but it doesn't compile for pointer or floating point types etc. So if parse_dec is used with a wrong T, then a compile error would be reported at template instantiation time. To convert it to a template substitution failure, type_traits and enable_if can be used.

To fix problem 4 (overflow detection), we can add an if just before we assign the larger value to n. But how can we check for an overflow without triggering the overflow (and already losing some high bits from the result)? We can work like this: if we are parsing a positive, 8-bit signed int, then the maximum value is 127, so there is no overflow if the previous n is less than 12, or the previous n is 12, and the new digit is at most 7. That's exactly how the code below checks for overflow. It also calculates the upper limit of n (e.g. 12) and the new digit (e.g. 7) based on properties of T: whether it is signed, its size, and whether there was a `-` in the input.

```template<class T>bool parse_dec(char const *p, T *out) {
// ~(T)~(T)0 compiles for integral types, but it doesn't compile for pointer
// or floating point types etc.
const bool is_signed = (T)~(T)0 < (T)1;
const T nmax = (is_signed ? (T)~((T)1 << (sizeof(T)*8-1)) : (T)~(T)0) / 10;
bool is_negative = false;
if (is_signed && *p == '-') {
is_negative = true;
++p;
}
const char cmax = is_signed ? (is_negative ? 8 : 7) : 5;
T n = 0;
do {
if (*p + 0U - '0' > 9U) return false;  // Not a digit.
const char c = *p++ - '0';
const T nneg = -n;
if (nneg > nmax || (nneg == nmax && c > cmax)) return false;  // Overflow.
n = 10 * n - c;
} while (*p != '\0');
*out = is_negative ? n : -n;
return true;  // Success.
}```

It's interesting to note that the last allowed digit (cmax) doesn't depend on `sizeof(T)`, because the sequence containing of the last digit of powers of 2 is periodic with a period divisible by 8.

Please note that the upper limit signed integers is not symmetric to 0. For example, for uint8_t, the smallest allowed value is -128, and the largest one is 127. This difference is reflected in the initialization of cmax: 7 or 8.

Please note that we're updating n with the opposite sign, and we flip the sign only at the end. This is necessary to avoid signed integer overflow when parsing the smallest possible integer (e.g. -128 for 8-bit integers). Without the sign flip we'd temporarily have 128 in n, which is an overflow.

Please note that in each comparison in the code above we cast both sides explicitly to the same type. (In the `9U` comparison both sides are of type unsigned int.) This is to make the code easy to understand even if the reader is not familiar with the integer automatic conversion and sign extension rules in C and C++.

Please note that in C and C++ the result is undefined if there is a signed integer overflow in the arithmetic operations. That's OK for us, because we make sure that our arithmetic operations (`10 * n + c` and the boring ones) happen to never overflow: either we explicitly check for overflow before, or the input contains only small numbers.

The example code above can be changed easily if the input is not a NUL-terminated string, but any array of characters with a known size, or a `FILE*` (using getc), or an `std::istream` (using get). This is left as an exercise to the reader.

Similar code can be written for parsing hex and octal, and combining them to autodetect the base based on the prefix (i.e. `0x` or `0X` for decimal, and `0` for octal.

For completeness, here is what kind of numbers our last parser accepts: those non-overflowing integers which fully match the regexp `-[0-9]+` for signed T, and `[0-9]+` for unsigned T. So leading whitespace is not allowed, trailing junk is not allowed, multiple leading 0s are allowed, a leading `+` is not allowed, and a leading - is allowed iff T is signed.

See the full code with some usage examples on Github.

## 2014-04-10

### Announcing ssweb: single-shot webserver in Python

This blog post announces ssweb, a single-shot HTTP server library in Python 2.x. Single-shot means is that the webserver accepts only a single incoming HTTP connection, handles it, and then exits immediately.

Such a server can be used as a redirect target in the web-based OAuth authentication protocol, to receive the freshly generated OAuth token. In this case the program is a command-line tool running a HTTP server for a short amount of time, to make the passwordsless login more convenient for the user, without having to manually copy-paste token. If it doesn't sound useful, then it's most probably not useful for you right now.

Example usage:

```\$ wget -O ssweb.py http://raw.githubusercontent.com/pts/ssweb/master/ssweb.py
\$ python ssweb.py
URL: http://127.0.0.1:40464/foo
{'res_body': 'Shot.', 'req_head': 'GET /foo HTTP/1.0\r\nHost: 127.0.0.1:40464\r\nUser-Agent: Python-urllib/1.17\r\n\r\n', 'client_addr': ('127.0.0.1', 40026), 'res_head': 'HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 5\r\n\r\n', 'req_body': ''}
'Shot.'
\$ python ssweb.py x
... (after visiting the URL in Firefox)
{'res_body': 'Shot.', 'req_head': 'GET /foo HTTP/1.1\r\nHost: 127.0.0.1:59872\r\nUser-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Language: en-US,en;q=0.5\r\nAccept-Encoding: gzip, deflate\r\nConnection: keep-alive\r\n\r\n', 'client_addr': ('127.0.0.1', 50702), 'res_head': 'HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 5\r\n\r\n', 'req_body': ''}```

## 2014-04-04

### How to convert all YouTube links from http:// to https:// in Firefox history

This blog post explains how to convert the protocol from http:// to https:// in YouTube URLs in the browsing history of a Mozilla Firefox profile. YouTube has recently started redirecting logged-in users from http:// to https:// . This conversion is useful in the following situation:

• The user uses his Firefox browsing history to track which YouTube videos he has already seen.
• The user disables web site colors (e.g. using the PrefBar Firefox extension) so by looking at the color of a YouTube video link he can immediately see if he has visited that video page or not.
• The user uses some Greasemonkey scripts to normalize YouTube video links on web pages (e.g. to remove `&feature=related`, so the normalized URLs will be added to the Firefox browsing history.
• Bonus: The user uses the LinkVisitor Firefox extension to mass-toggle the visited status of URLs without actually visiting them in Firefox.

To do the protocol conversion from http:// to https:// , close Firefox and run this on Linux (without the leading `\$` sign):

```\$ sqlite3 ~/.mozilla/firefox/*.default/places.sqlite '
UPDATE OR IGNORE moz_places   SET url="https" || SUBSTR(url, 5) WHERE url LIKE "http://youtube.com/%" OR url LIKE "http://www.youtube.com/%";
UPDATE OR IGNORE moz_favicons SET url="https" || SUBSTR(url, 5) WHERE url LIKE "http://youtube.com/%" OR url LIKE "http://www.youtube.com/%";
ANALYZE;
VACUUM;
'```

The ANALYZE and VACUUM parts are optional, they just speed up Firefox accessing these tables in the future.

The SQLite UPDATE queries above can be run on the relevant places.sqlite file on Mac OS X or Windows as well. (This blog post doesn't provide explicit instructions how to run those queries. Use Google web search or ask on Stack Overflow to figure out how.)

## 2014-01-19

### How to run custom code before and after main in GCC?

This blog post explains how to register C or C++ code to be run at process startup (i.e. just before *main*) and process exit (e.g. when *main* returns or when *exit*(...) is called). Code to be run at loading and unloading of shared libraries and Linux kernel modules are not covered in this post.

The behavior described in this post has been tested with gcc-4.1 ... gcc-4.8 and clang-3.0 ... clang-3.4. Older versions of GCC and Clang may behave differently.

The behavior described in this post has been tested with (e)glibc-2.7 ... (e)glibc-2.15 and uClibc-0.9.30.1 ... uClibc-0.9.33). Earlier versions and other libc implementations may behave differently. For example, dietlibc-0.33 doesn't execute any of the registered code (so the example below prints just `MAIN`; `MYATEX2`; `MYATEX1`).

The new way (available since gcc-2.7) is declaring a function with attribute((constructor)) (this will make it run at process startup) and declaring a function with attribute((destructor)) (this will make it run at process exit). The double underscores around __attribute__ are there to prevent GCC warnings when compiling standard C (or C++) code. Example:

```#include <unistd.h>

__attribute__((constructor)) static void myctor(void) {
(void)!write(2, "HI\n", 3);
}

__attribute__((destructor)) static void mydtor(void) {
(void)!write(2, "BYE\n", 4);
}```

Upon specifying one of these attributes, GCC (or Clang) appends the function pointer to the sections .ctors or .dtors, respectively. (You can take a look at objdump -x prog to see if these sections are present.) The libc initialization and exit code will run all functions in these sections. There is a well-defined order (see below) in which these registered functions get run, but the order is within the same translation unit (C or C++ source file) only. It's undefined in which order the translation units are processed.

Please note that the process exit functions are not always called: for example, if the process receives a signal which terminates it (e.g. either from another process or from itself, or from itself, by calling abort()), or if the process calls _exit(...) (with an underscore), then none of the process exit functions are called.

Please note that it's possible to register more process exit functions at runtime, by calling atexit(3) or on_exit(3).

C++ static initialization is equivalent to attribute((constructor)):

```#include <unistd.h>
#include <string>

(void)!write(1, "GA\n", 3);
return 42;
}

/* The call to get_answer works in C++, but it doesn't work in C, because
* the value of myanswer1 is not a compile-time constant.
*/
std::string hello("World");  /* Registers both a constructor and destructor. */```

There is an older alternative for registering process startup and exit functions: by adding code to the body of the _init function in the .init section and to the body of _fini function in the .fini section. The headers of these functions are defined in crti.o and the footers are defined in crtn.o (both of which are part of the libc, use e.g. objdump -d .../crti.o to disassemble them). GCC itself uses this registration mechanism in crtbegin.o to register __do_global_dtors_aux and in crtend.o to register __do_global_ctors_aux.

It is possible to use this older registration alternative in your C or C++ code, but it's a bit inconvenient. Here are some helper macros which make it easy:

```/* Usage: DEFINE_INIT(my_init1) { ... }
* Defines function my_init1 which will be called at startup, before main().
* As a side effect, defines `static void name() { ... }'.
*/
#define DEFINE_INIT(name) \
static void name(void); \
/* If we declared this static, it wouldn't get called. */ \
__attribute__((section(".trash"))) void __INIT_HELPER__##name(void) { \
static void (* volatile f)(void) = name; \
__asm__ __volatile__ (".section .init"); \
f(); \
__asm__ __volatile__ (".section .trash"); \
} \
static void name(void)

/* Usage: DEFINE_FINI(my_fini1) { ... }
* Defines function my_fini1 which will be called at process exit.
* As a side effect, defines `static void name() { ... }'.
*/
#define DEFINE_FINI(name) \
static void name(void); \
/* If we declared this static, it wouldn't get called. */ \
__attribute__((section(".trash"))) void __FINI_HELPER__##name(void) { \
static void (* volatile f)(void) = name; \
__asm__ __volatile__ (".section .fini"); \
f(); \
__asm__ __volatile__ (".section .trash"); \
} \
static void name(void)```

For your reference, here are the corresponding much simpler macros for attribute((constructor)) and attribute((destructor)):

```/* Usage: DEFINE_CONSTRUCTOR(my_init1) { ... }
* Defines function my_init1 which will be called at startup, before main().
* As a side effect, defines `static void name() { ... }'.
*/
#define DEFINE_CONSTRUCTOR(name) \
__attribute__((constructor)) static void name(void)

/* Usage: DEFINE_DESTRUCTOR(my_init1) { ... }
* Defines function my_fini1 which will be called at process exit.
* As a side effect, defines `static void name() { ... }'.
*/
#define DEFINE_DESTRUCTOR(name) \
__attribute__((destructor)) static void name(void)```

It is possible to use the old and the new registration mechanisms at the same time. Here is a sample code which uses both, and C++ static initialization and atexit and on_exit as well.

```#include <string.h>
#include <unistd.h>
#include <stdlib.h>

#ifdef __cplusplus
class C {
public:
C(const char *msg): msg_(msg) {
(void)!write(1, "+", 1);  (void)!write(1, msg_, strlen(msg_));
}
~C() {
(void)!write(1, "-", 1);  (void)!write(1, msg_, strlen(msg_));
}
private:
const char *msg_;
};
#endif

DEFINE_INIT(myinit1) { (void)!write(1, "MYINIT1\n", 8); }
DEFINE_CONSTRUCTOR(myctor1) { (void)!write(1, "MYCTOR1\n", 8); }

#ifdef __cplusplus
static int get_answer(const char *msg) {
(void)!write(1, msg, strlen(msg));
return 42;
}
C myobj1("MYOBJ1\n");
C myobj2("MYOBJ2\n");
#endif

DEFINE_INIT(myinit2) { (void)!write(1, "MYINIT2\n", 8); }
DEFINE_CONSTRUCTOR(myctor2) { (void)!write(1, "MYCTOR2\n", 8); }
DEFINE_FINI(myfini1) { (void)!write(1, "MYFINI1\n", 8); }
DEFINE_DESTRUCTOR(mydtor1) { (void)!write(1, "MYDTOR1\n", 8); }
DEFINE_FINI(myfini2) { (void)!write(1, "MYFINI2\n", 8); }
DEFINE_DESTRUCTOR(mydtor2) { (void)!write(1, "MYDTOR2\n", 8); }
static void myatex1() { (void)!write(1, "MYATEX1\n", 8); }
static void myatex2() { (void)!write(1, "MYATEX2\n", 8); }
static void myonexit(int exitcode, void *arg) {
const char *msg = (const char*)arg;
(void)exitcode;
(void)!write(1, msg, strlen(msg));
}

int main(int argc, char **argv) {
(void)argc; (void)argv;
atexit(myatex1);
on_exit(myonexit, (void*)"MYONEX1\n");
(void)!write(1, "MAIN\n", 5);
atexit(myatex2);
on_exit(myonexit, (void*)"MYONEX2\n");
return 0;
}```

It is not intuitive in which order these are run. Here is the output:

```MYINIT1
MYINIT2
+MYOBJ1
+MYOBJ2
MYCTOR2
MYCTOR1
MAIN
MYONEX2
MYATEX2
MYONEX1
MYATEX1
-MYOBJ2
-MYOBJ1
MYDTOR1
MYDTOR2
MYFINI1
MYFINI2```

Please note that gcc-4.3 and below run MYDTOR1 and MYDTOR2 in the opposite order. All other compilers tested (see above which) use exactly this order. The order is libc-independent, because newer compiler versions with the same libc resulted in different order, while other libc versions with the same compiler version kept the order intact. Please note again that the order is undefined across translation units (C or C++ source files).

## 2014-01-15

### Announcing mplaylist: Audio playlist player using mplayer, with checkpointing

This blog post is the formal announcement of mplaylist, and audio playlist player using mplayer, with checkpointing.

mplaylist is Python script which can play audio playlists (.m3u files), remembering the current playback position (file and time) even when killed, so it will resume playback at the proper position upon restart. The playback position is saved as an .m3u.pos file next to the .m3u file. mplaylist uses mplayer for playing the audio files.

mplayer needs Python and a Unix system with mplayer installed. (It may be easy to port to Windows, but it has not been tried.) Download the script directly from here. There is no GUI. You have to start mplayback from the command-line, in a terminal window.

The reason why I wrote mplaylist is that I needed the following features and I couldn't easily find an audio player for Ubuntu which had all of them:

• It supports multiple playlists.
• It remembers the playback position (file and time) for each playlist.
• Preferably, it remembers playback position even when the process is killed.
• Lets the user adjust playback speed, without changing the pitch.

mplaylist supports all these features. Checkpointing (i.e. remembering the playback position) works even if both the mplayer and mplaylist processes are killed with kill -9 (SIGKILL). If you have a journaling filesystem with block device barriers properly set up, checkpointing also works if you unplug the power cable.

Please note that mplaylist is not only for music files. It works excellently for playing back (series of) audio books and (series of) talks.

## 2014-01-11

### How to prevent YouTube from using HTTPS

This blog post explains how to configure your web browser (Mozilla Firefox or Google Chrome) to prevent YouTube from redirecting from the http:// protocol to https://. The instructions below work no matter if you are logged in to YouTube.

YouTube has started doing this recently in the last couple of months, and also some browser extensions do it now. Please note that using HTTPS gives you more privacy (e.g. governments and internet service providers spying on you) than HTTP, so please think about it carefully if you want to revert to HTTP on YouTube or not.

Test the protocol: Type youtube.com to your address bar, make sure https:// doesn't show up why typing, and press Enter. Wait for the page to load. If you can't see https:// added to the beginning of the address, and you don't see a lock icon on the left side of the address, then we're done, stop.

If you have the Disconnect browser extension installed, disable it. (You may want to enable or reconfigure it later, after finishing these steps.) If Firefox asks for a browser restart, then restart it. Test the protocol.

If you have the YouTube Center browser extension or the corresponding Greasemonkey script installed, configure it by unticking the Use secure protocol checkbox. Test the protocol.

Remove (delete) all your YouTube cookies. In Chrome, copy-paste chrome://chrome/settings/content to the address bar, press Enter, click on the All cookies and site data... button, search for youtube, make sure that nothing unrelated shows up, and click on the Remove all button. In Firefox, open Edit / Preferences / Privacy / remove individual cookies, search for youtube.com, and click on the Remove all cookies button. Test the protocol.

If you're using Firefox on Linux, remove YouTube from the secure site table. To do it, exit from Firefox, and run the following command in a terminal window (without the leading \$):

`\$ sqlite3.static ~/.mozilla/firefox/*.default/permissions.sqlite "DELETE FROM moz_hosts WHERE type LIKE 'sts%' AND host LIKE '%youtube.com'"`

If you get an error message and you don't know how to fix it, or you are using Firefox on non-Linux, you can run the same DELETE FROM ... SQL query (between but without the double quotes above) using the SQLite Manager Firefox extension. Test the protocol.

Test the protocol. If it is still redirecting to https://, then take notes which of your browser extensions are enabled, disable all your browser extensions, and restart the browser. Test the protocol. If it's not redirecting anymore, then enable your browser extensions one-by-one, and figure out which one is the culprit. (There may be multiple ones.) Keep the culprit disabled or change its settings.

If it is still redirecting with all your extensions disabled, then this howto can't help you, try to find a solution on the web, and/or ask a question on webapps.stackexchange.com. Don't forget to reenable your browser extensions.

Some anecdotes: on Firefox, deleting the cookies solve the problem for me, and on Chrome disabling Disconnect solved the problem for me.

## 2014-01-10

### How to remove almost all files from a Git repository

This blog post explains how to remove all files (including their history) from a Git repository, except for files in a whitelist. This can be useful to split a Git repository to two smaller repositories.

This can lead to a data loss, so make sure you have a backup of the repository. Also read the basics about rewriting history and git filter-branch first.

Here is the command which keeps only the files foo and bar/baz (type it without the leading \$):

```\$ (export KEEP="\$(echo 'foo'; echo 'bar/baz')";
NL="\$(echo;echo x)"; export NL="\${NL%x}"; git filter-branch -f \
--index-filter 'X="\$IFS"; IFS="\$NL";
set -- \$(git ls-files | grep -vFx "\$KEEP");
IFS="\$X"; test \$# -gt 0 &&
git rm --cached --ignore-unmatch -- "\$@"; :' --prune-empty HEAD)```

This needs a Bourne-compatible shell, so it won't work out-of-the-box in the Windows command-line, but it will work on most modern Unix systems.

This looks like unnecessarily complex, elaborate and bloated, but all the little tricks are necessary to make it work with files with funny characters in their name and with all modern Bourne-compatible shells. (Only newline and apostrophe (`'`) won't work.)

To keep empty commits, omit the --ignore-unmatch flag.

Please note that if the files you are interested in were renamed, then this command doesn't recognize old names of the files: you have to enumerate the old pathnames explicitly to keep them.

To do the other way round, i.e. to keep all files except foo and bar/baz, do this:

```\$ (export KEEP="\$(echo 'foo'; echo 'bar/baz')";
NL="\$(echo;echo x)"; export NL="\${NL%x}"; git filter-branch -f \
--index-filter 'X="\$IFS"; IFS="\$NL"; set -- \$KEEP;
IFS="\$X"; test \$# -gt 0 &&
git rm --cached --ignore-unmatch -- "\$@"; :' --prune-empty HEAD)```

## 2014-01-05

### A short file size comparison of small libc implementations for Linux

This blog post gives a short executable file size comparison when the same statically linked, i386 ELF executable was compiled with various small (tiny) libc implementations for Linux.

TL;DR diet libc is producing the smallest executables.

Compiler used: GCC 4.6.3 in Ubuntu Precise.

libc implementations used:

All file sizes are the size of statically linked, Linux i386 ELF, stripped executable, except for source file (where it is the size of a .c source file) and dynamic (where it is the size of a dynamic executable of the same kind).

The source file size reducing compiler flags and tricks in this blog post were used. The programs used dynamic memory allocation (malloc(3), free(3), realloc(3)), system call I/O (e.g. read(2) and write(2)), but none of the printf*(3) functions or stdio.

Compilation results for clang_trampoline.c:

• source file: 37889 bytes
• diet libc: 15176 bytes
• dynamic: 17644 bytes
• musl: 22420 bytes
• uClibc: 22580 bytes
• static: 709120 bytes

Compliation results for xstatic.c:

• source file: 30410 bytes
• diet libc: 12316 bytes
• dynamic: 13516 bytes
• musl: 18992 bytes
• uClibc: 19412 bytes
• static: 705024 bytes

Interesting observation: the diet libc version is smaller than the dynamic version. That's because linking against dynamic shared libraries has its own overhead (e.g. symbol table, PLT) in the executable.

### Announcing pts-xstatic: A tool for creating small, statically linked Linux i386 executables with any compiler

This blog post announces pts-xstatic, a convenient wrapper tool for compiling and creating portable, statically linked Linux i386 executables. It works on Linux i386 and Linux x86_64 host systems. It wraps an existing compiler (GCC or Clang) of your choice, and it links against uClibc and the other base libraries included in the pts-xstatic binary release.

See the most recent README for all details.

C compilers supported: gcc-4.1 ... gcc-4.8, clang-3.0 ... clang-3.3. C++ compilers supported: g++ and clang++ corresponding to the supported C compilers. Compatible uClibc C and C++ headers (.h) and precompiled static libraries (e.g. libc.a, libz.a, libstdc++.a) are also provided by pts-xstatic. To minimize system dependencies, pts-xstatic can compile with pts-clang (for both C and C++), which is portable, and you can install it as non-root.

As an alternative of pts-xstatic, if you want a tiny, self-contained (single-file) for Linux i386, please take a look at pts-tcc. With pts-xstatic, you can create faster and smaller statically linked executables, with the compiler of your choice.

As an alternative for pts-xstatic and uClibc, see diet libc and its diet tool (which is an alternative of the xstatic tool), with which you can create even smaller binaries.

### Motivation

1. Available uClibc GCC toolchain binary releases are very old, e.g. the i686 release contains gcc-4.1.2 compiled on 2009-04-11.
2. With uClibc Buildroot, the uClibc version is tied to a specific GCC version. It's not possible to compile with your favorite preinstalled C or C++ compiler version, and link against your favorite uClibc version. pts-xstatic makes this possible.
3. libstdc++ is not easily available for uClibc, and it's a bit cumbersome to compile. pts-xstatic contains a precompiled version.

### Minimum installation

If you want to install try pts-xstatic quickly, without root access, without installing any dependencies, and without changing any settings, this is the easiest way:

```\$ cd /tmp
\$ rm -f pts-xstatic-latest.sfx.7z
\$ wget http://pts.50.hu/files/pts-xstatic/pts-xstatic-latest.sfx.7z
\$ chmod +x pts-xstatic-latest.sfx.7z
\$ ./pts-xstatic-latest.sfx.7z -y  # Creates the pts-xstatic directory.
\$ rm -f pts-clang-latest.sfx.7z
\$ wget http://pts.50.hu/files/pts-clang/pts-clang-latest.sfx.7z
\$ chmod +x pts-clang-latest.sfx.7z
\$ ./pts-clang-latest.sfx.7z -y  # Creates the pts-clang directory.
\$ cat >>hw.c <<'END'
#include <stdio.h>
int main(void) {
return !printf("Hello, %s!\n", "World");
}
END
\$ pts-xstatic/bin/xstatic pts-clang/bin/clang -s -O2 -W -Wall hw.c && ./a.out
Hello, World!
\$ strace -e open ./a.out
Hello, World!
\$ file a.out
a.out: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, stripped
\$ ls -l a.out
-rwxr-xr-x 1 pts pts 16888 Jan  2 23:17 a.out```
Compare the file size with statically linking against regular (e)glibc:
```\$ gcc -static -m32 -o a.big -s -O2 -W -Wall hw.c && ./a.big
Hello, World!
\$ strace -e open ./a.big
Hello, World!
\$ file a.big
a.big: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.24, BuildID[sha1]=0x37284f286ffeecdb7ac5d77bfa83ade4310df098, stripped
\$ ls -l a.big
-rwxr-xr-x 1 pts eng 684748 Jan  2 23:20 a.big```

FYI with diet libc, the generated a.out file is only 8668 bytes long.

See full installation instructions in the most recent README.

### Does pts-xstatic create portable executables?

pts-xstatic creates portable, statically linked, Linux ELF i386 executables, linked against uClibc. By default, these executables don't need any external file (not even the file specified by argv[0], not even the /proc filesystem) to run. NSS libraries (the code needed for e.g. getpwent(3) (getting info of Unix system users) and gethostbyname(3) (DNS resolution)) are also included. The executables also work on FreeBSD in Linux mode if the operating system field in the ELF header frm SYSV to Linux.

As an alternative to pts-xstatic: gcc -static (or clang -static) doesn't provide real portability, because for calls such as getpwent(3) (getting info of Unix system users) and gethostbyname(3) (DNS resolution), glibc loads files such as libnss_compat.so, libnss_dns.so. On the target system those libraries may be incompatible with your binary, so you may get a segfault or unintended behavior. pts-xstatic solves this, because it uses uClibc.

It can be useful to embed locale files, gconv libraries, arbitrary data and configuration files needed by the program, Neither `gcc -static', pts-xstatic or statifier can do it, but Ermine can. Ermine is not free software, but you can get a free-of-charge time-limited trial, and you can ask for a discount for noncommercial use. See all details here, and give it a try!

See the most recent README for full installation instructions, usage details, full feature list etc.

## 2014-01-02

### How to detect integer overflow in C and C++ addition and subtraction

This blog post explains how to detect integer overflow (and underflow) in C and C++ addition and subtraction, and it also gives example code.

Overflow (or underflow, we use these terms interchangeably) occurs when the result of an arithmetic operation cannot be represented as an integer of the same type (and size) as the operands. For unsigned addition, overflow indicates that the result is too large. For unsigned subtraction, overflow indicates that the result is negative. For signed addition and subtraction, overflow indicates that the result is either too small or too large.

When chaining additions, it's useful to compute the sum x + y + c, where c is the carry bit (either 0 or 1) resulting from the previous, less significant addition. Similarly, when chaining subtractions, it's useful to compute the difference x - y - c, where c is the borrow bit (either 0 or 1) resulting from the previous, less significant subtraction.

The freely available Chapter 2 (Basics) of the book Hacker's Delight has a detailed and informative subsection about overflow processing. The formulas presented below are based on formulas in that section. Please read the entire section of the book for a detailed explanation and more formulas (which are useful in other environments).

One simple observation is that signed addition overflows iff the sign of the two operands (x and y) are the same, but it's different from the sign of the sum. Based on similar observations we can devise the following formulas:

• signed x + y + c overflows iff this is negative: `((x+y+c)^x)&((x+y+c)^y)`
• signed x + y + c overflows iff this is negative: `z&(((x^z)+y+c)^~y)` after `z=(x^~y)&((1<<sizeof(x)*8-1))` (no temporary overflow)
• signed x - y - c overflows iff this is negative: `((x-y-c)^x)&((x-y-c)^~y)`
• signed x - y - c overflows iff this is negative: `z&(((x^z)-y-c)^y)` after `z=(x^~y)&((1<<sizeof(x)*8-1))` (no temporary overflow)
• unsigned x + y + c overflows iff this is negative: `(x&y)|((x|y)&~(x+y+c))`
• unsigned x - y - c overflows iff this is negative: `(~x&y)|((~x|y)&(x-y-c))`

Please note that none of the formulas above contain branches, so the CPU pipeline doesn't have to flushed in order to compute them. To convert the sign bit (i.e. negativity) to a bool (0 or 1), shift it down like this: `(int)(((((x+y+c)^x)&((x+y+c)^y))>>(sizeof(x)*8-1))&1)`.

Please note that in standard C and C++ the result of addition and subtraction is undefined (!) if an overflow occurs. The GCC flags -fwrapv and -fno-strict-overflow disable this undefined behavior. But since our code can't be sure if it's compiled with these flags enabled, we must use an overflow-detection formula in which no temporary overflow occurs. Such formulas are also given above. Another option is casting the operands to the corresponding unsigned type, adding them as unsigned (which happens normally, only the least significant bits are kept, as many as possible), and then casting the result back to signed. To do so, we must add these explicit casts in `x+y+c` and `x-y-c` in the signed formulas above. These casts can get tricky if we don't know the type of the operands, because there is no overloaded generic cast in C (which e.g. casts int to unsigned and long long to unsigned long long).

See the final code on Github. It can be included as a .h file in C and C++ code. It works with GCC 4.1 and above and Clang 3.0 and above. It uses the GCC extension typedef (also works in Clang) and it uses function overloading in C++ for the generic unsigned cast. In C, it uses the GCC extension __builtin_choose_expr for this cast. It also uses href="http://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html">statement expressions in macro bodies to declare temporary variables to avoid useing the arguments more than once.

• About the C11 _Generic selections (for implementing overridden functions and macros) in this blog post.
• P99, a huge macro library for C99 (C dialects earlier than C11).
• An article about proper overflow detection in all C and C++ arithmetic operations. Overflow detection is much harder to do correctly than what you think. The article contains many incorrect naïve implementations, and also the correct (complicated) implementations. Read it, it's worth it!

## 2013-12-31

### How to implement in-place set intersection in C++

This blog post shows and explains C++ source code implementing in-place set intersection, i.e. removing each element from a set (or another sorted container) *bc which is not also a member of container bc.

The std::intersection function template in the <algorithm> header in C++ standard template library populates a new output set, adding all elements in the intersection into it. This can be too slow and a waste of memory if one of the inputs is not needed afterwards. In this case an in-place intersection is desired instead, but unfortunately such a function template is not part of the C++ standard template library.

Here is a simple in-place implementation which looks up each element of *bc in ac, and removes (erases) it from *bc if not found:

```#include <set>

// Remove elements from bc which are missing from ac.
//
// The time required is proportional to log(ac.size()) * bc->size(), so it's
// faster than IntersectionUpdate if ac is large compared to bc.
template<typename Input, typename Output>
static void IntersectionUpdateLargeAc(
const std::set<Input> &ac, std::set<Output> *bc) {
const typename std::set<Input >::const_iterator a_end = ac.end();
const typename std::set<Output>::const_iterator b_end = bc->end();
for (typename std::set<Output>::iterator b = bc->begin(); b != b_end; ) {
// Not a const_iterator, erase wouldn't accept it until C++11.
const typename std::set<Output>::iterator b_old = b++;
bc->erase(b_old);  // erase doesn't invalidate b.
} else {
++b;
}
}
}```

Removing from *bc above is a bit tricky, because we don't want to invalidate the iterator b. In C++11 erase returns a new iterator, which is just after the removed elements, but we don't use that just to be backwards-compatible. Instead of that we make use of the fact that iterators to the non-removed elements are kept intact for set, multiset and list, so we create the temporary iterator b_old, which will be invalidated, but b remains valid.

We need the typename keyword in local variable declarations, because they have a dependent type (i.e. a type whose identifier is within another type specified by a template parameter.)

The time complexity is O(log(as) · bs), so it is fast if ac is large when compared to *bc. For example, when as = 3k and bs = k, then it's O(k2).

As an alternative, we could iterate over the two sets in increasing (ascending) order at the same time, similarly to the merge operation (as implemented by std::merge in mergesort, but dropping elements from *bc if there is no corresponding element in ac. One possible implementation:

```#include <set>

// Remove elements from bc which are missing from ac.
//
// The time required is proportional to ac.size() + bc->size().
template<typename Input, typename Output>
static void IntersectionUpdate(
const std::set<Input> &ac, std::set<Output> *bc) {
typename std::set<Input>::const_iterator a = ac.begin();
const typename std::set<Input>::const_iterator a_end = ac.begin();
typename std::set<Output>::iterator b = bc->begin();
const typename std::set<Output>::iterator b_end = bc->end();
while (a != a_end && b != b_end) {
if (*a < *b) {
++a;
} else if (*a > *b) {
const typename std::set<Output>::iterator b_old = b++;
bc->erase(b_old);  // erase doesn't invalidate b.
} else {  // Elements are equal, keep them in the intersection.
++a;
++b;
}
}
bc->erase(b, b_end);  // Remove remaining elements in bc but not in ac.
}```

The time complexity of this above (IntersectionUpdate) is O(as + bs), which is faster than IntersectionUpdateLargeAc if ac is not much smaller than *bc. For example, when as = 3k and bs = k, then it's O(3k + k), so IntersectionUpdateLargeAc is faster.

Example usage of both (just to see if they compile):

```int main(int, char**) {
std::set<int> a, b;
IntersectionUpdateLargeAc(a, &b);
IntersectionUpdate(a, &b);
return 0;
}```

It's natural to ask if these function templates can be generalized to C++ containers other than set. They take advantage of the input being sorted, so let's consider sorted std::vector, sorted std::list and std::multiset in addition to std::set. To avoid the complexity of having to distinguish keys from values, let's ignore std::map and std::multimap.

The generalization of IntersectionUpdateLargeAc from set to multiset is trivial: no code change is necessary. The std::multiset::find operation returns any matching element, which is good for us. However, with IntersectionUpdate, the last `++a;` must be removed: without the removal subsequent occurrences of the same value in *bc would be removed if ac contains this value only once. No other code change is needed. It is tempting to introduce a loop in the previous (`*a > *b`) if branch:

```for (;;) {
const typename Output::iterator b_old = b++;
const bool do_break = b == b_end || *b_old != *b;
bc->erase(b_old);  // erase doesn't invalidate b.
if (do_break) break;
}```

However, this change is not necessary, because subsequent equal values in *bc would be removed in subsequent iterations of the outer loop.

Here are the full generalized implementations:

```#if __cplusplus >= 201103 || __GXX_EXPERIMENTAL_CXX0X__
#include <type_traits>
#endif

// Remove elements from bc which are missing from ac. Supported containers for
// bc: list (only if sorted), vector (only if sorted), set, multiset. Supported
// containers for ac: set, multiset.
//
// The time required is proportional to log(ac.size()) * bc->size(), so it's
// faster than IntersectionUpdate if ac is large compared to bc.
template<typename Input, typename Output>
static void IntersectionUpdateLargeAc(const Input &ac, Output *bc) {
#if __cplusplus >= 201103 || __GXX_EXPERIMENTAL_CXX0X__
// We could use std::is_convertible (both ways) instead of std::is_same.
static_assert(std::is_same<typename Input::value_type,
typename Output::value_type>::value,
"the containers passed to IntersectionUpdateLargeAc() need to "
"have the same value_type");
#endif
const typename Input::const_iterator a_end = ac.end();
const typename Output::const_iterator b_end = bc->end();
for (typename Output::iterator b = bc->begin(); b != b_end; ) {
// Not a const_iterator, erase wouldn't accept it until C++11.
const typename Output::iterator b_old = b++;
bc->erase(b_old);  // erase doesn't invalidate b.
} else {
++b;
}
}
}

// Remove elements from bc which are missing from ac. Supported containers for
// ac and bc: list (only if sorted), vector (only if sorted), set, multiset.
template<typename Input, typename Output>
static void IntersectionUpdate(const Input &ac, Output *bc) {
#if __cplusplus >= 201103 || __GXX_EXPERIMENTAL_CXX0X__
static_assert(std::is_same<typename Input::value_type,
typename Output::value_type>::value,
"the containers passed to IntersectionUpdate() need to "
"have the same value_type");
#endif
typename Input::const_iterator a = ac.begin();
const typename Input::const_iterator a_end = ac.end();
typename Output::iterator b = bc->begin();
// Can't be a const interator, similarly to b_old.
const typename Output::iterator b_end = bc->end();
while (a != a_end && b != b_end) {
if (*a < *b) {
++a;
} else if (*a > *b) {
const typename Output::iterator b_old = b++;
bc->erase(b_old);  // erase doesn't invalidate b.
} else {  // Elements are equal, keep it in the intersection.
// Don't do ++a, in case ac is a multiset.
++b;
}
}
bc->erase(b, b_end);  // Remove remaining elements in bc but not in ac.
}```

These work as expected for set, multiset and sorted list. It also doesn't require that the two containers are of the same kind. For C++0x and C++11, an extra static_assert is present in the code to print a helpful compact error message if the base types are different.

However, when *bc is a vector, we get a compile error, because in C++ older than C++11, std::vector::erase doesn't return an iterator (but it returns void). Even if we could get an iterator, b_end would be invalidated by erase, because it's behind it. This is easy to fix, we should use bc->end() instead of b_end everywhere. However, if we didn't make any other changes, the algorithm would be slower than necessary, because std::vector::erase moves each element behind the erased one. So the time complexity would be O(as + bs2). To speed it up, let's swap the to-be-removed elements with the element with the last element of the vector, and to the actual removal at the end of the function:

```#if __cplusplus >= 201103 || __GXX_EXPERIMENTAL_CXX0X__
#include <type_traits>
#include <utility>  // std::swap.
#else
#include <algorithm>  // std::swap.
#endif

// Template specialization for vector output.
template<typename Input, typename T>
static void IntersectionUpdate(const Input &ac, std::vector<T> *bc) {
#if __cplusplus >= 201103 || __GXX_EXPERIMENTAL_CXX0X__
static_assert(std::is_same<typename Input::value_type, T>::value,
"the containers passed to IntersectionUpdate() need to "
"have the same value_type");
#endif
typename Input::const_iterator a = ac.begin();
const typename Input::const_iterator a_end = ac.end();
typename std::vector<T>::iterator b = bc->begin();
// Elements between b_high an bc->end() will be removed (erased) right before
// the function returns. We defer their removal to save time.
typename std::vector<T>::iterator b_high = bc->end();
while (a != a_end && b != b_high) {
if (*a < *b) {
++a;
} else if (*a > *b) {
std::iter_swap(b, --b_high);  // Works even if swapping with itself.
} else {  // Elements are equal, keep them in the intersection.
++a;
++b;
}
}
bc->erase(b, bc->end());  // Remove remaining elements in bc but not in ac.
}```

Once we have the generic implementation and the special implementation for vector in the same file, the C++ compiler would take care of choosing the right (most specific) one depending on whether *bc is a vector or not. So all these work now:

```#include <list>
#include <set>
#include <vector>

int main(int, char**) {
std::set<int> s;
std::multiset<int> ms;
std::vector<int> v;
// std::list<unsigned> l;  // Won't work in C++0x and C++11.
std::list<int> l;
IntersectionUpdate(s, &ms);
IntersectionUpdate(ms, &v);
IntersectionUpdate(v, &l);
IntersectionUpdate(l, &s);
IntersectionUpdateLargeAc(s, &ms);
IntersectionUpdateLargeAc(ms, &v);
// IntersectionUpdateLargeAc(v, &l);  // v is not good as ac.
// IntersectionUpdateLargeAc(l, &s);  // l is not good as ac.
IntersectionUpdateLargeAc(s, &l);
IntersectionUpdateLargeAc(ms, &s);
return 0;
}```

The full source code is available on GitHub.