I don’t claim to be a master of linking and ELF (Linux) executables, but there are some tricks I’ve learned that I wish someone had explained to me earlier.
There are two problems to solve to make clean distributable binaries: dependency hell and libc compatibility. By solving both, we can get an executable to run on any recent Linux system, regardless of the distribution and installed packages.
Compiling to a binary is a two-step process: the actual compiling, then the linking. In the first step, each source file gets turned into an object file (.o extension). In the second, all the .o and .a files are put together into a binary and the ELF meta-information is added. A .a file is a static library, which is simply an archive of object files. If your code uses an external library and that library has been built as a static library, then the code it needs from the .a file can be embedded directly into the binary. The ELF headers contain data such as “where is the entry point?” and “which dynamically linked libraries (.so files) do I need, and where should I look for them?”
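As a rough illustration of the two steps, here is a minimal shell sketch. The function name build_demo, the file names and the tiny C sources are all made up for this example, and it assumes cc and ar are installed.

```shell
#!/bin/sh
# Hypothetical demo: build_demo DIR writes two tiny C files into DIR,
# then compiles and links them as two separate steps.
build_demo() {
    dir=$1
    mkdir -p "$dir" || return 1

    # greet.c ends up in a static library; main.c is the program itself.
    cat > "$dir/greet.c" <<'EOF'
#include <stdio.h>
void greet(void) { puts("hello from libgreet.a"); }
EOF
    cat > "$dir/main.c" <<'EOF'
void greet(void);
int main(void) { greet(); return 0; }
EOF

    # Step 1: compile each source file into an object file (.o).
    cc -c "$dir/greet.c" -o "$dir/greet.o" || return 1
    cc -c "$dir/main.c" -o "$dir/main.o" || return 1

    # Pack greet.o into a static library (.a), which is an archive of .o files.
    ar rcs "$dir/libgreet.a" "$dir/greet.o" || return 1

    # Step 2: link the .o and .a files into an ELF executable.
    cc "$dir/main.o" "$dir/libgreet.a" -o "$dir/demo"
}
```

Running build_demo /tmp/link-demo and then /tmp/link-demo/demo should print the greeting. The code from libgreet.a is copied into the executable, so the .a file is no longer needed at run time.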
Back in the old UNIX days, the thought of embedding .a files into binaries was considered a bit crazy. Even if a library is only 50 KB, it would be duplicated into ALL of the executables on your system that use it. It was also thought that if executables relied on a central .so file, then that file could be updated with bug fixes once and all binaries would benefit without having to be recompiled. Dynamic linking means that on startup, an executable will look for the required .so files on your system.
In practice, the massive complexity incurred by dynamic linking makes it a nightmare for binary distribution. You might have one application that needs version 0.9.x of a library and another that needs 0.10.6, and the two versions might not be backward or forward compatible. Now you’re stuck in “DLL hell” on Linux. Package managers exist in part to track dependencies like that and make sure that all packages installed through them agree on the version of a shared dependency. But for people who need to distribute an executable that will work on ANY Linux system, it means we just can’t rely on the central libraries.
It also means we can’t ask our users to apt-get install or yum install or even “manually compile” anything, because it could clash with another version currently installed that other programs rely on. So what about statically embedding everything into the binary? We could go through every dependency our program has and manually modify its Makefile or autoconf or CMakeLists.txt, etc., to make it output .a files that we can then use. It’s painful, but it works, except… if our dependencies use .so dependencies themselves (and they frequently do), then it doesn’t work and we’re still stuck with dynamic linking.
In my opinion, the easiest and safest way to fix all these problems is to make our dynamic dependencies behave like static ones by shipping the .so files with the executable and making the binary prefer them over the copies installed on the system. First, I compile everything using the defaults. Then I examine the resulting binary with the ldd command.
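The dependencies that start with /usr/ are those that will need to be shipped, except for libstdc++ because that’s another can of worms. In my case, ldd also couldn’t find one of the dependencies. Running the executable in that state fails, because the dynamic loader can’t find that library either. The fix is to put the required .so files in a lib folder next to the executable and point the binary at that folder: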
patchelf --set-rpath '$ORIGIN/lib/' myexecutable
That solves the first problem. The dynamic loader expands $ORIGIN to the directory containing the executable, so the binary will look for its dependencies in the lib folder next to it before falling back to the system library directories.
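The steps above can be sketched as a small shell script. This is only an outline under a few assumptions: the function names libs_to_ship and bundle_libs are made up, ldd prints its usual "name => path (address)" lines, and patchelf is installed. A dependency that ldd reports as not found still has to be located and copied by hand.

```shell
#!/bin/sh
# libs_to_ship (hypothetical helper): reads ldd output on stdin and prints
# the resolved paths that live under /usr/, skipping libstdc++.
libs_to_ship() {
    awk '$2 == "=>" && index($3, "/usr/") == 1 && index($1, "libstdc++") != 1 { print $3 }'
}

# bundle_libs BINARY (hypothetical helper): copies the selected libraries
# into a lib/ folder next to BINARY, then points the binary at that folder.
bundle_libs() {
    bin=$1
    libdir=$(dirname "$bin")/lib
    mkdir -p "$libdir" || return 1

    # cp -L follows symlinks, so the real file is stored under the name
    # that the binary asks for.
    ldd "$bin" | libs_to_ship | while IFS= read -r lib; do
        cp -L "$lib" "$libdir/" || exit 1
    done || return 1

    # Single quotes keep the shell from expanding $ORIGIN; the dynamic
    # loader resolves it when the program starts.
    patchelf --set-rpath '$ORIGIN/lib/' "$bin"
}
```

Calling bundle_libs ./myexecutable and then running ldd ./myexecutable again should show the shipped libraries resolving to the lib folder.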
Libc is the C standard library (glibc on most distributions). It provides an interface between the code and the kernel. Virtually every binary will dynamically link against it, even non-C ones. The problem is that binaries built against libc are forward compatible, but not backward compatible. In other words, binaries compiled on a system with libc version 2.13 will work on any system with version 2.13 and up, but will generally fail to start on any lower version. On Reddit, NotUniqueOrSpecial mentioned that symbol versioning can also be used to solve this exact problem.
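To see which libc version a given binary actually requires, one option is to look at the versioned symbols it imports. The sketch below is an assumption-laden helper, not part of any tool: max_glibc_version is a made-up name, and it expects the output of objdump -T (from binutils) on stdin.

```shell
#!/bin/sh
# max_glibc_version (hypothetical helper): reads `objdump -T` output on
# stdin and prints the highest GLIBC_x.y version tag it finds.
max_glibc_version() {
    grep -o 'GLIBC_[0-9][0-9.]*' |
        sed 's/^GLIBC_//' |
        sort -t. -k1,1n -k2,2n -k3,3n |   # numeric sort per version component
        tail -n 1
}
```

A typical use would be objdump -T myexecutable | max_glibc_version, then comparing the result with what getconf GNU_LIBC_VERSION reports on the oldest system you want to support.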
Some will say that it’s a bad idea, because it can lock up the event loop for relatively long periods of time (up to 200ms for us in the most extreme cases). This is irrelevant to this codebase, which serves no data back to clients and acts as a data sink. All it does is process data and it needs to do it as cheaply as possible.
It’s not a technique for every performance problem there is, but it solved ours brilliantly.