Things I’ve learned about making portable binaries

I don’t claim to be a master at linking and ELF (Linux) executables, but there are some tricks I’ve learned that I wish someone had explained to me when I started.

There are two problems to solve to make clean distributable binaries: dependency hell and libc compatibility. By solving both, we can get an executable to run on any recent Linux system, regardless of the distribution and installed packages.

Dependency hell

Compiling to a binary is a two-step process: the actual compiling, then the linking. In the first step, each source file is turned into an object file (.o extension). In the second, all the .o and .a files are put together into a binary and the ELF meta-information is added. .a files are static libraries: if your code uses an external library and that library has been built as a static library, its .a file can be embedded directly into the binary. The ELF headers contain data such as “where is the main?” and “where should I look for the dynamically linked libraries (.so files)?”
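To make the two steps concrete, here is a minimal sketch that builds a toy program by hand. It assumes a C compiler is available as cc and that ar is installed; build_demo, answer.c, main.c and libanswer.a are names made up for this example.

```sh
# build_demo DIR: writes a tiny library and program into DIR, then builds
# them in two explicit steps. Runs in a subshell so the caller's working
# directory is left alone.
build_demo() (
    mkdir -p "$1" && cd "$1" || exit 1

    # A one-function "library" and a program that calls it.
    printf '%s\n' 'int answer(void) { return 42; }' > answer.c
    printf '%s\n' \
        '#include <stdio.h>' \
        'int answer(void);' \
        'int main(void) { printf("%d\n", answer()); return 0; }' > main.c

    # Step 1: compile each source file into an object file.
    cc -c answer.c -o answer.o || exit 1
    cc -c main.c -o main.o || exit 1

    # Pack the library object into a static library (.a file).
    ar rcs libanswer.a answer.o || exit 1

    # Step 2: link. The code from libanswer.a is copied into the binary,
    # so the result has no runtime dependency on it.
    cc main.o libanswer.a -o demo || exit 1
)

# Typical use (not run here):
#   build_demo /tmp/demo && /tmp/demo/demo
```

Running ldd on the resulting demo binary should list libc but not libanswer, since the latter was embedded at link time.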

Back in the old UNIX days, the thought of embedding .a files into binaries was considered a bit crazy. Even if a library is only 50 KB, it would be duplicated into ALL of the executables on your system that use it. It was also thought that if executables relied on a central .so file, then that file could be updated with bug fixes once and all binaries would benefit without having to be recompiled. Dynamic linking means that on startup, an executable will look for the required .so files on your system.

In practice, the massive complexity incurred by dynamic linking makes it a nightmare for binary distribution. You might have one application that needs version 0.9.x of a library and another that needs 0.10.6, and the two versions might not be backward or forward compatible. Now you’re stuck in “dll hell” on Linux. Package managers exist in part to track dependencies like that and make sure that all packages installed through them agree on the version of a shared dependency. But for people who need to distribute an executable that will work on ANY Linux system, it means we just can’t rely on the central libraries.

It also means we can’t ask our users to apt-get install or yum install or even “manually compile” anything, because it could clash with another version currently installed that other programs rely on. So what about statically embedding everything into the binary? We could go through every dependency our program has and manually modify their Makefile, autoconf files, CMakeLists.txt, etc., to make them output .a files that we can then use. It’s painful, but it works, except… if our dependencies have .so dependencies themselves (and they frequently do), then it doesn’t work and we’re still stuck with dynamic linking.

In my opinion, the easiest and safest way to fix all these problems is to make our dynamic dependencies behave like static ones by shipping the .so files and making the binary use them over the local ones. First, I compile everything using the defaults. Then I examine the resulting binary with the ldd command.

[Screenshot ldd1: ldd output for the executable, with one dependency reported as not found]

The dependencies that start with /usr/ are the ones that will need to be shipped, except for libstdc++ because that’s another can of worms. Notice how ldd couldn’t find one of the dependencies? This is what happens when I run the executable:

[Screenshot ldd2: the error printed when running the executable without libev.so.4 installed]
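Going back to the ldd output for a moment: the “starts with /usr/, except libstdc++” rule is easy to automate. This is only a sketch, assuming the usual “name => path (address)” layout of ldd output. The function names libs_to_ship and missing_libs are made up for this example.

```sh
# Reads ldd output on stdin and prints the paths of the libraries to ship:
# everything resolved under /usr/, except libstdc++.
libs_to_ship() {
    awk '$2 == "=>" && index($3, "/usr/") == 1 && index($1, "libstdc++") != 1 { print $3 }'
}

# Reads ldd output on stdin and prints the names of the libraries
# that ldd reported as "not found".
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}

# Typical use (not run here):
#   ldd ./myexecutable | libs_to_ship
#   ldd ./myexecutable | missing_libs
```

Both functions only parse text, so they can be tried on saved ldd output from another machine.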

It happens because the system went through the directories listed in /etc/ld.so.conf looking for libev.so.4 and in the end couldn’t find it. So I copy it from my dev laptop, put it in a folder named lib, and distribute that along with the executable itself. But then I need to instruct my users to execute it like so: LD_LIBRARY_PATH=/current/working/directory/lib ./myexecutable. Of course they’re going to get it wrong or forget and then blame us. Remember how at the beginning I mentioned that the ELF headers contain a field that says where to look for the dependencies? That field can be edited with the patchelf utility.

patchelf --set-rpath '$ORIGIN/lib/' myexecutable

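That solves the first problem. $ORIGIN is expanded at load time to the directory that contains the executable, so the executable will look for its dependencies in the lib folder next to it before falling back to the system-wide directories. The single quotes matter: without them the shell would try to expand $ORIGIN itself.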
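The copy step and a sanity check on the result can be scripted. What follows is a rough sketch, not a polished tool: bundle_libs and rpath_uses_origin are hypothetical helper names, the /usr/lib/libev.so.4 path in the usage comment is only an illustration, and the check assumes that readelf -d prints the RPATH or RUNPATH entry on one line.

```sh
# bundle_libs APP_DIR LIB...
# Copies each library into APP_DIR/lib, next to the executable.
bundle_libs() {
    dest="$1/lib"
    shift
    mkdir -p "$dest" || return 1
    for lib in "$@"; do
        # -L follows symlinks: a name like libev.so.4 is often a link to the
        # real file, and the copy has to be a regular file carrying the name
        # the binary asks for.
        cp -L "$lib" "$dest/" || return 1
    done
}

# Reads `readelf -d` output on stdin and succeeds if the RPATH or RUNPATH
# entry contains the literal string $ORIGIN.
rpath_uses_origin() {
    grep -E '\((RPATH|RUNPATH)\)' | grep -F '$ORIGIN' >/dev/null
}

# Typical use (not run here):
#   bundle_libs . /usr/lib/libev.so.4
#   patchelf --set-rpath '$ORIGIN/lib/' myexecutable
#   readelf -d myexecutable | rpath_uses_origin && echo "rpath ok"
```

Running ldd on the patched executable afterwards should show the bundled libraries resolving to the local lib folder.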

libc compatibility

Libc is the C standard library. It provides an interface between the code and the kernel. Virtually every binary will dynamically link against it, even non-C ones. The problem is that libc is backward compatible, but absolutely not forward compatible: a newer libc will run binaries built against an older one, but not the other way around. In other words, binaries compiled on a system with libc version 2.13 will work on any system with version 2.13 and up, but will typically fail to start on any lower version. On Reddit, NotUniqueOrSpecial mentioned that symbol versioning can also be used to solve this exact problem.

[Screenshot ldd3]
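To find out which libc version a given binary actually demands, one option is to look at the versioned symbols it imports. A minimal sketch, assuming GNU binutils is installed and that objdump -T tags glibc symbols with names of the form GLIBC_x.y; max_glibc_version is a name made up here.

```sh
# Reads `objdump -T` output on stdin and prints the highest GLIBC_x.y[.z]
# symbol version mentioned in it (without the GLIBC_ prefix).
max_glibc_version() {
    grep -o 'GLIBC_[0-9][0-9.]*' |
        sed 's/^GLIBC_//' |
        sort -t. -k1,1n -k2,2n -k3,3n |
        tail -n 1
}

# Typical use (not run here):
#   objdump -T ./myexecutable | max_glibc_version
```

The numeric sort on each dotted field matters: a plain text sort would rank 2.4 above 2.14.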

Compiling with an older libc is not an issue unless your program happens to need new functionality or bug fixes, which has never been an issue for me. Downgrading libc is seriously not a good idea. The best and safest way to compile with an old libc is to install an older version of Debian or CentOS in a virtual machine. Then I install all the development tools and build the application for release there. I personally build on Debian Squeeze, which is old enough to guarantee compatibility with all the major distributions still in use, but recent enough to support my development toolchain.
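A small guard in the release script can catch the mistake of building on the wrong machine. This is a sketch under assumptions: it relies on getconf GNU_LIBC_VERSION, which is specific to glibc systems, and the helper names (version_le, build_glibc_version, check_build_host) are hypothetical. The baseline 2.13 in the usage comment is just the example version used earlier, not a recommendation.

```sh
# version_le A B: succeeds when dotted version A is lower than or equal to B.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)" = "$1" ]
}

# Prints the glibc version of the machine this runs on.
# getconf prints something like "glibc 2.13"; only the number is kept.
build_glibc_version() {
    getconf GNU_LIBC_VERSION | awk '{ print $2 }'
}

# check_build_host BASELINE: refuses to continue when this machine's glibc
# is newer than the oldest version the release is meant to support.
check_build_host() {
    host_version="$(build_glibc_version)"
    if [ -z "$host_version" ]; then
        echo "could not determine the glibc version" >&2
        return 1
    fi
    if version_le "$host_version" "$1"; then
        echo "ok: building against glibc $host_version (baseline $1)"
    else
        echo "refusing to build: glibc $host_version is newer than baseline $1" >&2
        return 1
    fi
}

# Typical use at the top of a release script (not run here):
#   check_build_host 2.13 || exit 1
```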

As I said at the beginning, I’m not a low-level expert, and there is probably a cleaner way to achieve truly portable applications on Linux without making the users compile from source than the one described here. I’m looking forward to learning it. Please leave a comment if you have any suggestions or criticisms.

9 thoughts on “Things I’ve learned about making portable binaries”

  1. The good way forward is to package your app as a container and package all dependencies of your app inside the container. Learning about using docker containers to do this would be a good place to start.

    • Honestly, that’s side-stepping the problem. There is real value in understanding how to make a portable binary. Docker is a security nightmare and it lives in a different world with its own routing, port forwarding, configurations, etc, and many standard Unix tools can’t be used there. There’s also a noticeable performance hit that is unacceptable when, say, a 10% difference means thousands of dollars more to spend in servers.

      Don’t get me wrong, Docker is revolutionary. Just don’t expect it to fix all your problems magically.

    • Ridiculous. A “good way”? If the topic is about portable binaries, how can relying on a huge dependency like “docker” be considered a good idea? When someone responds to a technical problem by side stepping the real issue and suggesting some hip library/framework, it reminds me of the quote “if all you have is a hammer, everything looks like a nail”…

  2. Hi,

    These are good tricks if you’re working with your own code and your own distribution mechanism. If, however, you want static binaries for popular projects, you can grab some from the recent static distributions (you could also learn a couple of tricks by reading their recipes, like using alternative libc libraries), e.g. bifrost, morpheus, rlsd2, etc.

    In the last case you can also use static-get to download any of the ~800 available packages, eg.

    sh <(wget -qO- s.minos.io/s) bash curl wget ffmpeg …

  3. In addition to the other static linking woes, glibc can’t be statically linked because of its method of support for nsswitch — it still requires a greater-than-or-equal version of glibc at runtime even if a tool like ldd reports it’s a static binary.

    Shared libraries are just a hive of villainy and shame, in general.

  4. I have run into many of the same issues that you seem to be addressing. I’m currently working with an embedded system where we cannot install ANY packages. Each of our applications needs to have ‘everything’ needed to run it in a single directory.
    As a 2-step process:
    1) I use the ‘cpld’ script http://h3manth.com/content/copying-shared-library-dependencies to copy all the dependencies of a binary executable to a specified directory.
    2) I use patchelf to patch the r-path of all the .so’s to run from the current directory.

  5. Great post, thanks!
    It seems like it might be a lot of extra effort, but have you tried statically linking musl as your libc? I do wonder how much larger all your .so dependencies would become since they’d all be statically linked to it too.
