nfirvine.com wiki

FatELF

Filed in: Writing.FatELF · Modified on : Fri, 27 May 11

As discussed recently on Phoronix and detailed at his site, Ryan Gordon aka Icculus (who you might know as the author of manymouse, which I wrote some Python wrappers for) has created FatELF, the free software equivalent of OS X's Universal Binary. Read the article/site for details, but essentially it allows one to bundle multiple executable binaries into a single "fat" binary. The header of a FatELF binary includes a table with pointers to the various embedded binaries, so that ld.so would just pick out the correct one for that particular architecture and run it.
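To make the table idea concrete, here is a minimal Python sketch. The layout is made up for illustration (a hypothetical magic value, a one-byte record count, and fixed-size records holding an arch name, an offset and a size); it is not the real FatELF on-disk format, and build_fat and pick_binary are names invented for this example.

```python
import struct

# Hypothetical, simplified layout for illustration only (NOT the real FatELF format):
#   header: 4-byte magic, 1-byte record count
#   record: 16-byte NUL-padded arch name, 8-byte offset, 8-byte size (little-endian)
MAGIC = b"FAT!"
HEADER = struct.Struct("<4sB")
RECORD = struct.Struct("<16sQQ")


def build_fat(binaries):
    """Bundle {arch_name: payload_bytes} into one container with a lookup table."""
    table_size = HEADER.size + RECORD.size * len(binaries)
    table = HEADER.pack(MAGIC, len(binaries))
    body = b""
    for arch, payload in binaries.items():
        # Offsets are measured from the start of the container.
        offset = table_size + len(body)
        table += RECORD.pack(arch.encode(), offset, len(payload))
        body += payload
    return table + body


def pick_binary(fat, wanted_arch):
    """Walk the table and return the payload for wanted_arch, as a loader might."""
    magic, count = HEADER.unpack_from(fat, 0)
    if magic != MAGIC:
        raise ValueError("not a fat container")
    for i in range(count):
        name, offset, size = RECORD.unpack_from(fat, HEADER.size + i * RECORD.size)
        if name.rstrip(b"\0").decode() == wanted_arch:
            return fat[offset:offset + size]
    raise LookupError(f"no binary for {wanted_arch}")
```

Calling pick_binary(build_fat({"x86_64": b"...", "armv7": b"..."}), "armv7") hands back only the armv7 payload; the other payload is never touched, which is the whole trick.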

There are several comments I've heard around the issue that I thought I'd address.

This is really only useful for proprietary software on Linux.

Free software for Linux has two established ways of distributing binaries for multiple platforms: as source, to be compiled by the end user for whatever architecture they need; and precompiled by the distro for whatever myriad archs it supports. Thus, the only parties that a fat binary format would benefit are the ones that can't distribute their software as source or via official repos: namely proprietary vendors, and helping them hurts free software.

Firstly, I'm not entirely sure that having proprietary software on a free software platform is necessarily evil: it promotes the usage of GNU+Linux as a whole, and may act as a gateway to producing free software; many pieces of software (Blender, for example) began life as proprietary and became Free later on in life. However, this is neither the time nor the place.

The problem with this argument is that it excludes someone: me. I produce a little software for Linux here and there, and I think it can come in useful for some people. Thus far, I've only released source, with the occasional binary built on my own system. By doing so, I'm limiting my potential audience somewhat: I think there's a significant number of people who:

  • Use GNU+Linux; and
  • Don't know how or otherwise are unable to compile from source; or
  • Don't want to put in the effort to compile from source (retrieving the proper dependencies is one significant hurdle)

So let's say I decide I've got some application that I think is pretty decent and a number of people are using it, but having to compile from source is either a deterrent or a show-stopper to newcomers and updaters. How best to distribute a precompiled binary? There are at least a hundred and one solutions, but to enable the greatest number of users to install it, I would have to create a package for each distro on each arch for each version. It quickly becomes a nightmare, and I'm stuck packaging my app instead of improving it.

There is lots of useful software out there that you just can't get through repos, and without the help of the build farms and package maintainers at the disposal of the major distros, it's simply too much of a time-suck to provide precompiled binaries myself. FatELF can help with this by allowing me to only have to maintain one binary, and then only one package per distro, each containing the same binary. It does this in two ways:

  • I can roll different builds for different archs into one FatELF.
  • If the distros use FatELF to build their libraries on which my app depends, the system could easily have multiple versions of the same library installed. I imagine that I could state the (minimum) version required in the headers of my FatELF binary, and then the FatELF loader would automatically use the proper library.

Bloated binary sizes

OS X only has two architectures to support, while most GNU+Linux distributions need to support many more (e.g., Debian supports 11). This means that, assuming all binaries for all architectures are roughly the same size given the same source, binary sizes will grow by an order of magnitude.
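The arithmetic behind this objection is simple enough to write down. A minimal sketch, assuming (as the objection does) that every architecture's build is the same size; the 2 MiB figure and the fat_size helper are made up for illustration.

```python
def fat_size(thin_size, num_archs, table_overhead=0):
    """Estimated fat binary size, assuming each arch's build is thin_size bytes."""
    return thin_size * num_archs + table_overhead


MIB = 1024 * 1024
thin = 2 * MIB  # hypothetical 2 MiB single-arch binary

for archs in (1, 2, 11):
    print(archs, "archs:", fat_size(thin, archs) // MIB, "MiB")
```

Two archs merely doubles the size; it takes something like Debian's full arch list to reach the order-of-magnitude figure.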

There are really two concerns here, and they should be addressed separately. First is the transfer of such large files over the network, or making the size of an install disc too large. Second is the size of the installed system.

Most counterarguments to the transfer problem state that it is really a non-issue since Internet transfers continue to get cheaper and faster. I personally hate this response. There is no excuse for being wasteful, because waste always comes back to bite you. And while Internet speeds continue to get faster, there are many links in the chain which are almost at a standstill. The time to burn a DVD is roughly the same as it ever was (and if we spin a plastic disc any faster, we risk it tearing apart in the drive). Hard drive transfer speeds are the same as a decade ago (SSDs promise to solve this, of course), and I think people in general are moving to machines with slower drives (laptops). With FatELFs, you have people downloading more than they have to.

Really, there is not much of a way around this. I suppose compression could help, as some portions of the FatELF might be similar. But ultimately if you need to produce a binary that will work for all users of all architectures, they will all have to download it, even the parts they don't need.

The standard counterargument to the storage problem is that binaries are small and don't account for much of the space used on your hard drive. Well, I think that they will, once they're ten times their current size. Also, we need to consider that a substantial portion of GNU+Linux installations are on embedded devices with small amounts of storage. And again, "waste not, want not".

This problem is a little more easily addressed. For most users, a ten-fold increase in the size of their binaries is not going to be that noticeable: after all, installations of Windows are hovering around the 10GB mark these days. For the users where space is at a premium, it is a trivial exercise to strip unnecessary archs out of the FatELF and adjust the header accordingly, just as the strip command is used to strip mostly superfluous debugging symbols from ThinELFs to minimise space-usage.
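Stripping archs amounts to dropping table entries and payloads, then rewriting the offsets. A minimal Python sketch of that idea follows. It uses a made-up container layout (hypothetical magic, one-byte record count, fixed-size name/offset/size records), not the real FatELF format, and read_records, write_fat and strip_archs are names invented for this example rather than any existing tool.

```python
import struct

# Hypothetical, simplified layout for illustration only (NOT the real FatELF format):
#   header: 4-byte magic, 1-byte record count
#   record: 16-byte NUL-padded arch name, 8-byte offset, 8-byte size (little-endian)
MAGIC = b"FAT!"
HEADER = struct.Struct("<4sB")
RECORD = struct.Struct("<16sQQ")


def read_records(fat):
    """Return {arch_name: payload_bytes} for every entry in the table."""
    magic, count = HEADER.unpack_from(fat, 0)
    if magic != MAGIC:
        raise ValueError("not a fat container")
    records = {}
    for i in range(count):
        name, offset, size = RECORD.unpack_from(fat, HEADER.size + i * RECORD.size)
        records[name.rstrip(b"\0").decode()] = fat[offset:offset + size]
    return records


def write_fat(binaries):
    """Write {arch_name: payload_bytes} as a container with fresh offsets."""
    table_size = HEADER.size + RECORD.size * len(binaries)
    table = HEADER.pack(MAGIC, len(binaries))
    body = b""
    for arch, payload in binaries.items():
        table += RECORD.pack(arch.encode(), table_size + len(body), len(payload))
        body += payload
    return table + body


def strip_archs(fat, keep):
    """Drop every arch not named in keep and adjust the header to match."""
    kept = {arch: data for arch, data in read_records(fat).items() if arch in keep}
    if not kept:
        raise ValueError("refusing to strip every architecture")
    return write_fat(kept)
```

On a space-constrained box one would run something like strip_archs(fat, {"armv7"}) once at install time and keep only the result.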

There is another side to this as well. Distro mirrors would have to accommodate the increased size. I'm not sure this is much of a problem, though.

But...

But none of this really matters.

I think Ryan made a bit of a misstep by demoing FatELF by recreating an entire distro (Ubuntu 9.04) out of FatELF binaries: I don't think that your entire OS really needs to be FatELF (or at least it could be single-arch FatELFs). Moving your OS installation from one arch to another seamlessly is probably a fairly niche usage scenario. And by making the entire OS dual-arch, he's almost suggesting that we should all install FatELF binaries with 12 different superfluous archs, which is unnecessary.

None of this really matters because FatELF obviously has some uses, and the overhead of making the FatELF system available is (generally) completely negligible. It makes little sense not to include the capability. It's like, say, the drivers for some obscure filesystem included with the kernel. Yes, they're there taking up a little disk space, but they're not hurting anyone, and when I come across that filesystem eventually, I'll be glad they're there.

Ultimately, free software is about choice. And I would like the choice to be able to execute FatELFs. The distros can continue to ship ThinELFs as they see fit; there are obviously scenarios where a ThinELF would be better. But FatELFs sound like they have their place as well.

