In a previous post, I blogged about Mach-O binaries, in particular the size of a simple Hello World program. Twelve kilobytes of binary just to display Hello World is a lot. So I wrote a program that parses Mach-O binaries, displays their content, and tries to compact the file. Compaction is done by moving the different segments around in the file and adjusting the file offsets of the sections in those segments (see Figure). Using this technique, I brought the Hello World program down to 4K. This is quite hacky, because it basically means the same page of the file gets mapped into three different memory locations.
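To give an idea of what the parsing step involves, here is a minimal sketch in Python. It is not the tool from the archive, only an illustration that assumes a little-endian 32-bit Mach-O file; the function name list_segments is made up for this example. It walks the load commands and reports, for each segment, where it lives in memory and in the file, which are the fields the compaction plays with.

```python
import struct

MH_MAGIC = 0xFEEDFACE   # magic number of a 32-bit Mach-O header
LC_SEGMENT = 0x1        # load command describing a 32-bit segment
HEADER_SIZE = 28        # seven 32-bit fields in the 32-bit mach_header


def list_segments(data):
    """Return (name, vmaddr, vmsize, fileoff, filesize) for each segment.

    Assumption: `data` holds a little-endian 32-bit Mach-O image.
    """
    magic, _cpu, _sub, _ftype, ncmds, _cmds_size, _flags = struct.unpack_from(
        "<7I", data, 0
    )
    if magic != MH_MAGIC:
        raise ValueError("not a little-endian 32-bit Mach-O file")

    segments = []
    offset = HEADER_SIZE
    for _ in range(ncmds):
        # Every load command starts with its type and its total size.
        cmd, cmdsize = struct.unpack_from("<2I", data, offset)
        if cmd == LC_SEGMENT:
            name, vmaddr, vmsize, fileoff, filesize = struct.unpack_from(
                "<16s4I", data, offset + 8
            )
            segments.append(
                (name.rstrip(b"\0").decode("ascii"), vmaddr, vmsize, fileoff, filesize)
            )
        offset += cmdsize
    return segments
```

Calling list_segments on the bytes of a binary (read with open(path, "rb").read(), where path is whatever file you want to inspect) returns one tuple per segment. When several segments report the same fileoff, they share the same page of the file, which is the situation described above.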
The reason I can’t go below that with the current technique is that the linker puts the binary code at the end of the first segment, between 0f6c and 0fff, both in memory and in the file (the hello world binary I’m using does not have a page-zero segment). Even if I rewrite the code section’s offset in the file, the loader basically maps the file into memory, so what ends up in the memory range 0f6c–0fff is whatever is in the file in that range, and if the file is shorter, this area is filled with zeroes. I could change the entry point address of the program (this is defined in register eip), but the problem is that the start function that gets linked into the code contains absolute addresses.
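The mapping behaviour can be pictured with a toy model in Python. This is only a sketch of the idea, not how the real loader is implemented: the helper map_segment and the constants are invented for the illustration, and the 0x90 bytes are placeholders standing in for the actual code.

```python
PAGE = 0x1000
CODE_START, CODE_END = 0xF6C, 0xFFF   # range where the linker placed the code


def map_segment(file_bytes, fileoff, filesize, vmsize):
    """Toy model of a segment mapping.

    The memory image is the bytes found in the file at [fileoff, fileoff + filesize),
    and anything the file does not provide is zero filled up to vmsize.
    Assumption: filesize <= vmsize, as in a well-formed segment.
    """
    chunk = file_bytes[fileoff:fileoff + filesize]
    return chunk + bytes(vmsize - len(chunk))


# A 4K file with placeholder "code" bytes at the end of the page.
full_file = bytes(CODE_START) + b"\x90" * (CODE_END - CODE_START + 1)
# The same file cut down to 2K: the code range is no longer in the file.
short_file = full_file[:0x800]

full_image = map_segment(full_file, 0, PAGE, PAGE)
short_image = map_segment(short_file, 0, PAGE, PAGE)

code_in_full = full_image[CODE_START:CODE_END + 1]
code_in_short = short_image[CODE_START:CODE_END + 1]
```

With the full file, code_in_full still holds the placeholder bytes; with the truncated file, code_in_short is all zeroes. That is why the code has to stay at the same position in the file as long as its memory address stays the same.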
So to make the program smaller, I need to change the structure of the code (or rewrite the addresses, but this is really annoying), probably by getting rid of the start function. More fun to look forward to.
If you are interested, you can download the archive of the tool along with the code to build the hello-world. This is of course worse than experimental code and should not be used in any way! You can also download the resulting Intel binary.