In my quest to write a smaller Mach-O Hello World program, I have reached the point where it was time to dust off my assembly skills. Finding a hello world program in assembly for OS X was easy: I found a small snippet on the web, compiled it following the instructions contained in the file (gcc hello.s -o hello), and got an executable that is 12㎅ long. Looking at the binary, I realized that it contains all the scaffolding for dynamic libraries and such, even though it does not depend on any external library and makes its system calls explicitly (int $0x80). What I needed to do was link the code statically. It turns out that static linking is never used on OS X, except for kernel development, but the linker still supports it. So I wrote a small Makefile:
hello: hello.o
	ld -static -seg1addr 0 -segalign 2 -s -dead_strip -no_uuid hello.o -o hello -e _main

hello.o: hello.s
	gcc -c hello.s -o hello.o
The -static flag enables static linking (and actually dispatches control to another linker, called ld_classic). The second parameter, -seg1addr 0, specifies that the first segment (the code) should start at address 0; this prevents the generation of a page 0 segment (the normal flag for that is not recognized in this mode). Next, -segalign 2 tells the linker to align all segments on a 2-byte boundary, so that everything gets packed in neatly. Then -dead_strip gets rid of dead code and -no_uuid prevents the generation of an identifier for the code. Finally, -e _main specifies the code's entry point (the symbol _main).
The resulting binary is 444 bytes long (3.5% of the size of the binary produced directly by gcc). Clearly we have come a long way, but we could still save a few bytes: first by getting rid of the empty LC_SYMTAB load command, then by moving the data into the zeros in the header area. For instance, if one segment could be renamed to the string “Hello World”, we would just need to point there. After all, the binary compresses down to 173 bytes, so there is still redundancy to remove. Here is the summary of the binary’s content:
Load command 0
  cmd LC_SEGMENT
  cmdsize 124
  segname __TEXT
  vmaddr 0x00000000
  vmsize 0x000001bc
  fileoff 0
  filesize 444
  maxprot 0x00000007
  initprot 0x00000005
  nsects 1
  flags 0x0
Section
  sectname __text
  segname __TEXT
  addr 0x00000188
  size 0x00000034
  offset 392
  align 2^0 (1)
  reloff 0
  nreloc 0
  flags 0x00000000
  reserved1 0
  reserved2 0
Load command 1
  cmd LC_SYMTAB
  cmdsize 24
  symoff 0
  nsyms 0
  stroff 0
  strsize 0
Load command 2
  cmd LC_UNIXTHREAD
  cmdsize 80
  flavor i386_THREAD_STATE
  count i386_THREAD_STATE_COUNT
  eax 0x00000000 ebx 0x00000000 ecx 0x00000000 edx 0x00000000
  edi 0x00000000 esi 0x00000000 ebp 0x00000000 esp 0x00000000
  ss 0x0000001f eflags 0x00000000 eip 0x00000198 cs 0x00000017
  ds 0x0000001f es 0x0000001f fs 0x00000000 gs 0x00000000
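To double-check a dump like this one, the load commands can also be walked by hand. The following is a minimal Python sketch, not the tool that produced the listing above: it assumes a little-endian 32-bit (i386) image, and the function name load_commands and the small LC_NAMES table are my own choices for illustration.

```python
import struct

MH_MAGIC = 0xFEEDFACE  # magic number of a 32-bit Mach-O file

# mach_header: magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags
HEADER = struct.Struct("<7I")   # 7 x uint32 = 28 bytes
# Every load command starts with: cmd, cmdsize
LOAD_CMD = struct.Struct("<2I")

# Only the three commands that appear in the dump (illustrative, not exhaustive)
LC_NAMES = {0x1: "LC_SEGMENT", 0x2: "LC_SYMTAB", 0x5: "LC_UNIXTHREAD"}


def load_commands(data):
    """Return ([(name, cmdsize), ...], offset just past the last load command)."""
    magic, _cpu, _sub, _ftype, ncmds, _sizeofcmds, _flags = HEADER.unpack_from(data, 0)
    if magic != MH_MAGIC:
        raise ValueError("not a little-endian 32-bit Mach-O image")
    offset = HEADER.size
    commands = []
    for _ in range(ncmds):
        cmd, cmdsize = LOAD_CMD.unpack_from(data, offset)
        if cmdsize < LOAD_CMD.size:
            raise ValueError("corrupt load command size")
        commands.append((LC_NAMES.get(cmd, hex(cmd)), cmdsize))
        offset += cmdsize  # cmdsize covers the whole command, header included
    return commands, offset

# Usage (assuming the linked binary is in the current directory):
#   commands, end = load_commands(open("hello", "rb").read())
```

With the figures from the dump, the 28-byte header plus the three commands (124 + 24 + 80 bytes) end at offset 256, while the __text section only starts at offset 392. The 136 bytes in between are presumably the zeros in the header area mentioned earlier, and dropping LC_SYMTAB would save another 24 bytes.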
In the same category (I can no longer follow anything at this level, it just amazes me):
http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
https://stackoverflow.com/questions/284797/hello-world-in-less-than-20-bytes
(That discussion is already 6 years old!)
PS: The pingback from https://wiesmann.codiferes.net/wordpress/?p=682 no longer works, the domain has changed.
It's fixed, thanks.