Yet a smaller world, going assembly

Hello World Disassembly

In my quest to write a smaller Mach-O Hello World program, I have reached the point where it is time to dust off my assembly skills. Finding a hello world program in assembly for OS X was easy: I found a small snippet on the web, compiled it following the instructions contained in the file (gcc hello.s -o hello), and got an executable that is 12 KB long. Looking at the binary, we see that it contains all the scaffolding for dynamic libraries and such, even though it does not depend on any external library and makes its system calls explicitly (int $0x80). What I needed to do was link the code statically. It turns out that static linking is hardly ever used on OS X, except for kernel development, but the linker still supports it. So I wrote a small Makefile:

hello: hello.o
	ld -static -seg1addr 0 -segalign 2 -s -dead_strip -no_uuid hello.o -o hello -e _main

hello.o: hello.s
	gcc -c hello.s -o hello.o

The -static flag enables static linking (and actually dispatches control to another linker, called ld_classic). The second parameter, -seg1addr 0, specifies that the first segment (the code) should start at address 0; this prevents the generation of a page zero segment (the normal flag for that is not recognized in this mode). Next, -segalign 2 tells the linker to align all segments on a 2-byte boundary, so that everything gets packed in tightly. Then -s strips the symbol information, -dead_strip gets rid of dead code, and -no_uuid prevents the generation of an identifier for the binary. Finally, -e _main specifies the code’s entry point (the symbol _main).
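One way to check what these flags actually did is to list the load commands of the linked file and confirm that there is no page zero segment and no LC_UUID. Here is a minimal Python sketch of that check. It assumes a 32-bit little-endian Mach-O file (magic 0xfeedface), which is what an i386 build produces; the function name list_load_commands and the small name table are my own, not part of any tool.

```python
import struct

# A few load command numbers from <mach-o/loader.h>; anything else is shown in hex.
LC_SEGMENT = 0x1
LC_NAMES = {0x1: "LC_SEGMENT", 0x2: "LC_SYMTAB", 0x5: "LC_UNIXTHREAD", 0x1b: "LC_UUID"}

MH_MAGIC = 0xfeedface     # 32-bit Mach-O magic number
MACH_HEADER_SIZE = 28     # seven 4-byte fields


def list_load_commands(data):
    """Return a list of (name, cmdsize, segname) for a 32-bit little-endian Mach-O image.

    segname is only filled in for LC_SEGMENT commands, otherwise it is None.
    """
    magic, _cputype, _cpusubtype, _filetype, ncmds, _sizeofcmds, _flags = \
        struct.unpack_from("<7I", data, 0)
    if magic != MH_MAGIC:
        raise ValueError("not a 32-bit little-endian Mach-O file")

    commands = []
    offset = MACH_HEADER_SIZE
    for _ in range(ncmds):
        # Every load command starts with its type and its total size in bytes.
        cmd, cmdsize = struct.unpack_from("<2I", data, offset)
        if cmdsize < 8:
            raise ValueError("malformed load command")
        segname = None
        if cmd == LC_SEGMENT:
            raw = struct.unpack_from("<16s", data, offset + 8)[0]
            segname = raw.rstrip(b"\0").decode("ascii")
        commands.append((LC_NAMES.get(cmd, hex(cmd)), cmdsize, segname))
        offset += cmdsize
    return commands
```

Calling list_load_commands(open("hello", "rb").read()) on the linked binary should then show a single LC_SEGMENT named __TEXT, with no __PAGEZERO segment and no LC_UUID entry, if the flags behaved as described.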

The resulting binary is 444 bytes long (about 3.5% of the size of the binary produced directly by gcc). Clearly we have come a long way, but we could still save a few bytes: first by getting rid of the empty LC_SYMTAB load command, then by moving the data into the zeroed areas of the header. For instance, if one segment could be renamed to the string “Hello World”, we would just need to point there. After all, the binary can be compressed down to 173 bytes, which suggests there is still room. Here is a summary of the binary’s content:

Load command 0
      cmd LC_SEGMENT
  cmdsize 124
  segname __TEXT
   vmaddr 0x00000000
   vmsize 0x000001bc
  fileoff 0
 filesize 444
  maxprot 0x00000007
 initprot 0x00000005
   nsects 1
    flags 0x0
  sectname __text
   segname __TEXT
      addr 0x00000188
      size 0x00000034
    offset 392
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Load command 1
     cmd LC_SYMTAB
 cmdsize 24
  symoff 0
   nsyms 0
  stroff 0
 strsize 0
Load command 2
        cmd LC_UNIXTHREAD
    cmdsize 80
     flavor i386_THREAD_STATE
      count i386_THREAD_STATE_COUNT
	    eax 0x00000000 ebx    0x00000000 ecx 0x00000000 edx 0x00000000
	    edi 0x00000000 esi    0x00000000 ebp 0x00000000 esp 0x00000000
	    ss  0x0000001f eflags 0x00000000 eip 0x00000198 cs  0x00000017
	    ds  0x0000001f es     0x0000001f fs  0x00000000 gs  0x00000000
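
To see where the remaining bytes go, the numbers in the dump above can simply be added up. The sketch below does that arithmetic in Python. The only value not taken from the dump is the 28-byte size of the 32-bit Mach-O header (seven 4-byte fields); the names are my own.

```python
# Sizes and offsets copied from the load command dump.
MACH_HEADER_SIZE = 28                      # 32-bit mach_header, not shown in the dump
CMD_SIZES = {"LC_SEGMENT": 124, "LC_SYMTAB": 24, "LC_UNIXTHREAD": 80}
TEXT_ADDR = 0x188                          # addr of the __text section
TEXT_OFFSET = 392                          # file offset of the __text section
TEXT_SIZE = 0x34                           # size of the __text section
ENTRY_POINT = 0x198                        # eip in LC_UNIXTHREAD


def layout():
    """Compute where the headers end, how much padding follows, and where the code ends."""
    headers_end = MACH_HEADER_SIZE + sum(CMD_SIZES.values())
    return {
        "headers_end": headers_end,                    # end of header + load commands
        "padding": TEXT_OFFSET - headers_end,          # unused bytes before __text
        "text_end": TEXT_OFFSET + TEXT_SIZE,           # should match the file size
        "entry_offset_in_text": ENTRY_POINT - TEXT_ADDR,
    }


print(layout())
```

With these numbers, the header and load commands end at byte 256, the __text section runs from 392 to 444, and the 136 bytes in between are padding. Dropping LC_SYMTAB would remove 24 bytes of load commands, although whether the linker would then also shrink the padding is not something the dump can tell us. The entry point sits 16 bytes into __text; what occupies those first 16 bytes is not shown here.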
