Recursive problems

Recursivity is a nice way of expressing complex code. Consider the problem of displaying collections of collections of collections in HTML. For each depth level, we want to start a new sub-list. The following code does this in Python in a very compact and expressive way:

def ToHTMLList(thing):
  try:
    lines = itertools.imap(ToHTMLList, thing)
    return '<ul><li>%s</li></ul>' % '</li><li>'.join(lines)
  except TypeError:
    return str(thing)

Except there is a fatal bug in this code, can you find it?

The code works fine with a list of number [1,3,4,6] or recursive collections [[1,2,3][4,5,]] of any type of object that defines a proper __str__() method. On the other hand, the code crashes if given a string, or lists of strings. The code goes into an infinite recursivity loop. This is due to a strange design artefact of python: strings are infinitely recursive structures: If s is a non empty string, then s[0] == s[0][0] == s[0][0][0] == s[0][0][0][0] etc.

The fact that a string is considered as a collection of characters makes sense if they are separate entities. If single characters are also strings, this decision seems very arbitrary, a string might as well be a sequence of words or a sequence of lines. The later is actually the choice for files, even though both files and strings are basically sequences of bytes. The fact that the latter has an associated position pointer does not seem a good reason for considering them different collections. Things get even more confused if you consider unicode strings. A unicode character is a very fuzzy entity, and accessing a single character makes as little sense as accessing the four middle bits in a floating point number.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.