(English) Python woes – Libraries
One might debate whether issues in the libraries associated with a programming language are part of the language. I would argue they are, simply because in practice nobody uses a language without libraries (except to write one-liners that show how cool a language is). One of my annoyances with Python is that although the language has a rich set of libraries doing all sorts of things, they are often inconsistent and often feel as if they were written for an old version of the language, using neither its newer features nor other recent libraries.
One of the very elegant features of Python is generators, which avoid horrible hacks like callbacks. Yet, as of Python 2.6, the standard library still ships a way to go over a file-system that takes a callback as an argument:
os.path.walk. A generator-based alternative,
os.walk, has been available since Python 2.3, but the callback version sits right next to it, so there are two ways to do the same thing with unrelated interfaces. Another nice addition of Python 2.6 is
collections.namedtuple, which lets one define light-weight classes that behave like tuples but whose fields can be accessed by name. This is a nice way to maintain backward compatibility while moving to a more readable model. Some Python classes implement their own ad-hoc named tuple classes, for instance
urlparse.ParseResult, while some functions still return unnamed tuples, like
os.popen3. Code that manipulates tuple fields purely by position is unreadable and error-prone.
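To make the two features concrete, here is a minimal sketch of a generator that walks a directory tree and yields named tuples. It builds on os.walk, the generator-based traversal function in the os module; the FileEntry type and the iter_files name are made up for this example.

```python
import os
from collections import namedtuple

# FileEntry is a hypothetical record type introduced for this sketch.
FileEntry = namedtuple("FileEntry", ["path", "size"])


def iter_files(root):
    """Yield a FileEntry for every file found below root."""
    # os.walk is itself a generator of (dirpath, dirnames, filenames) triples.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip entries that vanish or cannot be read (e.g. broken links).
                continue
            yield FileEntry(path=path, size=size)
```

A caller can write `for entry in iter_files("."): print(entry.path, entry.size)` and read the fields by name, while `path, size = entry` still works for code that expects a plain tuple.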
The classic example of the baroque structure of Python’s libraries is the set related to time. The package to use when manipulating dates is
datetime. You typically create instances of
datetime.datetime by using one of the static methods
now() or
fromtimestamp(). Both of those methods have a UTC variant, which builds an instance in the UTC time-zone. First problem: the instance does not record which time-zone it was created in, not even a boolean that tells whether the data is UTC or local. Basically, if you want time-zone support, you need a package that is not part of the default installation. The other problem with
datetime is that the methods provided by this class are not symmetric. In Python 2.6, there is a
fromtimestamp() method, but no
totimestamp() method, so the way to get a timestamp back is
time.mktime(d.timetuple()), which is not exactly readable. Similarly, the
datetime class has an
isoformat() method to display the date in ISO 8601 format, but there is no method to parse dates in that format.
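The missing inverse operations can be written by hand. The sketch below is one possible set of helpers, not a standard API: the function names and the ISO_FORMAT constant are invented here, and the parser assumes a string with no fractional seconds and no UTC offset.

```python
import calendar
import time
from datetime import datetime

# Assumed layout: what isoformat() prints for a naive datetime
# whose microsecond field is zero.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"


def to_timestamp_local(d):
    """Inverse of datetime.fromtimestamp() for a naive local-time datetime."""
    return time.mktime(d.timetuple())


def to_timestamp_utc(d):
    """Seconds since the epoch for a naive datetime that holds UTC."""
    return calendar.timegm(d.timetuple())


def parse_iso(text):
    """Parse the output of isoformat() under the assumptions above."""
    return datetime.strptime(text, ISO_FORMAT)


d = datetime(2009, 2, 13, 23, 31, 30)  # naive: the caller must remember it is UTC
print(to_timestamp_utc(d))             # 1234567890
print(parse_iso(d.isoformat()) == d)   # True
```

Note that the caller has to pick the local or the UTC helper from memory, because the instance itself cannot say which one applies. Later Python 3 releases added datetime.timestamp() and datetime.fromisoformat(), which cover both gaps.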
Another example of inconsistency is the serialisation libraries. Python 2.6 basically supports three serialisation formats out of the box: pickle, json and plist. The last two arrived in the cross-platform standard library with Python 2.6. Here are the methods to manipulate those formats; can you spot the problem?
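For reference, in Python 2.6 pickle and json both expose dump(), load(), dumps() and loads(), while plistlib uses writePlist(), readPlist(), writePlistToString() and readPlistFromString(). The sketch below does the same round trip through all three modules. It is written against Python 3, where plistlib was later renamed to match the other two, and the sample record is made up.

```python
import json
import pickle
import plistlib

record = {"name": "example", "values": [1, 2, 3]}  # hypothetical sample data

# Python 3 names: all three modules expose dumps() and loads().
# The Python 2.6 plistlib equivalents were writePlistToString()
# and readPlistFromString().
round_trips = {
    "pickle": pickle.loads(pickle.dumps(record)),
    "json": json.loads(json.dumps(record)),
    "plist": plistlib.loads(plistlib.dumps(record)),
}

for name, value in sorted(round_trips.items()):
    # Each line reports whether the data survived the round trip unchanged.
    print(name, value == record)
```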
Probably the most inconsistent set of Python libraries is the one for executing sub-processes: there are four families of them. The
subprocess module claims that its aim is to replace the other libraries, but none of them admits in its inline documentation that it might be deprecated. The same goes for command-line argument parsing, where there are
getopt,
optparse and
argparse. Again, the online documentation hints that the first two are deprecated and that one should be using the last one, but neither module's inline documentation mentions it.
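As an illustration of what the subprocess route looks like, here is a sketch of a small wrapper that covers the stdin/stdout/stderr case of os.popen3 and returns a named tuple rather than a bare one. The CommandResult type and the run function are hypothetical names chosen for this example.

```python
import subprocess
import sys
from collections import namedtuple

# CommandResult is a hypothetical record type, not part of the standard library.
CommandResult = namedtuple("CommandResult", ["returncode", "stdout", "stderr"])


def run(args, input_bytes=None):
    """Run a command and return its exit code and captured output."""
    process = subprocess.Popen(
        args,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() feeds stdin, drains both pipes and waits for the child.
    stdout, stderr = process.communicate(input_bytes)
    return CommandResult(process.returncode, stdout, stderr)


# The child is the current Python interpreter, so the sketch stays portable.
result = run([sys.executable, "-c", "import sys; sys.stdout.write('hello')"])
print(result.returncode, result.stdout)
```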
Unsurprisingly, there are two libraries to open URLs:
urllib and
urllib2. You might think that
urllib2 would be the most advanced one, but the one that supports RFC 2397 data URLs is, of course,