Python woes – range

Python Logo

One of the most annoying claims about python is that it is a high-level language. There is no clear definition of what a high-level language is, but the general idea is that the programmer does not need to think about the low-level details of the code that is generated and can work with more abstract types and concepts. The idea being that the programmer writes his intent in a clear way, and the language does the right thing.

for v in range(large_value):
  doStuff()

A typical example of an operator that does not do the right thing is range. Range is very convenient and used a lot in the Python code I have read, it is the pythonic way of avoid index counters when looping over ranges. Range has strange semantics: the meaning of the first argument changes if there is a second one, so you cannot call it with keyword arguments, like range(start=0, stop=10). But the true problem of range is that should never use it, as it broken by design: range just builds a list with said range, potentially using a huge amount of memory. The code in the snippet is correct, but will just kill your memory.

if v in xrange(a, large_value):
  doStuff()

People used to Python will tell you to just use xrange which does the right thing, kind of: xrange does not build a list, but creates iterators when needed. In Python 3, range range behaves like xrange. Consider the code on the right. Readable, semantically clear, will kill your machine: the search is done in O(n), where n is the size of the range. People who know Python will scoff and tell you one should not do this. Why not? If you think about it, a range is both a set (a collection of unique items), and a sequence (a collection of ordered items). What does python think about it?

x = xrange(10)
isinstance(x, collections.Set) → False
isinstance(x, collections.Sequence) → True
x[1:3] → TypeError: sequence index must be integer, not 'slice'

Walks like a set, but python does not acknowledge that it is a set. Why is xrange not a set? If it is a sequence, why can’t I read a slice of it? How hard would it be to write a proper range class for Python?

class IntRange(collections.Set, collections.Sequence):

  __slots__ = ('start', 'stop', 'step')

  def __init__(self, *args):
    if len(args) > 3 or len(args) < 1:
      raise ValueException()
    if len(args) == 3:
      self.step = args[2]
    else:
      self.step = 1
    if len(args) > 1:
      self.start = args[0]
      self.stop = args[1]
    else:
      self.start = 0
      self.stop = args[0]

  def __iter__(self):
    return xrange(self.start, self.stop, self.step).__iter__()

  def __len__(self):
    return (self.stop - self.start) / self.step

  def __contains__(self, value):
    if type(value) != int:
      return False
    if (value - self.start) % self.step:
      return False
    if self.step > 0:
      return value >= self.start and value < self.stop
    else:
      return value <= self.start and value > self.stop

  def __getitem__(self, index):
    value = index * self.step + self.start
    if self.step > 0 and value >= self.stop:
      raise IndexError()
    if self.step < 0 and value <= self.stop:
      raise IndexError()
    return value

  def __getslice__(self, start, end):
    return IntRange(self[start], self[end], self.step)

Voilà: a class that is functionally equivalent to xrange, but can find if it contains a value in O(1), and supports slicing. What would be cool to add is all the set operations defined in the frozenset class: intersection, union. What is a mystery to me is why xrange is so crippled by design. Must be a pythonic thing.

One thought on “Python woes – range”

  1. range is one of those annoying things… I don’t think that “crippled by design” is a pythonic thing, since most of what people pretends to be pure pythonic ways of doing things are just ugly misinterpretation of the Zen of Python (since following it is was should define the “pythonic” quality). Well, since I am currently following the process to learn Perl 5, I know that “ugly” is a very relative thing, but Python is not Perl, exception should not be the rule and consistency of the API is a must, not an option.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.