One of the most annoying claims about python is that it is a high-level language. There is no clear definition of what a high-level language is, but the general idea is that the programmer does not need to think about the low-level details of the code that is generated and can work with more abstract types and concepts. The idea being that the programmer writes his intent in a clear way, and the language does the right thing.
for v in range(large_value): doStuff()
A typical example of an operator that does not do the right thing is
range. Range is very convenient and used a lot in the Python code I have read, it is the pythonic way of avoid index counters when looping over ranges. Range has strange semantics: the meaning of the first argument changes if there is a second one, so you cannot call it with keyword arguments, like
range(start=0, stop=10). But the true problem of range is that should never use it, as it broken by design: range just builds a list with said range, potentially using a huge amount of memory. The code in the snippet is correct, but will just kill your memory.
if v in xrange(a, large_value): doStuff()
People used to Python will tell you to just use
xrange which does the right thing, kind of: xrange does not build a list, but creates iterators when needed. In Python 3,
range range behaves like
xrange. Consider the code on the right. Readable, semantically clear, will kill your machine: the search is done in O(n), where n is the size of the range. People who know Python will scoff and tell you one should not do this. Why not? If you think about it, a range is both a set (a collection of unique items), and a sequence (a collection of ordered items). What does python think about it?
x = xrange(10) isinstance(x, collections.Set) → False isinstance(x, collections.Sequence) → True x[1:3] → TypeError: sequence index must be integer, not 'slice'
Walks like a set, but python does not acknowledge that it is a set. Why is
xrange not a set? If it is a sequence, why can’t I read a slice of it? How hard would it be to write a proper range class for Python?
class IntRange(collections.Set, collections.Sequence): __slots__ = ('start', 'stop', 'step') def __init__(self, *args): if len(args) > 3 or len(args) < 1: raise ValueException() if len(args) == 3: self.step = args else: self.step = 1 if len(args) > 1: self.start = args self.stop = args else: self.start = 0 self.stop = args def __iter__(self): return xrange(self.start, self.stop, self.step).__iter__() def __len__(self): return (self.stop - self.start) / self.step def __contains__(self, value): if type(value) != int: return False if (value - self.start) % self.step: return False if self.step > 0: return value >= self.start and value < self.stop else: return value <= self.start and value > self.stop def __getitem__(self, index): value = index * self.step + self.start if self.step > 0 and value >= self.stop: raise IndexError() if self.step < 0 and value <= self.stop: raise IndexError() return value def __getslice__(self, start, end): return IntRange(self[start], self[end], self.step)
Voilà: a class that is functionally equivalent to
xrange, but can find if it contains a value in O(1), and supports slicing. What would be cool to add is all the set operations defined in the
frozenset class: intersection, union. What is a mystery to me is why
xrange is so crippled by design. Must be a pythonic thing.