
One important aspect of a good programming language is that the effect of code should be predictable. I wanted to write a small utility program that takes its input from stdin and transforms it to make it bigger on stdout. This is typically the kind of quick hacking you could do in Python. So I wrote the following code:

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    import sys

    def main(argv):
        for line in sys.stdin:
            line = line.rstrip()
            sys.stdout.write('\x1b#3')
            sys.stdout.write(line)
            sys.stdout.write('\n\x1b#4')
            sys.stdout.write(line)
            sys.stdout.write('\n')

    if __name__ == "__main__":
        main(sys.argv[1:])

This worked fine with commands that terminate (and close the file descriptor) but failed with continuous commands like ping: the code blocks and seems to wait for the stdin descriptor to be closed. It is not just a buffering issue, because ping would eventually fill up any buffer, whatever its size.
I tried to turn off Python input buffering with the -u command line flag: no change. Switching from the implicit iterator to readlines(80), i.e. passing an explicit size hint, did not solve the issue either. So I ended up rewriting the code the following way:

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    import sys

    def main(argv):
        while True:
            line = sys.stdin.readline()
            if not line:
                break  # EOF = empty line
            line = line.rstrip()
            sys.stdout.write('\x1b#3')
            sys.stdout.write(line)
            sys.stdout.write('\n\x1b#4')
            sys.stdout.write(line)
            sys.stdout.write('\n')
            sys.stdout.flush()

    if __name__ == "__main__":
        main(sys.argv[1:])

The fact that there is a functional difference between calling readlines() and repeatedly calling readline() until end of file is really not intuitive, and it is typically the type of black magic that languages should avoid. If Python files were generators, this mess would not exist.
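
For what it is worth, the readline() loop can also be written as a for loop with the two-argument form of iter(), which keeps calling readline() until it returns the empty string. The following is only a sketch: the function name double_height and the stream parameters are made up for illustration, and the demonstration at the bottom uses io.StringIO under Python 3 as a stand-in for a real pipe.

```python
import io


def double_height(instream, outstream):
    """Write each input line twice: once prefixed with ESC # 3 and once
    with ESC # 4 (the top and bottom halves of a DEC double-height line)."""
    # iter(callable, sentinel) calls instream.readline() repeatedly and
    # stops when it returns '', which is what readline() gives at EOF.
    for line in iter(instream.readline, ''):
        line = line.rstrip()
        outstream.write('\x1b#3' + line + '\n')
        outstream.write('\x1b#4' + line + '\n')
        outstream.flush()  # push each pair of lines out immediately


# Demonstration on in-memory streams (hypothetical sample input).
source = io.StringIO("hello\nworld\n")
sink = io.StringIO()
double_height(source, sink)
print(repr(sink.getvalue()))
```

In the actual script the call would be double_height(sys.stdin, sys.stdout), after importing sys.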
As almost always, I don’t really agree with your comments about whether or not Python is a good language. For me, it’s a matter of taste.
However, about readline() and readlines(), there is a bit of a convention here: readline() returns one line at a time, while readlines() returns a _list_ of lines and tries to buffer it completely in memory. So repeatedly calling readline() doesn’t give you exactly the same result; you would have to collect the lines into a list first.
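
To make that difference concrete, here is a minimal sketch that uses an in-memory io.StringIO as a stand-in for a real file (the sample text is invented for illustration, and the snippet assumes Python 3):

```python
import io

stream = io.StringIO("one\ntwo\nthree\n")

# readline() returns a single string, trailing newline included.
first = stream.readline()

# readlines() returns a list holding every remaining line.
rest = stream.readlines()

# Once the stream is exhausted, readline() returns an empty string.
at_eof = stream.readline()

print(repr(first))
print(rest)
print(repr(at_eof))
```

The empty string at the end is the value the `if not line: break` test relies on.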
The second important thing is that the file interface of Python waits for an EOF before processing the file itself. So, if your file is ping continuously piping lines into Python's sys.stdin, you should not see anything before it returns an EOF.
So your example is the right thing to do in Python 2. I would have done something like this:
"""
import sys

def main(argv):
    while True:
        try:
            line = sys.stdin.readline()
        except KeyboardInterrupt:
            break
        if not line:
            break
        line = line.rstrip()
        sys.stdout.write('\x1b#3')
        sys.stdout.write(line)
        sys.stdout.write('\n\x1b#4')
        sys.stdout.write(line)
        sys.stdout.write('\n')
        sys.stdout.flush()

if __name__ == "__main__":
    main(sys.argv[1:])
"""
But please note that this is not the way it happens in Python 3, where the annoying buffering you mentioned has been removed, and your original code sample works just fine… (Python 3 is a bit more modern, shall we say…)
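
As a rough sketch of what a Python 3 version could look like, assuming plain iteration over the input is acceptable there: the function name enlarge and its parameters are hypothetical, and the flush=True is an addition of mine, since Python 3 block-buffers stdout when it is not attached to a terminal.

```python
import io
import sys


def enlarge(lines, out=sys.stdout):
    """Print each line twice, prefixed with ESC # 3 and then ESC # 4."""
    for line in lines:
        line = line.rstrip()
        print('\x1b#3' + line, file=out)
        # flush=True sends both lines out now, even through a pipe.
        print('\x1b#4' + line, file=out, flush=True)


# Demonstration on an in-memory stream (hypothetical sample input).
sink = io.StringIO()
enlarge(io.StringIO("ping\n"), sink)
print(repr(sink.getvalue()))
```

In a real script the call would simply be enlarge(sys.stdin).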