Comma Separated Value (CSV) files are one of the most common ways of passing data around. There is an RFC describing the format (RFC 4180), but it is extremely fuzzy, and the format has many subtle variants. Many people get it wrong: they simply assume they can join values with commas and lines with line breaks, maybe adding quotes around each value. This is bound to fail, as textual data typically contains quotes, commas, and line breaks. The result is numerous bugs, because one program does not properly escape characters, or another does not properly decode the escaped ones.
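To make the failure concrete, here is a minimal Python sketch. The sample row is invented for illustration; it compares a naive comma join with the standard library's csv module, which quotes and escapes values on the way out and undoes that on the way in.

```python
import csv
import io

# Hypothetical sample row: one value with a comma, one with quotes,
# one with an embedded line break.
row = ["Smith, John", 'said "hi"', "line one\nline two"]

# Naive approach: join values with commas and end the line with a newline.
naive = ",".join(row) + "\n"

# Reading the first "line" back and splitting on commas does not
# recover the three original values.
naive_fields = naive.split("\n")[0].split(",")

# The csv module quotes values that need it and doubles embedded quotes.
buf = io.StringIO(newline="")
csv.writer(buf).writerow(row)
encoded = buf.getvalue()

# Decoding with the matching reader recovers the original row.
decoded = next(csv.reader(io.StringIO(encoded, newline="")))

print(len(row), len(naive_fields))
print(decoded == row)
```

The naive version silently produces the wrong number of fields, which is exactly the kind of bug that only shows up once real data arrives.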
The big irony of the situation is that the venerable ASCII code contains characters designed to solve this exact problem: 0x1E (record separator) and 0x1D (group separator). These characters are never used in textual data (and should be removed if they are), so building and decoding a file using those control characters would be much easier and more robust.
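As a rough sketch of how simple such a format could be, the Python below encodes and decodes flat records with no escaping at all. The function names are hypothetical, and the choice of separators is an assumption: it uses the unit separator (0x1F) between values and the record separator (0x1E) after each record, which is one conventional reading of the ASCII separator hierarchy. The same code works with any other pair of separators, such as the record and group separators.

```python
US = "\x1f"  # ASCII unit separator, assumed here to delimit values
RS = "\x1e"  # ASCII record separator, assumed here to terminate records


def encode(rows):
    """Encode a list of rows (lists of strings) without any escaping."""
    for row in rows:
        for value in row:
            # Separators are not allowed inside data; reject rather than escape.
            if US in value or RS in value:
                raise ValueError("value contains a separator character")
    return "".join(US.join(row) + RS for row in rows)


def decode(data):
    """Decode text produced by encode() back into a list of rows."""
    if data and not data.endswith(RS):
        raise ValueError("data does not end with a record separator")
    # The final split element is the empty string after the last terminator.
    return [record.split(US) for record in data.split(RS)[:-1]]


rows = [
    ["Smith, John", 'said "hi"', "line one\nline two"],
    ["a", ""],
]
print(decode(encode(rows)) == rows)
```

Commas, quotes, and line breaks pass through untouched, because none of them has any special meaning in this encoding.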
Why is this not happening? Text editors cannot properly handle those characters, and one of the legacies of Unix is the belief that it is better to have a broken, brittle format that can be manipulated with a text editor than a well-specified binary format. There is a certain irony in calling a file that uses ASCII control characters binary, but as text editors do not handle them, such files are, for all intents and purposes, binary.
Some people will argue that XML is the solution. It really is not: first, there is no standard XML format for passing around flat records; second, XML has the same escaping problem, the only difference being that the characters to escape are different…
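The escaping point can be illustrated with a short Python sketch using only the standard library. The element name "field" and the sample value are made up for the example; it shows that naively wrapping a value in tags produces a document that does not parse, just as naively joining values with commas produces broken CSV.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical value containing characters that are special in XML.
value = "a < b & c"

# Naive approach: wrap the raw value in tags.
naive = f"<field>{value}</field>"
try:
    ET.fromstring(naive)
    naive_ok = True
except ET.ParseError:
    naive_ok = False

# Escaping '<' and '&' first yields well-formed XML that round-trips.
escaped = f"<field>{escape(value)}</field>"
round_trip = ET.fromstring(escaped).text

print(naive_ok)
print(round_trip == value)
```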