Back to Latex

Latex page layout for the Ringstadt city

A long time ago, I was quite involved in role-playing. I played a lot of one game a friend wrote, Tigres Volants, and created some material for that setting. I recently decided to gather the material related to one city, named Ringstadt, lay it out, and publish it as a PDF.

Maybe because this was old material and I was feeling geeky, I decided to use Latex again. I used that tool a lot during my academic years, and I felt it would be a convenient way to make a simple layout from some old text files. This would also let me use a version control system for the source. The source text is in RTF, which I recovered from Word 5 for Macintosh files by running the whole thing in an emulator.

Converting from RTF to Latex sounded simple in theory. In practice, the first two command-line tools I tried crashed with a segfault. The third (unrtf) worked, but converted all non-ASCII characters to Latex escape sequences, so I had to fiddle with the configuration to get something readable. The text is in French, which means many accents, and I really wanted source I could proofread.
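As an alternative to tweaking the converter, a post-processing step could turn the most common accent escapes back into UTF-8 characters. The following is a minimal sketch, not what was done for this project: the function name unescape_accents and the set of handled accents are assumptions, and it only covers the simple forms \'e, \'{e} and \c{c}.

```python
import re
import unicodedata

# LaTeX accent command -> Unicode combining mark (a deliberately small subset).
_ACCENTS = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "c": "\u0327",  # cedilla
}

# Symbol accents accept a bare or a braced letter; \c requires braces so that
# unrelated commands such as \cdot are left alone.
_PATTERN = re.compile(
    r"""\\(?:(['`^"])(?:\{([A-Za-z])\}|([A-Za-z]))|(c)\{([A-Za-z])\})"""
)


def _replace(match):
    accent = match.group(1) or match.group(4)
    letter = match.group(2) or match.group(3) or match.group(5)
    # Compose base letter + combining mark into a single precomposed character.
    return unicodedata.normalize("NFC", letter + _ACCENTS[accent])


def unescape_accents(text):
    """Replace simple LaTeX accent escapes with the matching Unicode letters."""
    return _PATTERN.sub(_replace, text)


print(unescape_accents(r"fran\c{c}ais, \'et\'e, No\"{e}l"))
```

Anything the pattern does not recognise is left untouched, so the output can still be proofread and the remaining escapes fixed by hand.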

The good news is that Latex did not change much in the ten years since I left academia; the bad news is that Latex did not change much. There are many things I stopped worrying about when using computers: text encodings, font management, image formats. Latex is pretty much stuck in the 90s: just to handle an input file in UTF-8, you need the following packages:

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}

Do you know what the T1 code means? It is the encoding of the font. It also means that while you can input your text in UTF-8, Latex will not support Unicode: if your input contains a character that is not part of the T1 table, say a double-ended arrow, compilation fails with an obscure error message. If I want to use Cyrillic characters, I’ll need to load another codepage. I don’t think there is a way to tell Latex to just handle Unicode.

Error handling is another area where Latex stayed in the 90s. I remember the error messages from GCC at that time; they were not helpful either. Nowadays there is clang, which gives you helpful error messages in colour, with hints.

I just wanted to do a page layout with the Helvetica font, French text, images, and floating boxes. I ended up with a header that includes 20 packages. Things kind of work (see the image), but floats are basically broken in Latex: you need to do everything manually, and they break at the first difficulty, such as page breaks or footnotes. Latex manages the impressive feat of having floats that are more broken than HTML’s (I use the wrapfigure environment from the wrapfig package).

We are not talking about an exotic feature: just boxes with text flowing around them. You see that in any magazine and on many web pages. I was able to do this in Word 5.1, more than 20 years ago, and it worked more reliably than what I get with Latex. Apple’s Pages, which I usually use for light word processing, can even handle non-rectangular floats, using the alpha channel of the image as a guide. You can also overlay floats on top of each other.

The main argument in favour of Latex is that it does the right thing by default, but for French, even with the babel package, this is not really true. Latex will insert a space before double punctuation, but this is ugly; the proper thing to do would be to insert a half-space. I could probably hack one package or another to get that result.

What struck me is how isolated Latex is from other systems: it does not use any operating system services for text processing, font management, image processing, or rendering, so you end up with a very big system (a 2 GB install) that is its own, old thing. Most of the things I complain about in this post were already mentioned in a wish-list post, 10 years ago.

I have basically done one chapter of the document, and I’m faced with a simple choice: go on fighting with Latex, knowing that the final layout will be pretty mediocre, or swallow my pride and just redo everything in Apple Pages…


Markless

Markless Screen Capture

Expressing rich text has been a concern since ASCII emerged as the standard for text representation. RTF, HTML, and LaTeX all use the same idea: text with some form of escape sequences to express formatting. One format that is gaining traction is Markdown, which is used, among other things, for documentation on GitHub.

Text encodings have always had some partial support for formatting, typically in the form of control codes. Unicode deprecated some control codes (most of the code points in the C0 and C1 ranges), but it offers other forms of control: variation selectors, graphical characters, and combining characters.
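These categories can be inspected with Python’s standard unicodedata module. The snippet below is only an illustration; the four sample characters are my own picks and are not taken from any particular tool.

```python
import unicodedata

# One representative per mechanism mentioned above (illustrative choices).
samples = {
    "C0 control (BEL)": "\u0007",
    "variation selector-16": "\ufe0f",
    "combining low line": "\u0332",
    "box drawing (double down and right)": "\u2554",
}


def describe(ch):
    """Return (code point, general category, combining class) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.combining(ch))


report = {label: describe(ch) for label, ch in samples.items()}
for label, info in report.items():
    print(label, info)
```

Control codes report the category Cc, variation selectors and combining marks report Mn (non-spacing mark), and box-drawing characters report So (other symbol). A non-zero combining class indicates that the mark attaches to the preceding character.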

So I wondered whether it would be possible to render Markdown using Unicode characters. Rendering Markdown using ANSI escape sequences is easy, and existing tools already do it. What I wanted was pure text, i.e. something that could be copy-pasted into interfaces that only support text, or embedded into code comments.

The result is Markless, a small Python script that converts Markdown into Unicode text. Headings and code blocks are rendered using boxes built out of box-drawing characters, emphasis using combining underlines. I must stress that this is a hack; it probably has more value as a Unicode stress test than as an actual tool. Here is a sample output:

╔══════════╗
║ Markless ║
╚══════════╝
Markless is a small tool (a h⃨a⃨c⃨k⃨ really) that renders mark-down as plain text, 
using Unicode modifiers characters.
• Emphasis is rendered using underline modifiers.
• Lists is rendered using pretty bullets.
  Continuation is supported.
• Headers and code are rendered in boxes.
▌Blockquote is rendered using block characters
▌▌Second level
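The rendering tricks visible in the sample can be sketched in a few lines of Python. This is a simplified illustration and not the actual Markless code: the function names are hypothetical, and the default combining mark (U+0332, combining low line) is an assumption that may differ from the one the tool uses.

```python
def box(title):
    """Frame a heading with double-line box-drawing characters.

    Width is computed with len(), so wide or combining characters in the
    title would misalign the frame.
    """
    bar = "═" * (len(title) + 2)
    return f"╔{bar}╗\n║ {title} ║\n╚{bar}╝"


def emphasize(text, mark="\u0332"):
    """Append a combining mark after every non-space character."""
    return "".join(ch if ch.isspace() else ch + mark for ch in text)


def bullet(text):
    """Render a list item with a bullet character."""
    return "• " + text


def blockquote(text, level=1):
    """Prefix a line with one block character per quotation level."""
    return "▌" * level + text


print(box("Markless"))
print("a " + emphasize("hack") + " really")
print(bullet("Emphasis is rendered using underline modifiers."))
print(blockquote("Second level", level=2))
```

Because the marks are ordinary characters that follow each letter, the emphasis survives copy and paste, but how it looks depends entirely on the font and renderer of the receiving application.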

The tool is far from complete and only supports a fraction of Markdown’s syntax. The code and an up-to-date version are available on GitHub.
