LaTeX Spotlight Importer

This page describes a small project I have been doing on the side: a Mac OS X Spotlight importer for LaTeX source code. There seem to be a few similar projects out there, and one such importer is bundled with TeXShop. Still, they are not very satisfying, as they usually just index the raw LaTeX source code.

My goal was to have something more advanced than simply importing the raw source text. I have written a small component that does rudimentary LaTeX parsing and indexes the text content. It has the following features:

  • Does not require any LaTeX installation.
  • Extracts the title and the authors.
  • Tries to guess the encoding from the inputenc package.
  • Indexes the list of publication keys.
  • Full text indexing. The LaTeX code is converted to plain text and indexed.
  • Command substitution. Certain commands are replaced by a Unicode string, for instance \delta becomes δ. This can be customized using a standard property list.
  • Command copying. The parameter of certain commands is copied as-is, for instance the content of the \textsf command. Again, this can be customized.
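
To give a rough idea of what the encoding guess, command substitution and command copying amount to, here is a minimal sketch in Python. It is not the importer's actual code: the function names guess_encoding and latex_to_text, the three tables, and the choice to drop unknown commands are all assumptions made for illustration, and the real importer reads its tables from a property list.

```python
import re

# Hypothetical tables for illustration; the real importer loads its own
# tables from a property list.
SUBSTITUTIONS = {"delta": "δ", "alpha": "α", "ldots": "…"}
COPIED_COMMANDS = {"textsf", "textbf", "emph"}

# Assumed, partial mapping from inputenc option names to Python codec names.
INPUTENC_CODECS = {
    "utf8": "utf-8",
    "latin1": "latin-1",
    "latin9": "iso-8859-15",
    "applemac": "mac-roman",
}

INPUTENC = re.compile(rb"\\usepackage\[([^\]]+)\]\{inputenc\}")
COPY = re.compile(r"\\(%s)\{([^{}]*)\}" % "|".join(sorted(COPIED_COMMANDS)))
COMMAND = re.compile(r"\\([A-Za-z]+)")


def guess_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Guess the file encoding from a \\usepackage[...]{inputenc} line."""
    match = INPUTENC.search(raw)
    if match is None:
        return default
    option = match.group(1).decode("ascii", errors="replace").strip()
    return INPUTENC_CODECS.get(option, default)


def latex_to_text(source: str) -> str:
    """Apply command copying, then command substitution."""
    text = source
    # Command copying: replace \cmd{content} by content. The pattern only
    # matches brace groups without nested braces, so repeat until nothing
    # changes; this unwraps nested commands from the inside out.
    while True:
        text, count = COPY.subn(lambda m: m.group(2), text)
        if count == 0:
            break
    # Command substitution: known commands become their Unicode string,
    # unknown commands are dropped (an assumption of this sketch).
    return COMMAND.sub(lambda m: SUBSTITUTIONS.get(m.group(1), ""), text)


if __name__ == "__main__":
    raw = b"\\usepackage[latin1]{inputenc}\nThe \\textsf{sans \\emph{text}} uses \\delta."
    body = raw.decode(guess_encoding(raw)).split("\n", 1)[1]
    print(latex_to_text(body))
```

In a sketch like this, the customization mentioned above could be done by filling the two command tables from a property list, for example with Python's plistlib.load, instead of hard-coding them.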

The importer works on the limited number of papers I tried it on. If you want to try it out, you can download it. Please consider this a very early alpha version: the code is still very raw, and it is probably quite slow, as there are still a lot of logging statements. If you are interested in the source code, please contact me.

To use the importer, simply unzip it and put it into ~/Library/Spotlight/. Normally the importer is activated automatically; if not, you can type /usr/bin/mdimport -r ~/Library/Spotlight/LatexImporter.mdimporter to force the activation. To see what the plugin actually indexes for a given tex document, type mdimport -d bla.tex, where bla.tex is the tex document. You will get all the log messages and the values that have been extracted.

The source code for the importer is now on GitHub.
