Substring considered harmful – an example

php_substring_bug

I have complained about languages that allow substring operations. It should come as no surprise that I found an occurrence of that bug in the theme I use for my blog.

As you can see on the image, the post preview contains some corrupted Unicode data. The thing is, the title this data is generated from is perfectly valid and contains the following text:

<!--:fr-->Bonne Année<!--:--><!--:en-->Happy New Year!<!--:--><!--:ja-->明けましておめでとう!<!--:--><!--:de-->Einen Guten Rutsch ins neue Jahr!<!--:-->

The weird comment tags are leftovers from the previous plugin I used for handling multiple languages, and serve as delimiters between languages. They should be ignored by the rest of the system.

So why do I end up with corrupt data? The problem lies in the following PHP snippet (there are two of them in fact):

<header class="entry-header">

  <?php
    if (strlen(get_the_title()) >= 85) { ?>
      <h1 class="entry-title"><a href="<?php the_permalink(); ?>" data-title="<?php the_title(); ?>" rel="bookmark">
  <?php echo substr(get_the_title(), 0, 84)."...";
  }

    else { ?>
    <h1 class="entry-title"><a href="<?php the_permalink(); ?>" rel="bookmark">
  <?php the_title();
    }
       ?>
</a></h1>
</header>

The intent of the author of this code is pretty clear: if the entry title is 85 characters or longer, cut it down to 84 and append an ellipsis. This is a pattern you will find in a lot of user-interface code.

Problem is, this code does not do what the author thinks it does. In PHP, strlen and substr work on bytes, not characters. In UTF-8, ASCII characters take one byte, accented Latin letters like ‘é’ take two, and Japanese kana and kanji (like 明けましておめでとう) take three. Here the 84th byte happens to fall in the middle of the ‘し’ character, and cutting there produces invalid UTF-8 data. The biggest irony is that because the string length is computed before the invisible tags are stripped, the selected cut point is wrong anyway…
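The effect can be reproduced outside WordPress. The following is a minimal sketch, assuming the file is saved as UTF-8, the mbstring extension is loaded, and the language markers are stored as plain HTML comments (`<!--:fr-->` and so on); the variable names are mine, not the theme’s.

```php
<?php
// The title with its language markers (the German part is left out, the
// cut happens before it anyway).
$title = '<!--:fr-->Bonne Année<!--:--><!--:en-->Happy New Year!<!--:-->'
       . '<!--:ja-->明けましておめでとう!<!--:-->';

// substr() counts bytes: the first 84 bytes end after the second of the
// three bytes that encode 'し'.
$cut = substr($title, 0, 84);

var_dump(strlen('し'));                        // int(3): three bytes
var_dump(mb_strlen('し', 'UTF-8'));            // int(1): one character
var_dump(mb_check_encoding($title, 'UTF-8')); // bool(true)
var_dump(mb_check_encoding($cut, 'UTF-8'));   // bool(false): truncated sequence
```

The full title is valid UTF-8 and the truncated one is not, which is exactly the garbage that shows up in the post preview.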

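What is the fix? PHP actually has functions that measure and trim a string by display width instead of bytes, without ever splitting a character: mb_strwidth and mb_strimwidth. Note that width is not quite a character count: fullwidth characters such as kana and kanji count as two columns. The functions that count characters are mb_strlen and mb_substr.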
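As an illustration, here is a minimal sketch of what a safe truncation could look like. The helper name `shorten_title` and the default limit of 85 columns are my own assumptions, not code from the Sixteen theme, and the sketch does not strip the language markers.

```php
<?php
/**
 * Hypothetical helper (not part of the Sixteen theme): shorten a title to
 * at most $max_width display columns without splitting a character.
 */
function shorten_title(string $title, int $max_width = 85): string
{
    // mb_strwidth() measures display width: fullwidth characters such as
    // kana and kanji count as 2 columns, ASCII characters as 1.
    if (mb_strwidth($title, 'UTF-8') <= $max_width) {
        return $title;
    }
    // The width limit includes the trim marker, so the result never
    // exceeds $max_width columns.
    return mb_strimwidth($title, 0, $max_width, '...', 'UTF-8');
}

// A long Japanese string, trimmed to 20 columns.
$short = shorten_title(str_repeat('明けましておめでとう', 10), 20);
var_dump(mb_check_encoding($short, 'UTF-8')); // bool(true)
var_dump(mb_strwidth($short, 'UTF-8') <= 20); // bool(true)
```

Because the limit is expressed in columns, a Japanese title gets cut after fewer characters than a Latin one, which is usually what you want for a fixed-width preview.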

You can fix your sixteen installation by replacing the following files:

Properly Localising the Sixteen theme

Convoi – Cyberpunk
Publié il y a environ 3 jours par Thias
Vue de nuit de Shibuya, à Tōkyō
Dès le début de la conception de Convoi il me paraissait évident qu’il y aurait un monde cyberpunk, avec des voies rouillées sur des docks détrempés par les pluies acides. Si j’ai des idées assez claires sur ce qui devrait se passer dans ce monde dans le premier scénario, le monde à proprement parlé est resté quelque chose de générique, ce qui m’ennuie.

This blog currently uses the Sixteen WordPress theme. On Friday I turned on a multi-language plugin (xili) and, theoretically, everything should have worked out of the box:

Sixteen has been already translated into French, Spanish, Russian, Japanese, Arabic and supports translation into more languages.

In practice French did not work out of the box: the French locale files were corrupt, not to mention badly translated, and the theme uses the timeago jQuery module without setting up localization, so you always get English relative times. So here are the steps to fix the Sixteen theme (version 1.3.0.4) for French. Modulo the localization files, this should hopefully work for other languages:

  1. Back up everything.
  2. Replace the content of the French localization file sixteen/languages/fr_FR.po with this one.
  3. Use a tool like poedit to generate a new version of sixteen/languages/fr_FR.mo. If the file is zero bytes long, there is a problem.
  4. Replace the file sixteen/js/jquery.timeago.js with the latest version (1.4.1).
  5. Download the locale-specific timeago configuration for the language you need (in my case it was jquery.timeago.fr.js) and put it into the sixteen/js/ directory.
  6. Replace the content of the file sixteen/functions.php with this updated version.

If you don’t care about French, you can skip steps two and three, but note that the compiled localization files for Spanish are empty, and missing for Arabic, Japanese and Russian, so I doubt these will work straightaway.

Multilingual – New take

Screenshot of a blog post in French with partial localisation

I have given another shot at making this blog run in multilingual mode, this time with another plugin (xili), which seems to be better integrated with the internals of WordPress. This forced me to fix the French translation file of the Sixteen theme, which was completely broken. Things kind of work now, i.e. the blog’s chrome will render in French for a blog post which is in French, but there is still one bug: the theme uses the jQuery timeago module, and generating date texts within JavaScript means that the PHP localization framework won’t work. There is an updated version of timeago which includes support for localization, but I’m not sure to what extent I’m willing to fix the current theme or just switch to a better designed one.

1300 posts…

Habeamus Blogum
Blog blog blog. Soit c’est le début d’une symphonie ou un test d’application, à vous de juger… Or donc, comme un bazillon d’autres personnes, j’ai à présent un blog. Le but de se truc n’est pas de faire avancer la civilisation ou la culture, mais d’avoir un endroit ou exposer mes idées idiotes, que je pense avoir fort nombreuses. Dans quelle mesure cette initiative pourra être maintenue est une question ouverte, j’espère qu’elle ne souffrira pas d’être une initiative de janvier.

More than nine years ago, I started this blog; today’s entry is the 1300th post. The first post was, unsurprisingly, just a small introduction in French.

At that time, the blog was running WordPress version 1, provided by the French ISP Free. Since then, the blog has migrated between multiple domain names and hosting providers, and changed themes multiple times. Many of the old entries are about articles and pages that are no longer on the internet; others rely on technologies that have died out, or plugins I have since removed. In the long run, keeping a blog consistent is a pretty hard task.

The subjects I write about have not changed that much: role-playing games, technology, languages and things around me. Sadly I have stopped writing in Japanese as this has gotten too hard for me. The guiding principle of this blog was always to keep the flow going rather than write perfect entries; I have kept true to the motto probablement n’importe quoi, which translates to “probably random things”.

This blog typically gets 50 visits a day, but there have been spikes when one article or another was referenced on a high-traffic page. Interestingly, comments are not proportional to the traffic: I get most of the comments from a small circle of friends, typically on pages about general subjects (books, online articles), while most of the traffic goes to posts which are more technical. Flattrs are also not correlated with traffic: I got most Flattrs on blog entries that are personal. None of my high-traffic posts ever got flattred, regardless of the language.

If one excludes spam-bots, comments were nearly always civil and to the point. In all these years, I only got one aggressive comment, when I wrote about gay marriage, again in French. This was at the time when the law changed in France. One more reason to be reluctant to talk about politics in public forums.

This blog was a pretty spontaneous thing, and to a large extent still is, as it acts mostly as a scratch-pad for my ideas and I have no plan for what I will write about next. This blog will go on as long as I have ideas for it; I hope this will be for a few more years.

Localisation…

 Mi’kmaq Hieroglyph Prayer Book

The WordPress 3.9 update broke my blog, and it took me some time to find the culprit: the qtranslate plugin. It had been causing all sorts of minor problems in the past, so I decided to just turn it off. I started using it at a time when I was using many more languages in this blog, and I never managed to find a satisfying way to use it.

The underlying problem still remains: I write in this blog in multiple languages, something the Web in general and WordPress in particular have trouble coping with, and trying to solve it in an add-on simply does not work. Internationalisation, like security and reliability, is not something you can add as an afterthought.

Mi’kmaq Hieroglyph Prayer Book by Dennis Jarvis Creative Commons Attribution-ShareAlike 2.0 Generic

Optimised images

Screen Capture of ImageOptim

ImageOptim is a really cool tool for optimising image data; the latest version also includes the Zopfli compression algorithm, developed by some colleagues. Basically you drag and drop the JPEG and PNG files you want to use on your web-site and the tool will reduce them as much as possible. The results are pretty impressive.

For instance, I’m using the Sixteen theme for WordPress, which is nice, but the images have not been optimised. By running the tool, I brought down the size of the image set by 74K (out of 465). You can download the optimised images here. They are pixel-identical to the ones in Sixteen 1.2.1.6.

New Theme for 2014

Title of this blog as seen using the Sixteen theme

I have been using the same Japan style theme for this blog for ages. It started to feel old, and there was only so much I could hack around before either rewriting it from scratch or picking a new one. I chose to do the latter, and so I switched to Sixteen, which is more responsive and handles screen resizes better. Doing this has also been a good exercise in moving many of the customisations I had done out of the theme and into Jetpack. This means that I should be able to switch themes more easily if I choose to do so. There are still a few kinks to iron out, mostly in the presentation of tables, but we will get there. Thank you for your patience.

WP Retina 2×

User interface for the Retina 2× plugin

Following my article on high resolution graphics for my blog, I have looked around for a plugin that would handle high DPI graphics for this blog. I have found WP Retina 2×, which uses Apple’s Retina approach to the problem: have a second image at twice the resolution, with the string "@2x" appended to the name before the extension. If the display is detected to be high resolution, some JavaScript tries to rewrite the URLs of the various images. The plugin does not generate the 2× graphics automatically for existing images, but provides a tool to do so: for each image in the image library, one can automatically generate the version at twice the resolution.
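To make the naming convention concrete, here is a small sketch of how the "@2x" variant of a file name can be derived. The helper `retina_name` and the example paths are hypothetical and not taken from the plugin, which does its URL rewriting in JavaScript; this only illustrates where the suffix goes.

```php
<?php
/**
 * Hypothetical helper, not taken from WP Retina 2x: derive the "@2x"
 * variant of an image path by inserting the suffix before the extension.
 */
function retina_name(string $path): string
{
    $info = pathinfo($path);
    // pathinfo() reports '.' as the directory of a bare file name.
    $dir = (isset($info['dirname']) && $info['dirname'] !== '.')
        ? $info['dirname'] . '/'
        : '';
    // Files without an extension simply get the suffix appended.
    $ext = isset($info['extension']) ? '.' . $info['extension'] : '';
    return $dir . $info['filename'] . '@2x' . $ext;
}

echo retina_name('uploads/thumb-250.png'), "\n"; // uploads/thumb-250@2x.png
echo retina_name('photo.jpg'), "\n";             // photo@2x.jpg
```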

This is particularly useful for the thumbnails. In this blog they are typically 250 pixels wide; when the thumbnail was generated from a higher resolution image, the plugin can generate a 500 pixel wide thumbnail and substitute that on high DPI displays. Of course, this does not work for hand-optimised low palette count PNG files, which I often used.

I generated substitute images for some of the thumbnails in this blog, but for many the available image just has too few pixels, or was added to the blog using an older media mechanism that the plugin does not recognise. Still, it works, and from what I saw the images do look better. I only do periodic checks, so if you see issues, please let me know.

Spring Cleaning – Broken Links

👷

As this blog slowly approaches the thousand post landmark and more than seven years of existence, I felt some spring cleaning was needed: the blog itself has migrated between three providers, many versions of WordPress and a myriad of plugins. While things are pretty stable these days, I added a plugin to track broken links, and the results are not pretty. Originally I had more than 400 broken links; I have now brought that number down to a bit over 100. Exactly like my first home-page, this site is still under construction. I apologise for the inconvenience.

A large number of the broken links were internal to this blog: WordPress by default uses absolute URLs for images, so each time the hostname changed, the links would break. I still found some references to free.fr, the hosting provider I had when I lived in France. There are also still references to pages generated by a gallery plugin I stopped using ages ago.

Still, many external links have broken. Even links to Wikipedia can go stale as the relevant page gets removed and its content reorganised. Links to commercial websites break fast, academic pages even faster. Strangely enough, personal home-pages often last longer: they sometimes move, but the content can be found with a simple internet search and the links repaired.

The sad thing is that in many instances, I linked to content instead of copying it, assuming it would always be there. Increasingly, for illustrating this blog, I prefer to find some Creative Commons image, keep a copy on my blog and reference the original with a link; this way, if the source goes away, the blog does not look ugly.

Even links to online galleries like Picasa break as the site updates its structure, policies, and protocols. For me this puts a serious limitation on the whole concept of cloud computing: I have files that I have moved from hard-drive to hard-drive for close to twenty years; how long will files in an online storage system last?

Turning off wordbooker

🔩

Yesterday, I turned off the wordbooker plugin on this blog. This was a good plugin, the work of a dedicated engineer, and one of the few projects I donated money to. The author recently expressed frustration with the project and I understand him: Wordbooker works at the interface between two large systems, WordPress and Facebook, each with their own changing APIs and policies. The result was that despite his best efforts, the system was frequently broken.

For me this highlights the problems with “platforms” that change APIs and policies with each moon phase. Agile development and experimenting are a good thing, but once your system pretends to be an ecosystem, you need to give developers a stable API to code against: their core objective is to build interesting things, not to chase an ever-changing API.

As for functionality, I realised the important thing for me is to automatically push new posts to Facebook. Facebook broke the RSS import feature a long time ago, so I need some kind of glue to do the pushes; ifttt.com provides this, and I was already using it to push to Twitter. As for moving comments from one system to the other, I realised this was not such a good feature. Conversations on Facebook are private and authenticated, while those on the blog are not, so synchronising the two feeds does not make much sense.

At any rate, many thanks to Steve Atty for his great work on wordbooker.
