Pick a font able to properly render a string composed of Unicode characters with Perl

When a Perl script does automated watermarking with randomly picked fonts, it is quite annoying to stumble on fonts that lack many non-basic Unicode characters (accented letters, etc.). In French, you’ll likely miss the ê or ü, or even the é or à. In Polish, while the ł is often provided, you’ll likely miss the ź.

The Perl module Font::FreeType is quite convenient in this regard. The sample code here tries to find a font, within the @fonts list, that is able to render the string stored in $image_tags[0]. It picks fonts randomly, one by one, and checks every character of the string against the characters provided by the font. It stops at the first font that can fully render the string:

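use Font::FreeType;
use utf8; # must occur before any string definition!
use strict;

# the script prints non-ASCII characters, so STDOUT must be UTF-8 encoded
binmode(STDOUT, ':encoding(UTF-8)');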

my @image_tags = "~ł ajàüd&é)=ù\$;«~źmn";
my @fonts = ("/usr/share/fonts/truetype/ttf-bitstream-vera/Vera.ttf", "/usr/share/fonts/truetype/zeppelin.ttf", "/usr/share/fonts/truetype/Barrio-Regular.ttf");
my %fonts_characters;
my $watermark_font;

# we want a random font, but we also want a font that can print every character
# (not a given with UTF-8 strings)
# loop until we find a suitable one (all chars are valid, so the chars counter reached 0) or,
# worst case scenario, until we checked them all (means more suitable fonts should be added)
my $chars_to_check = length("#".$image_tags[0]);
my $fonts_to_check = scalar(@fonts);
my %fonts_checked;
while ($chars_to_check > 0 and $fonts_to_check > 0) {

    # pick a random font (rand seeds itself on first use, no need to call srand each time)
    $watermark_font = $fonts[rand @fonts];

    # if this font was already probed, pick another one
    next if $fonts_checked{$watermark_font};
    $fonts_checked{$watermark_font} = 1;

    # always reset the chars counter each time we try a font
    $chars_to_check = length("#".$image_tags[0]);

    print "Selected font $watermark_font (to check: $fonts_to_check)\n";

    # if not done yet, build the list of chars available in this font
    unless ($fonts_characters{$watermark_font}) {
        Font::FreeType->new->face($watermark_font)->foreach_char(
            sub {
                my $char_code = $_->char_code;
                my $char_chr = chr($char_code);
                $fonts_characters{$watermark_font}{$char_chr} = $char_code;
            });
        print "Slurped $watermark_font chars\n";
    }

    # then check if every character of the watermark exists in this font
    for (split //, "#".$image_tags[0]) {
        print "Check $_\n";
        # break out if the char is missing
        last unless exists $fonts_characters{$watermark_font}{$_};
        # otherwise decrement counter of chars to check: if we reach 0, they are all valid
        # and we should get out of the font picking loop
        $chars_to_check--;
        print "Chars still to check $chars_to_check\n";
    }

    # we also record there is one less font to check
    $fonts_to_check--;

}


print "FONT PICKED $watermark_font\n";

This code is actually included in my post-image-to-tumblr.pl script (hence the variable names).

Obviously, if no font is suitable, it’ll take the last one tested. It won’t go as far as comparing which one is the most suitable, since in the context of this script, if no font can fully render a tag, the only sensible course is to add more (Unicode-capable) fonts to the fonts/ directory.


Lightest terminal: urxvt in daemon/client mode?

I still use an old IBM Thinkpad 600E that I bought second-hand a decade ago.

It still works. Well, the battery is dead, I added as much RAM as the motherboard can handle (2x 128 MB DIMM modules + 50 MB onboard module, something like that) and, several years ago, I changed the hard drive, replacing the stock one with a more recent one rescued from a short-lived Acer Aspire that belonged to my brother.

It still works. Sure, it is subject to bugs that will probably never get fixed, but none that you can’t work around.

It still works. But… it is not a very fast computer. It is not really that it runs slower than in the past. It is not really that we got used to faster computers. The fact is that software developers do not have many reasons to write code light enough to run smoothly on this old piece of junk. So they don’t, most of them.

You end up running obsolete software, or being very glad to find pieces of software like Midori (a lightweight web browser based on WebKit).

Yes, yes, I will get to the point.

The point is that whatever you can get can make a difference. I tried to run dash instead of bash. Bleua. That’s fine for scripts. But I cannot live with no completion at all. So I stayed with the Bourne Again Shell. But I had to cut off most completions (you know, the endless scripts in /etc/bash_completion.d) to avoid waiting hours for a shell to start.

So here comes urxvt. urxvt is fast. Like aterm. But it supports UTF-8. And it matters.
But the really nice thing is that urxvt includes a daemon/client mode. You just have to start the daemon at the beginning of the X session, for instance by having the following in ~/.xsession:

#!/bin/dash
# terminal daemon
urxvtd -q -f -o
# desktop
export BROWSER=midori
wmaker

Then, every time you need a terminal, call urxvtc instead of urxvt.

You might also want to add in ~/.Xdefaults something like:

Rxvt*background: gray23
Rxvt*foreground: white
Rxvt*troughColor: gray33
Rxvt*scrollColor: gray13
Rxvt*scrollstyle: plain

Rxvt*visualBell: true
Rxvt*saveLines: 2000
Rxvt*urlLauncher: midori
Rxvt*scrollTtyOutput: false
Rxvt*scrollWithBuffer: true

Rxvt*color12: SkyBlue2

I have not encountered any real-life drawbacks so far.

March 9, 2010 Update: After reading an interesting benchmark of terminal performance (in French), which shows how slow xterm is, how fast konsole is, and how urxvt manages to be fast while using less memory, I even started using urxvt on my usual workstation that runs KDE, with the following .Xdefaults (here, we use transparency and font anti-aliasing, as the hardware can obviously handle it):

urxvt*visualBell: true
urxvt*saveLines: 12000
urxvt*urlLauncher: konqueror
urxvt*scrollTtyOutput: false
urxvt*scrollWithBuffer: true

urxvt*depth: 32
urxvt*background: rgba:0000/0000/0000/dddd
urxvt*borderColor: rgba:0000/0000/0000/dddd
urxvt*foreground: white

urxvt*troughColor: gray33
urxvt*scrollColor: rgba:0000/0000/0000/0fff
urxvt*scrollstyle: plain
urxvt*scrollBar_right: true

urxvt*font: xft:Bitstream Vera Sans Mono:style=Regular:pixelsize=13:antialias=true
urxvt*color12: SkyBlue2

I even added the following as /etc/X11/Xsession.d/74urxvtd_start:


# In order to activate urxvt daemon at X session launch
# simply place use-urxvtd into your /etc/X11/Xsession.options file

URXVTD=/usr/bin/urxvtd
URXVTD_OPTIONS="-q -f -o"

if grep -qs ^use-urxvtd "$OPTIONFILE"; then
    if [ -x "$URXVTD" ]; then
        $URXVTD $URXVTD_OPTIONS
    fi
fi

and added the string use-urxvtd to /etc/X11/Xsession.options, but that’s a bit overkill, as Debian already provides urxvtcd, which fires up urxvtc while making sure urxvtd is running.
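
On a system that does not ship urxvtcd, the same idea can be sketched in a few lines of shell. This is only a minimal sketch, not the actual Debian script: it relies on urxvtc exiting with status 2 when it cannot contact a daemon, and the function name start_terminal as well as the URXVTC and URXVTD override variables are made up for this example.

```sh
#!/bin/sh
# Sketch of a urxvtcd-like wrapper: try the client first, and start the
# daemon only when the client reports that it cannot reach one.
# start_terminal, URXVTC and URXVTD are hypothetical names used for this
# example; they are not part of rxvt-unicode.
start_terminal() {
    status=0
    "${URXVTC:-urxvtc}" "$@" || status=$?

    # urxvtc exits with status 2 when contacting the daemon fails
    if [ "$status" -eq 2 ]; then
        # same options as in ~/.xsession: quiet, fork, open display
        "${URXVTD:-urxvtd}" -q -f -o || return $?
        status=0
        "${URXVTC:-urxvtc}" "$@" || status=$?
    fi

    return "$status"
}
```

Calling start_terminal "$@" at the end of such a script would then open a terminal whether or not the daemon was already started by the X session.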