A Short i18n/l10n Example in Ada
This started with me trying, in probably a ham-fisted way, to apply the recommendations in a gemlog post I liked.
Any mistakes in the following are my fault.
First, I started with the usual "alr init", followed by "alr with intl", then edited the main .adb file to be the following:
with Intl;
with Ada.Strings.Wide_Bounded;
with Ada.Strings.UTF_Encoding.Wide_Strings;
with Ada.Wide_Text_IO;

procedure Hello is
   PROG_NAME : constant String := "hello";

   function Gettext (Message : in String) return String is
     (Intl.Gettext (Message));

   package Dyn is new
     Ada.Strings.Wide_Bounded.Generic_Bounded_Length (Max => 100);

   Message : Dyn.Bounded_Wide_String;
begin
   Intl.Initialize (PROG_NAME, "/home/user/locale");
   Message := Dyn.To_Bounded_Wide_String
     (Source => Ada.Strings.UTF_Encoding.Wide_Strings.Decode
        (Gettext ("Hello world!")));
   Ada.Wide_Text_IO.Put_Line (Dyn.To_Wide_String (Message));
end Hello;
In my opinion, even though Ada can be verbose, this is reasonably short and readable. Some salient points:
- You need to use the Wide_ variants of packages. I chose these instead of Wide_Wide_ out of pure stubbornness: I care a lot about modern East Asian languages, where a big effort was made with Han unification to fit in the Basic Multilingual Plane, and not at all about emojis.
- I had to use Gettext instead of the shorter "-" to work with the xgettext utility (see below).
- The use of the `Message` variable is artificial, just for experimentation. The result of `Decode` could have been passed directly to `Put_Line`.
Now the rest is mostly done at a shell prompt:
xgettext --from-code=UTF-8 --keyword=Gettext --add-comments -C -o hello.pot src/hello.adb
msginit --input=hello.pot --locale=ja --output=ja.po
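Because xgettext is run here in C++ mode (-C), it reads an apostrophe as the start of a character literal, which Ada attributes can trip over. As a rough sketch, the hypothetical helper below lists source lines that contain an odd number of single quotes. The per-line count is only a heuristic I am assuming is good enough to find candidates; it is not xgettext's own rule.

```shell
# Hypothetical helper: list lines with an odd number of single quotes.
odd_quotes() {
    # With "'" as the field separator, a line with N quotes has N + 1 fields,
    # so an even (non-zero) field count means an odd number of quotes.
    awk -F"'" 'NF > 0 && NF % 2 == 0 { print FILENAME ":" FNR ": " $0 }' "$@"
}
```

It could be run as `odd_quotes src/*.adb` before calling xgettext. Lines that already balance their quotes, such as character literals, are not reported.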
Note that for longer Ada programs you might have unmatched single quotes, e.g. for attributes like `'Value`, and so you have to add comments like `-- '` just to keep xgettext happy. Now change the encoding in ja.po to UTF-8 if necessary, and replace the empty string translation with "こんにちは世界!". Next, some more shell commands:
mkdir -p $HOME/locale/ja/LC_MESSAGES
msgfmt --output-file=$HOME/locale/ja/LC_MESSAGES/hello.mo ja.po
export LANGUAGE=ja
./bin/hello
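The directory given to Intl.Initialize and the path given to msgfmt have to line up, since gettext looks for the compiled catalog under <locale dir>/<language>/LC_MESSAGES/<domain>.mo. A minimal sketch of that layout follows; `mo_path` is a hypothetical helper, not part of gettext. Note that the Ada source hard-codes /home/user/locale, which matches $HOME/locale only when HOME is /home/user.

```shell
# Hypothetical helper: print where gettext expects a compiled catalog.
mo_path() {
    # $1 = locale directory, $2 = language code, $3 = text domain
    printf '%s/%s/LC_MESSAGES/%s.mo\n' "$1" "$2" "$3"
}

# Values taken from the example: directory and domain as passed to Intl.Initialize.
mo_path /home/user/locale ja hello
# prints /home/user/locale/ja/LC_MESSAGES/hello.mo
```

Once the catalog sits at that path, the program can be run with LANGUAGE=ja as in the commands above.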
With LANGUAGE=ja set, the program does what you expect. If LANGUAGE is unset, or no translation exists, the fallback (English) is used. All in all, pretty painless. I'm interested in language processors, so I have some advice there:
- Don't internationalize keywords & variable/constant names. France tried this; it's pointless, since programming languages are closer to Maths than to English anyway.
- It's a tough call, but I think comments should be in English if you want anyone else to help with maintenance. In an ideal world everyone would be a polyglot, but ...
- Signon banners, progress messages, etc. should be internationalized.
- Error messages should be internationalized.
I've been doing other stuff, but it seemed better captured in PRs, defect reports, etc. instead of gemlog entries. I wholeheartedly recommend the Distributed Systems Annex!