bug-gnu-utils

Re: Accessing a message catalog independently of the system locales


From: Bruno Haible
Subject: Re: Accessing a message catalog independently of the system locales
Date: Sun, 24 Oct 2010 15:39:01 +0200
User-agent: KMail/1.9.9

Hi Sylvain,

> > > I'm initializing the locale using 'setlocale(LC_MESSAGES, "");'.
> > 
> > You also need to use setlocale(LC_CTYPE, ""); - see the doc at
> > <http://www.gnu.org/software/gettext/manual/html_node/Triggering.html>.
> 
> OK, I did my tests with and without.

If you initialize only the LC_MESSAGES category and not the LC_CTYPE
category, on glibc systems, non-ASCII characters are transliterated to
ASCII. Which doesn't look very decent for Chinese characters, for example...
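A minimal sketch of that initialization, assuming a glibc system where
the gettext functions are part of libc. The function name init_i18n and
the domain and directory passed to it are hypothetical:

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: initialize both locale categories that gettext
   relies on, then select the message domain.
   Returns 1 if both setlocale() calls succeeded, 0 otherwise. */
int init_i18n(const char *domain, const char *localedir)
{
    int ok = 1;

    /* LC_CTYPE determines the character encoding of the output. */
    if (setlocale(LC_CTYPE, "") == NULL)
        ok = 0;
    /* LC_MESSAGES determines the language of the messages. */
    if (setlocale(LC_MESSAGES, "") == NULL)
        ok = 0;

    bindtextdomain(domain, localedir);
    textdomain(domain);
    return ok;
}
```

A program would call, for example, init_i18n("example-app",
"/usr/share/locale") once at startup, before the first gettext() call.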

> (Btw I'm concerned that a part of the code (the script parser) may be
> influenced by locale-specific behavior.)

A valid concern. Indeed, you need to be careful: some libc functions,
such as isspace(), depend on the current locale. If you don't want this,
there are three approaches:
  - use a temporary call to setlocale(LC_ALL,"C") for the parsing code,
  - use a temporary call to uselocale(newlocale (LC_ALL_MASK, "C", NULL))
    for the parsing code,
  - use the gnulib modules c-ctype, c-strtod, c-strstr and similar.
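The second approach could look roughly like this. It is only a sketch,
assuming a glibc system with the POSIX 2008 per-thread locale functions;
the function name is hypothetical, and unlike the one-liner above it
keeps the locale object so that it can be freed afterwards:

```c
#define _POSIX_C_SOURCE 200809L
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical parser fragment: count the leading whitespace of s,
   with isspace() evaluated in the "C" locale regardless of the locale
   the rest of the program runs in.
   Returns (size_t) -1 if the "C" locale object cannot be created. */
size_t count_leading_space_c_locale(const char *s)
{
    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t) 0);
    if (c_loc == (locale_t) 0)
        return (size_t) -1;

    /* Switch the calling thread to the "C" locale, remember the old one. */
    locale_t old = uselocale(c_loc);

    size_t n = 0;
    while (isspace((unsigned char) s[n]))
        n++;

    /* Restore the previous locale before freeing the temporary one. */
    uselocale(old);
    freelocale(c_loc);
    return n;
}
```

Because uselocale() affects only the calling thread, this is safer in
multithreaded programs than a temporary setlocale(LC_ALL, "C").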

> > > However (under GNU/Linux):
> > > 
> > > - If that locale is not enabled in the system, gettext will fail to
> > >   use it.
> > >   (I typically use 'dpkg-reconfigure locales' to enable a locale.)
> > 
> > The reason for this design choice in glibc is that gettext and LC_MESSAGES
> > is only part of internationalization.

The other reason is POSIX compliance. POSIX says that
  "For C-language programs, the POSIX locale shall be the
   default locale when the setlocale() function is not called."
And it is indistinguishable for the gettext functions whether your
program has called setlocale with a nonexistent locale or whether it
has not called setlocale at all. glibc is POSIX compliant and therefore
gettext() in glibc produces no translation in this case.
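A program can at least detect this situation through the return value
of setlocale(). A small sketch, assuming glibc; the helper name is
hypothetical:

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the named locale could be selected
   for LC_MESSAGES, 0 if it is not available on this system. When
   setlocale() fails it returns NULL and leaves the locale unchanged. */
int select_messages_locale(const char *name)
{
    return setlocale(LC_MESSAGES, name) != NULL;
}
```

When the helper returns 0 in a program that has not selected any other
locale, the program is still in the "C" locale, and gettext() in glibc
returns the untranslated strings.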

gettext() in libintl takes more liberty, violates POSIX, and produces
a translation according to the environment variables, regardless of the
setlocale() value. This is necessary because on non-glibc platforms,
users often have no way to install additional locales.

> I think it would be nice to still translate the message strings when
> the requested locale is not installed.

It would violate POSIX. The glibc developers decided against it.

> My concern is about simple language selection by the user.

Users can use the languages for which they have installed a locale.
It's that simple.

Some languages even require more than the locale: input methods for
Vietnamese, fonts for Japanese, etc. For this reason, distributions
often have a per-language "language support" package that includes
all that's needed: locale, X locale support, input method, fonts, etc.

> (Btw, Debian has a 'locales-all' package now, with all locales
> pre-generated, but that's fairly recent, not installed by default, and
> I don't know about other distros / Unices, so I'm not sure that's a
> good-enough work-around.)

On openSUSE, which has a long tradition of desktop support and of
internationalization (it originated in Germany), over 400 locales are
installed by default:
  $ locale -a | wc -l
  442

Debian's culture is more focused on hackers than on desktop users. And
its principle is to install the minimum number of files. That
explains the difference, I think.

> In this particular case, users will want to try a game's translation
> (shipped as a .mo file), but to do that, they will have to configure
> the matching locale, which:
> - they don't know the mere existence of
> - is long (locales generation)
> - is complex (no simple GUI for my proverbial grand'ma)

For some distros it's as simple as launching the package installer,
selecting the "language support" package for language XY, and clicking OK.

If other distros make it harder than that, it's a problem with these
distros.

> this technique requires that the
> en_US.UTF-8 is installed, which typically is not the case for
> non-english GNU/Linux installs!

It would be good if every distribution installed at least the
en_US.UTF-8 locale, unconditionally. Feel free to lobby for it.

> Does it means that Windows essentially have all locales for all
> languages pre-installed (or created on request)?

True for some locales (those which don't require extra encodings or fonts).
For CJK locales, the user has to install them explicitly.

> My question is about an easy way for the user to select a translation.

It is described here:
<http://www.gnu.org/software/gettext/manual/html_node/Users.html>
The typical user is a user who knows his preferred language already at
installation time.

Bruno


