POSIX gettext() and the installation directories for .mo files
From: Bruno Haible
Subject: POSIX gettext() and the installation directories for .mo files
Date: Tue, 04 May 2021 00:42:38 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-206-generic; KDE/5.18.0; x86_64; ; )
https://posix.rhansen.org/p/gettext_split
says (lines 77..79):
"For each locale name in LANGUAGE, or if LANGUAGE is not set or is
empty, or no suitable messages object is found in processing LANGUAGE,
the pathname used to locate the messages object shall be
dirname/localename/categoryname/textdomainname.mo, where:
...
For the LANGUAGE search, the localename part is each locale name from
LANGUAGE in turn. For the single-locale search, the localename part
is the name of the current locale, or the locale specified in an *_l()
function call, for the category named by categoryname."
This is NOT how GNU gettext behaves. If POSIX standardizes it like this,
GNU libc and GNU gettext will have to choose between
(a) looking in different (and fewer) directories than they do today,
causing major i18n breakage for users, until the users
have set up lots of symbolic links between directories, or
(b) violating POSIX on this point.
I will vote for (b).
Namely, what GNU gettext does is to look in SEVERAL (not ONE) directories
per LANGUAGE element.
The localename parts of these directories are constructed from the language
identifier (element of LANGUAGE) or locale name. For example:
* The language identifier 'de' gives rise to the localename part
de
* The language identifier 'de_AT' gives rise to the localename parts
de_AT
de
* The locale name 'de_AT.UTF-8' gives rise to the localename parts
de_AT.UTF-8
de_AT.utf8
de_AT
de.UTF-8
de.utf8
de
* The locale name 'uz_UZ.UTF-8@cyrillic' gives rise to the localename parts
uz_UZ.UTF-8@cyrillic
uz_UZ.utf8@cyrillic
uz_UZ@cyrillic
uz.UTF-8@cyrillic
uz.utf8@cyrillic
uz@cyrillic
uz_UZ.UTF-8
uz_UZ.utf8
uz_UZ
uz.UTF-8
uz.utf8
uz
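The expansion in the examples above can be sketched in a few lines of Python. This is only an illustration that reproduces the four lists shown, not GNU gettext's actual code: the function name localename_candidates is hypothetical, and the codeset normalization used here (lowercase, drop non-alphanumeric characters) is a simplifying assumption that is sufficient for 'UTF-8' -> 'utf8'.

```python
import re

def localename_candidates(name):
    """Return the localename parts to try for one language identifier or
    locale name, most specific first (hypothetical helper, see lead-in)."""
    # Split into language[_territory][.codeset][@modifier].
    m = re.fullmatch(r"([^_.@]+)(?:_([^.@]+))?(?:\.([^@]+))?(?:@(.+))?", name)
    if m is None:
        return [name]  # not in the expected shape: try it unchanged
    language, territory, codeset, modifier = m.groups()

    # Codeset variants: as given, normalized, then none at all.
    codesets = []
    if codeset:
        codesets.append(codeset)
        # Assumption: normalization = lowercase and keep only [0-9a-z].
        normalized = re.sub(r"[^0-9a-z]", "", codeset.lower())
        if normalized and normalized != codeset:
            codesets.append(normalized)
    codesets.append(None)

    territories = [territory, None] if territory else [None]
    modifiers = [modifier, None] if modifier else [None]

    result = []
    # The modifier is dropped last, the codeset first, as in the examples.
    for mod in modifiers:
        for terr in territories:
            for cs in codesets:
                part = language
                if terr:
                    part += "_" + terr
                if cs:
                    part += "." + cs
                if mod:
                    part += "@" + mod
                if part not in result:
                    result.append(part)
    return result

for example in ("de", "de_AT", "de_AT.UTF-8", "uz_UZ.UTF-8@cyrillic"):
    print(example, "->", " ".join(localename_candidates(example)))
```

Each returned part would then be substituted for localename in dirname/localename/categoryname/textdomainname.mo, and the directories tried in that order.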
This list of directories is important for people who live in communities
which often (but not always) have translations of their own but can read
translations for other locales. In the examples above:
* A user in Austria prefers translations for Austrian German, but can
also read German with no problem.
* A user in Uzbekistan may prefer translations in Cyrillic but can also
read translations in Latin. [1]
If the above text were adopted, it would have the following consequences:
1) Many symbolic links are needed in /usr/share/locale/. Solaris 11.4
is a system that implements gettext() as described in the above text,
and it has the links shown below [2].
2) Users who want to create a new locale (e.g. for English in Australia)
will have to create a symlink
/usr/share/locale/en_AU -> /usr/share/locale/en
and so on for each custom locale.
3) Users who install packages in non-privileged directories (for GNU
programs, that's the --prefix=PREFIX option) will have to create the
same number of symbolic links in their PREFIX/share/locale/ directory.
4) Users will have to set fallback logic in their LANGUAGE environment
variable
LANGUAGE=de_AT:de_DE
instead of having it built-in:
LANGUAGE=de_AT
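To make consequence 4) concrete, here is a small Python sketch contrasting the two lookup styles. It is a deliberately simplified model, not either implementation: the function names are hypothetical, the set "installed" stands in for the directories present under dirname, and the built-in fallback only strips the territory (the full expansion also handles codeset and modifier).

```python
def strict_search(language_env, installed):
    """One directory per LANGUAGE element, as in the quoted text."""
    for name in language_env.split(":"):
        if name in installed:
            return name
    return None

def fallback_search(language_env, installed):
    """Several directories per LANGUAGE element (simplified: the element
    itself, then the same name with the territory stripped)."""
    for name in language_env.split(":"):
        candidates = [name]
        if "_" in name:
            candidates.append(name.split("_", 1)[0])
        for candidate in candidates:
            if candidate in installed:
                return candidate
    return None

# Assumed directory layout: 'de' is real, 'de_DE' is a symlink to it.
installed = {"de", "de_DE"}

print(strict_search("de_AT", installed))          # nothing found
print(strict_search("de_AT:de_DE", installed))    # found via the user's own fallback
print(fallback_search("de_AT", installed))        # found via the built-in fallback
```

With the strict lookup, the user has to spell out the fallback in LANGUAGE; with the built-in fallback, LANGUAGE=de_AT alone is enough.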
This is BAD, BAD, BAD.
Bruno
[1] https://en.wikipedia.org/wiki/Uzbek_alphabet
[2]
$ ls -l /usr/share/locale
total 102
drwxr-xr-x 3 root other 3 Oct 13 2018 C
drwxr-xr-x 3 root other 4 Oct 13 2018 de
lrwxrwxrwx 1 root root 2 Oct 13 2018 de_DE -> de
lrwxrwxrwx 1 root root 2 Oct 13 2018 de_DE.ISO8859-1 -> de
lrwxrwxrwx 1 root root 2 Oct 13 2018 de_DE.ISO8859-15 -> de
lrwxrwxrwx 1 root root 2 Oct 13 2018 de_DE.UTF-8 -> de
lrwxrwxrwx 1 root root 2 Oct 13 2018 de.ISO8859-15 -> de
drwxr-xr-x 3 root other 3 Oct 13 2018 de.us-ascii
lrwxrwxrwx 1 root root 2 Oct 13 2018 de.UTF-8 -> de
drwxr-xr-x 3 root other 3 Oct 13 2018 en
drwxr-xr-x 3 root other 3 Oct 13 2018 en_US
drwxr-xr-x 3 root other 3 Oct 13 2018 en@boldquot
drwxr-xr-x 3 root other 3 Oct 13 2018 en@quot
drwxr-xr-x 3 root other 3 Oct 13 2018 en@shaw
drwxr-xr-x 3 root other 4 Oct 13 2018 es
drwxr-xr-x 3 root other 3 Oct 13 2018 es_ES
lrwxrwxrwx 1 root root 2 Oct 13 2018 es_ES.ISO8859-1 -> es
lrwxrwxrwx 1 root root 2 Oct 13 2018 es_ES.ISO8859-15 -> es
lrwxrwxrwx 1 root root 2 Oct 13 2018 es_ES.UTF-8 -> es
lrwxrwxrwx 1 root root 2 Oct 13 2018 es.ISO8859-15 -> es
lrwxrwxrwx 1 root root 2 Oct 13 2018 es.UTF-8 -> es
drwxr-xr-x 3 root other 4 Oct 13 2018 fr
lrwxrwxrwx 1 root root 2 Oct 13 2018 fr_FR -> fr
lrwxrwxrwx 1 root root 2 Oct 13 2018 fr_FR.ISO8859-1 -> fr
lrwxrwxrwx 1 root root 2 Oct 13 2018 fr_FR.ISO8859-15 -> fr
lrwxrwxrwx 1 root root 2 Oct 13 2018 fr_FR.UTF-8 -> fr
lrwxrwxrwx 1 root root 2 Oct 13 2018 fr.ISO8859-15 -> fr
lrwxrwxrwx 1 root root 2 Oct 13 2018 fr.UTF-8 -> fr
drwxr-xr-x 3 root other 4 Oct 13 2018 it
lrwxrwxrwx 1 root root 2 Oct 13 2018 it_IT -> it
lrwxrwxrwx 1 root root 2 Oct 13 2018 it_IT.ISO8859-1 -> it
lrwxrwxrwx 1 root root 2 Oct 13 2018 it_IT.ISO8859-15 -> it
lrwxrwxrwx 1 root root 2 Oct 13 2018 it_IT.UTF-8 -> it
lrwxrwxrwx 1 root root 2 Oct 13 2018 it.ISO8859-15 -> it
lrwxrwxrwx 1 root root 2 Oct 13 2018 it.UTF-8 -> it
drwxr-xr-x 3 root other 4 Oct 13 2018 ja
lrwxrwxrwx 1 root root 2 Oct 13 2018 ja_JP.eucJP -> ja
lrwxrwxrwx 1 root root 2 Oct 13 2018 ja_JP.PCK -> ja
lrwxrwxrwx 1 root root 2 Oct 13 2018 ja_JP.UTF-8 -> ja
drwxr-xr-x 3 root other 4 Oct 13 2018 ko
lrwxrwxrwx 1 root root 2 Oct 13 2018 ko_KR.EUC -> ko
lrwxrwxrwx 1 root root 2 Oct 13 2018 ko_KR.UTF-8 -> ko
lrwxrwxrwx 1 root root 2 Oct 13 2018 ko.UTF-8 -> ko
drwxr-xr-x 3 root other 4 Oct 13 2018 pt
drwxr-xr-x 3 root other 4 Oct 13 2018 pt_BR
lrwxrwxrwx 1 root root 5 Oct 13 2018 pt_BR.ISO8859-1 -> pt_BR
drwxr-xr-x 3 root other 3 Oct 13 2018 pt_BR.us-ascii
lrwxrwxrwx 1 root root 5 Oct 13 2018 pt_BR.UTF-8 -> pt_BR
lrwxrwxrwx 1 root root 2 Oct 13 2018 pt.ISO8859-15 -> pt
drwxr-xr-x 3 root other 3 Oct 13 2018 pt.us-ascii
lrwxrwxrwx 1 root root 5 Oct 13 2018 zh -> zh_CN
drwxr-xr-x 3 root other 4 Oct 13 2018 zh_CN
lrwxrwxrwx 1 root root 5 Oct 13 2018 zh_CN.EUC -> zh_CN
lrwxrwxrwx 1 root root 5 Oct 13 2018 zh_CN.GB18030 -> zh_CN
lrwxrwxrwx 1 root root 5 Oct 13 2018 zh_CN.GBK -> zh_CN
lrwxrwxrwx 1 root root 5 Oct 13 2018 zh_CN.UTF-8 -> zh_CN
drwxr-xr-x 3 root other 4 Oct 13 2018 zh_TW
lrwxrwxrwx 1 root root 5 Oct 13 2018 zh_TW.BIG5 -> zh_TW
lrwxrwxrwx 1 root root 5 Oct 13 2018 zh_TW.EUC -> zh_TW
lrwxrwxrwx 1 root root 5 Oct 13 2018 zh_TW.UTF-8 -> zh_TW
lrwxrwxrwx 1 root root 5 Oct 13 2018 zh.GBK -> zh_CN
lrwxrwxrwx 1 root root 5 Oct 13 2018 zh.UTF-8 -> zh_CN