bug-gnu-utils
Re: gettext locale environment variable documentation


From: Bruno Haible
Subject: Re: gettext locale environment variable documentation
Date: Sun, 3 Jun 2007 12:44:29 +0200
User-agent: KMail/1.5.4

Karl Berry wrote:
> Third, also in the gettext grok node, there is this item:
>    3. `LC_xxx', according to selected locale 
> I don't understand why the phrase "according to the selected locale" is
> there.  I surmised that xxx meant names like "MESSAGES", e.g.,
> "LC_MESSAGES".  How is that locale-dependent?  Does "MESSAGES" get
> translated (LC_ANZEIGEN, babelfish tells me :)?!  Can you explain,
> please?

No wonder that you don't understand this: the manual confuses the terms
"locale" and "locale category". And moreover, a "locale category" is not
a category of locales... Find attached a doc change that should make it
clearer.

Thanks for reporting this major doc bug.
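
To make the distinction concrete: a locale category (LC_MESSAGES, LC_TIME, ...) names an area of conventions, and a locale name (de_DE, fr_FR, ...) is the value chosen for that area. The following is a minimal Python sketch, not part of the patch, of how the locale name for one category is looked up from the environment. The function name `effective_locale` and the sample values are hypothetical. The order LC_ALL, then LC_xxx, then LANG follows the usual POSIX rules; the LANGUAGE variable, which gettext consults first for messages, is deliberately left out, and the final fallback to "C" is an assumption (POSIX leaves the default implementation-defined).

```python
def effective_locale(category, env):
    """Return the locale name that applies to one locale category.

    `category` is the name of a locale category such as "LC_MESSAGES";
    `env` is a mapping of environment variable names to values.
    """
    # "LC_xxx" in the manual means: the variable named after the category.
    for variable in ("LC_ALL", category, "LANG"):
        value = env.get(variable)
        if value:  # unset and empty variables are both skipped
            return value
    return "C"  # assumed default when nothing is set


env = {"LANG": "de_DE.UTF-8", "LC_MESSAGES": "fr_FR.UTF-8"}
print(effective_locale("LC_MESSAGES", env))  # fr_FR.UTF-8
print(effective_locale("LC_TIME", env))      # de_DE.UTF-8
```

Here the same environment yields different locale names for different locale categories, which is what "according to selected locale category" is meant to express.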

> P.S. Maybe also worth mentioning that "POSIX" is the same as "C" for a
> locale name?

I don't think it's worth mentioning: I've never seen people using the
"POSIX" locale. They all prefer to use the "C" locale, since it's identical
and shorter to write.

Bruno


--- gettext.texi        27 May 2007 21:07:56 -0000      1.124
+++ gettext.texi        3 Jun 2007 10:35:44 -0000
@@ -724,7 +724,7 @@
 termed the country's locale.  The locale represents the knowledge
 needed to support the country's native attributes.
 
-@cindex locale facets
+@cindex locale categories
 There are a few major areas which may vary between countries and
 hence, define what a locale must describe.  The following list helps
 putting multi-lingual messages into the proper context of other tasks
@@ -736,7 +736,7 @@
 @cindex codeset
 @cindex encoding
 @cindex character encoding
-@cindex locale facet, LC_CTYPE
+@cindex locale category, LC_CTYPE
 
 The codeset most commonly used through out the USA and most English
 speaking parts of the world is the ASCII codeset.  However, there are
@@ -751,7 +751,7 @@
 
 @item Currency
 @cindex currency symbols
-@cindex locale facet, LC_MONETARY
+@cindex locale category, LC_MONETARY
 
 The symbols used vary from country to country as does the position
 used by the symbol.  Software needs to be able to transparently
@@ -759,7 +759,7 @@
 
 @item Dates
 @cindex date format
-@cindex locale facet, LC_TIME
+@cindex locale category, LC_TIME
 
 The format of date varies between locales.  For example, Christmas day
 in 1994 is written as 12/25/94 in the USA and as 25/12/94 in Australia.
@@ -772,7 +772,7 @@
 
 @item Numbers
 @cindex number format
-@cindex locale facet, LC_NUMERIC
+@cindex locale category, LC_NUMERIC
 
 Numbers can be represented differently in different locales.
 For example, the following numbers are all written correctly for
@@ -791,7 +791,7 @@
 
 @item Messages
 @cindex messages
-@cindex locale facet, LC_MESSAGES
+@cindex locale category, LC_MESSAGES
 
 The most obvious area is the language support within a locale.  This is
 where GNU @code{gettext} provides the means for developers and users to
@@ -800,6 +800,17 @@
 
 @end table
 
+@cindex locale categories
+These areas of cultural conventions are called @emph{locale categories}.
+It is an unfortunate term; @emph{locale aspects} or @emph{locale feature
+categories} would be a better term, because each ``locale category''
+describes an area or task that requires localization.  The concrete data
+that describes the cultural conventions for such an area and for a particular
+culture is also called a @emph{locale category}.  In this sense, a locale
+is composed of several locale categories: the locale category describing
+the codeset, the locale category describing the formatting of numbers,
+the locale category containing the translated messages, and so on.
+
 @cindex Linux
 Components of locale outside of message handling are standardized in
 the ISO C standard and the SUSV2 specification.  GNU @code{libc}
@@ -1584,11 +1595,11 @@
 @file{config.h} or by the Makefile.  For now consult the @code{gettext}
 or @code{hello} sources for more information.
 
-@cindex locale facet, LC_ALL
-@cindex locale facet, LC_CTYPE
+@cindex locale category, LC_ALL
+@cindex locale category, LC_CTYPE
 The use of @code{LC_ALL} might not be appropriate for you.
 @code{LC_ALL} includes all locale categories and especially
-@code{LC_CTYPE}.  This later category is responsible for determining
+@code{LC_CTYPE}.  This latter category is responsible for determining
 character classes with the @code{isalnum} etc. functions from
 @file{ctype.h} which could especially for programs, which process some
 kind of input language, be wrong.  For example this would mean that a
@@ -1596,8 +1607,8 @@
 France but not in the U.S.
 
 Some systems also have problems with parsing numbers using the
address@hidden functions if an other but the @code{LC_ALL} locale is used.
-The standards say that additional formats but the one known in the
address@hidden functions if an other but the @code{LC_ALL} locale category is
+used.  The standards say that additional formats but the one known in the
 @code{"C"} locale might be recognized.  But some systems seem to reject
 numbers in the @code{"C"} locale format.  In some situation, it might
 also be a problem with the notation itself which makes it impossible to
@@ -1621,13 +1632,13 @@
 @end group
 @end example
 
-@cindex locale facet, LC_CTYPE
-@cindex locale facet, LC_COLLATE
-@cindex locale facet, LC_MONETARY
-@cindex locale facet, LC_NUMERIC
-@cindex locale facet, LC_TIME
-@cindex locale facet, LC_MESSAGES
-@cindex locale facet, LC_RESPONSES
+@cindex locale category, LC_CTYPE
+@cindex locale category, LC_COLLATE
+@cindex locale category, LC_MONETARY
+@cindex locale category, LC_NUMERIC
+@cindex locale category, LC_TIME
+@cindex locale category, LC_MESSAGES
+@cindex locale category, LC_RESPONSES
 @noindent
 On all POSIX conformant systems the locale categories @code{LC_CTYPE},
 @code{LC_MESSAGES}, @code{LC_COLLATE}, @code{LC_MONETARY},
@@ -5281,8 +5292,8 @@
 returned.  If the argument is @code{NULL} the result is undefined.
 
 One thing which should come into mind is that no explicit dependency to
-the used domain is given.  The current value of the domain for the
address@hidden locale is used.  If this changes between two
+the used domain is given.  The current value of the domain is used.
+If this changes between two
 executions of the same @code{gettext} call in the program, both calls
 reference a different message catalog.
 
@@ -5322,7 +5333,7 @@
 
 Both take an additional argument at the first place, which corresponds
 to the argument of @code{textdomain}.  The third argument of
-@code{dcgettext} allows to use another locale but @code{LC_MESSAGES}.
+@code{dcgettext} allows to use another locale category but @code{LC_MESSAGES}.
 But I really don't know where this can be useful.  If the
 @var{domain_name} is @code{NULL} or @var{category} has an value beside
 the known ones, the result is undefined.  It should also be noted that
@@ -5364,8 +5375,8 @@
 files.  The way usually used in Unix environments is have this encoding
 in the file name.  This is also done here.  The directory name given in
 @code{bindtextdomain}s second argument (or the default directory),
-followed by the value and name of the locale and the domain name are
-concatenated:
+followed by the name of the locale, the locale category, and the domain name
+are concatenated:
 
 @example
 @var{dir_name}/@var{locale}/address@hidden/@var{domain_name}.mo
@@ -5378,18 +5389,19 @@
 @end example
 
 @noindent
address@hidden is the value of the locale whose name is this
address@hidden is the name of the locale category which is designated by
 @address@hidden  For @code{gettext} and @code{dgettext} this
 @address@hidden is always @address@hidden
 system, eg Ultrix, don't have @code{LC_MESSAGES}.  Here we use a more or
 less arbitrary value for it, namely 1729, the smallest positive integer
 which can be represented in two different ways as the sum of two cubes.}
-The value of the locale is determined through
+The name of the locale category is determined through
 @code{setlocale (address@hidden, NULL)}.
 @footnote{When the system does not support @code{setlocale} its behavior
 in setting the locale values is simulated by looking at the environment
 variables.}
-@code{dcgettext} specifies the locale category by the third argument.
+When using the function @code{dcgettext}, you can specify the locale category
+through the third argument.
 
 @node Charset conversion, Contexts, Locating Catalogs, gettext
 @subsection How to specify the output character set @code{gettext} uses
@@ -5510,7 +5522,7 @@
 These are generalizations of @code{pgettext}.  They behave similarly to
 @code{dgettext} and @code{dcgettext}, respectively.  The @var{domain_name}
 argument defines the translation domain.  The @var{category} argument
-allows to use another locale facet than @code{LC_MESSAGES}.
+allows to use another locale category than @code{LC_MESSAGES}.
 
 As as example consider the following fictional situation.  A GUI program
 has a menu bar with the following entries:
@@ -6254,7 +6266,7 @@
 @vindex address@hidden, environment variable}
 @vindex address@hidden, environment variable}
 @vindex address@hidden, environment variable}
address@hidden @code{LC_xxx}, according to selected locale
address@hidden @code{LC_xxx}, according to selected locale category
 @vindex address@hidden, environment variable}
 @item @code{LANG}
 @end enumerate
@@ -6394,7 +6406,7 @@
 @code{libintl} code with their software.
 
 Message catalog support is however only the tip of the iceberg.
-What about the data for the other locale categories.  They also have
+What about the data for the other locale categories?  They also have
 a number of deficiencies.  Are we going to abandon them as well and
 develop another duplicate set of routines (should @code{libintl}
 expand beyond message catalog support)?
@@ -8262,7 +8274,7 @@
 @code{dcgettext}, @code{dcngettext} available from within the language.
 These functions are less often used, but are nevertheless necessary for
 particular purposes: @code{ngettext} for correct plural handling, and
address@hidden and @code{dcngettext} for obeying other locale
address@hidden and @code{dcngettext} for obeying other locale-related
 environment variables than @code{LC_MESSAGES}, such as @code{LC_TIME} or
 @code{LC_MONETARY}.  For these latter functions, you need to make the
 @code{LC_*} constants, available in the C header @code{<locale.h>},
@@ -8281,7 +8293,7 @@
 You should either perform a @code{setlocale (LC_ALL, "")} call during
 the startup of your language runtime, or allow the programmer to do so.
 Remember that gettext will act as a no-op if the @code{LC_MESSAGES} and
address@hidden locale facets are not both set.
address@hidden locale categories are not both set.
 
 @item
 A programmer should have a way to extract translatable strings from a
@@ -8419,7 +8431,7 @@
 translation of @code{"%d"} can be @code{"%Id"}.  The effect of this flag,
 on systems with GNU @code{libc}, is that in the output, the ASCII digits are
 replaced with the @samp{outdigits} defined in the @code{LC_CTYPE} locale
-facet.  On other systems, the @code{gettext} function removes this flag,
+category.  On other systems, the @code{gettext} function removes this flag,
 so that it has no effect.
 
 Note that the programmer should @emph{not} put this flag into the
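
As an illustration of the catalog file name described in the "Locating Catalogs" hunk above (directory name, name of the locale, name of the locale category, and domain name concatenated), here is a minimal Python sketch. It is not part of the patch. The function name `catalog_path` and the sample arguments are hypothetical, and the sketch builds only a single path; a real `gettext` implementation may also try shortened variants of the locale name, which is not modelled here.

```python
import posixpath


def catalog_path(dir_name, locale_name, category_name, domain_name):
    """Build dir_name/locale/LC_category/domain_name.mo.

    `locale_name` is the value selected for the locale category
    (for example "de"), and `category_name` is the name of that
    locale category (for example "LC_MESSAGES").
    """
    return posixpath.join(dir_name, locale_name, category_name,
                          domain_name + ".mo")


# gettext and dgettext always use the LC_MESSAGES category:
print(catalog_path("/usr/share/locale", "de", "LC_MESSAGES", "hello"))
# /usr/share/locale/de/LC_MESSAGES/hello.mo

# With dcgettext the third argument selects another category:
print(catalog_path("/usr/share/locale", "de", "LC_TIME", "hello"))
# /usr/share/locale/de/LC_TIME/hello.mo
```

The two path components play different roles: the first names the locale, the second names the locale category, which is the distinction the patch introduces into the manual's wording.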




