bug-gnu-utils

Re: supporting obscure languages


From: John Cowan
Subject: Re: supporting obscure languages
Date: Fri, 27 Nov 2009 13:17:20 -0500
User-agent: Mutt/1.5.13 (2006-08-11)

Bruno Haible scripsit:

> 1) You need to define a locale identifier for it. This is important,
>    because the users and all translators must agree on it - if a
>    translator uses a different identifier than the user, her
>    translations will not be found. The standardized identifiers
>    are those in ISO 639-1 and ISO 639-2, and also found in glibc's
>    glibc/locale/iso-639.def.

Is there any reason why ISO 639-3 identifiers cannot be used for
appropriate languages?  639-3 is much more comprehensive than 639-2, and
the identifiers correspond (that is, since 'haw' is Hawaiian in 639-2,
it has the same meaning in 639-3).

>    If your language is a distinct one, you should find the language
>    identifier in this list. If your language is a dialect of another
>    language, you can use a variant tag. For example, if by "zam" you
>    mean the language "Zapotec, Miahuatlán", it is a dialect of
>    Zapotec, which has the identifier "zap". So you will likely
>    choose the language identifier "address@hidden" (all ASCII please).

"Zapotec" is what 639-3 calls a macrolanguage: that is, it is a collection
of closely related languages that is for some purposes treated as a
single language.  The Zapotec macrolanguage encompasses 58 languages.
I emphasize that these are distinct languages, not at all mutually
intelligible.  Calling them "dialects of Zapotec" is exactly like calling
French, Spanish, and Italian "dialects of Latin": it reflects an old
unity that has long since been lost.

Furthermore, no one Zapotec language is either numerically or culturally
dominant: Isthmus Zapotec (zai), the largest, has perhaps 85,000 speakers
out of a total Zapotec-speaking population of 500,000.  This makes Zapotec
quite different from better-known macrolanguages such as Arabic (which
encompasses about 30 languages, with Standard Arabic culturally but
not numerically dominant) and Chinese (which encompasses 13 languages,
with Mandarin both culturally and numerically dominant).
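
To make the macrolanguage relationship concrete, here is a minimal sketch
in Python.  The table is a tiny hand-picked subset of the 639-3
macrolanguage mappings (the real table is far larger), and the names
MACROLANGUAGE_OF, macrolanguage_of, and members_of are hypothetical,
chosen only for this illustration.

```python
# Illustrative subset only: individual-language code -> macrolanguage code.
# The full ISO 639-3 macrolanguage table has many more entries.
MACROLANGUAGE_OF = {
    "zai": "zap",  # Isthmus Zapotec    -> Zapotec
    "zam": "zap",  # Miahuatlán Zapotec -> Zapotec
    "arb": "ara",  # Standard Arabic    -> Arabic
    "cmn": "zho",  # Mandarin Chinese   -> Chinese
}


def macrolanguage_of(code):
    """Return the macrolanguage code for `code`, or None if the table
    lists no macrolanguage for it (e.g. 'haw', Hawaiian)."""
    return MACROLANGUAGE_OF.get(code)


def members_of(macro):
    """Return the individual-language codes (from this subset) that
    belong to the macrolanguage `macro`, in sorted order."""
    return sorted(c for c, m in MACROLANGUAGE_OF.items() if m == macro)
```

In this subset, "zai" and "zam" are separate entries that merely share
the parent "zap"; neither one is a variant of the other.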

In short, unless there is some technical barrier to using 639-3 code
elements, it is more appropriate to code this language as "zam" rather
than as "address@hidden".
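
As a rough illustration of why users and translators must agree on the
identifier, the sketch below mimics the conventional gettext catalog
layout, <localedir>/<language>/LC_MESSAGES/<domain>.mo.  It is a
simplification under stated assumptions: the helper names, the "hello"
domain, and the /usr/share/locale directory are made up for the example,
and real gettext implementations also try fallbacks (such as dropping a
territory or modifier) that are ignored here.

```python
from pathlib import PurePosixPath


def catalog_path(localedir, language, domain):
    """Build the conventional catalog path for a language identifier."""
    return PurePosixPath(localedir) / language / "LC_MESSAGES" / f"{domain}.mo"


def find_catalog(installed, localedir, language, domain):
    """Return the catalog path if it is among the installed paths,
    else None.  No fallback logic: the identifier must match exactly."""
    path = catalog_path(localedir, language, domain)
    return path if path in installed else None


# Hypothetical setup: the translator installed the catalog under "zam".
installed = {catalog_path("/usr/share/locale", "zam", "hello")}

found = find_catalog(installed, "/usr/share/locale", "zam", "hello")
missed = find_catalog(installed, "/usr/share/locale", "zap", "hello")
```

Here `found` is the path of the installed catalog, while `missed` is None:
a user whose locale names the language differently from the translator
simply does not see the translations.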

-- 
The Imperials are decadent, 300 pound   John Cowan <address@hidden>
free-range chickens (except they have   http://www.ccil.org/~cowan
teeth, arms instead of wings, and
dinosaurlike tails).                        --Elyse Grasso



