bug-gnu-utils
Re: German uppercasing rules (was: supporting obscure languages)


From: Bruno Haible
Subject: Re: German uppercasing rules (was: supporting obscure languages)
Date: Sat, 28 Nov 2009 21:11:25 +0100
User-agent: KMail/1.9.9

Albert Cahalan wrote:
> Maybe round-trip the case for U+1E9E, avoiding expansion troubles.

Unicode 5.1 introduced the character U+1E9E "LATIN CAPITAL LETTER SHARP S",
but the habits in Germany have not changed. The upper-case variant of "Ruß"
is still "RUSS". German people don't care about whether this round-trips
or not. "ß" uppercases to "SS". It has been like this for centuries.

Therefore if you want your program to do case conversions right for German
(and Turkish, Greek, Lithuanian etc.), you need to perform case conversions
on entire strings, not merely on characters one by one. In C programs,
you can use GNU libunistring [1] for this purpose. It has all the special cases
built-in.
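
As a small illustration of the point above, here is a sketch in Python
rather than C, and without libunistring. It relies only on Python's built-in
str.upper(), which applies Unicode's unconditional full case mappings but is
not locale-aware, so it shows the "ß" case and nothing about the Turkish or
Lithuanian rules. The helper upper_per_char is a hypothetical stand-in for
any converter restricted to one output character per input character.

```python
# Illustration only: str.upper() is not locale-aware, so only the
# German "ß" case is demonstrated here.

def upper_per_char(text: str) -> str:
    """Hypothetical one-to-one mapper: a character is left unchanged
    whenever its uppercase form is not exactly one character long."""
    out = []
    for ch in text:
        up = ch.upper()
        out.append(up if len(up) == 1 else ch)
    return "".join(out)

word = "Ruß"
whole = word.upper()            # conversion applied to the entire string
naive = upper_per_char(word)    # conversion limited to one char per char

print(whole, len(word), len(whole))   # the result is longer than the input
print(naive)                          # "ß" survives unconverted
print(whole.lower() == word.lower())  # False: the conversion does not round-trip
print("\u1e9e".lower() == "ß")        # U+1E9E itself lowercases to "ß"
```

The string-level result is one character longer than the input, which is
exactly what a character-by-character interface cannot express.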

Bruno

[1] http://www.gnu.org/software/libunistring/



