bug-gnu-utils

Re: German uppercasing rules (was: supporting obscure languages)


From: Albert Cahalan
Subject: Re: German uppercasing rules (was: supporting obscure languages)
Date: Sat, 28 Nov 2009 16:36:39 -0500

On Sat, Nov 28, 2009 at 3:11 PM, Bruno Haible <address@hidden> wrote:
> Albert Cahalan wrote:

>> Maybe round-trip the case for U+1E9E, avoiding expansion troubles.
>
> Unicode 5.1 has introduced the character U+1E9E "LATIN CAPITAL LETTER SHARP S",
> but the habits in Germany have not changed. The upper-case variant of "Ruß"
> is still "RUSS". German people don't care about whether this round-trips
> or not. "ß" uppercases to "SS". It has been like this for centuries.

Germans with "ß" in their last name are people too, and they care.
U+1E9E exists solely because there is real evidence that people care.
It is pretty common to uppercase "ß" as itself; clearly people care.

Sooner or later, a address@hidden locale will be demanded.
German rules have changed a number of times in the 1900s, and
they certainly can change again.

In any case, you won't be getting "SS" out of towupper.

> Therefore if you want your program to do case conversions right for German
> (and Turkish, Greek, Lithuanian etc.), you need to perform case conversions
> on entire strings, not merely on characters one by one. In C programs,
> you can use GNU libunistring [1] for this purpose. It has all the special cases
> built-in.

Yes, of course, but that doesn't work for towupper.
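
To make the limitation concrete: towupper takes one wint_t and returns one
wint_t, so it has no way to return the two characters "SS". Here is a minimal
sketch of a string-level conversion that does allow the expansion. The name
upper_german_sketch is hypothetical (it is not part of libc or libunistring),
the code handles only the single "ß" special case, and it assumes wchar_t
holds Unicode code points, as it does with glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper for illustration only.
 * Uppercases the wide string src into dst, which has room for dst_cap
 * wide characters including the terminator.
 * Returns the number of wide characters written (excluding the
 * terminator), or (size_t)-1 if dst is too small.
 * Assumption: wchar_t values are Unicode code points. */
size_t upper_german_sketch(wchar_t *dst, size_t dst_cap, const wchar_t *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);

    for (; *src != L'\0'; src++) {
        if (*src == 0x00DF) {
            /* U+00DF LATIN SMALL LETTER SHARP S: one character in,
             * two characters out. towupper cannot express this. */
            if (n + 2 >= dst_cap)
                return (size_t)-1;
            dst[n++] = L'S';
            dst[n++] = L'S';
        } else {
            /* Everything else: the usual one-to-one mapping. */
            if (n + 1 >= dst_cap)
                return (size_t)-1;
            dst[n++] = (wchar_t)towupper((wint_t)*src);
        }
    }

    if (n >= dst_cap)
        return (size_t)-1;
    dst[n] = L'\0';
    return n;
}
```

The output can be longer than the input, so the caller has to supply a
buffer size and check for failure. That is exactly the interface change a
character-by-character function like towupper cannot accommodate.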

I hope libunistring doesn't impede the evolution of languages.



