Re: subtle sort bug?
From: gregory mott
Subject: Re: subtle sort bug?
Date: 03 Jul 2003 15:07:13 +0100

On Tue, 2003-07-01 at 23:18, Paul Eggert wrote:
> gregory mott <address@hidden> writes:
> > can you point me to an appropriate RTFM that ideally would layout what
> > encodings are used by what locales, or how to tell what encoding you
> > have/need, etc usw?
>
> Sorry, no; this stuff tends to be scattered around all over the place.
>
> On my Debian GNU/Linux 3.0r1 system, the file
> /usr/share/i18n/SUPPORTED lists the encodings used by locales, but
> things may be different on your system.
>
> For general info about encodings you might try www.li18nux.org and/or
> Ken Lunde's book on encodings and character sets
> <http://www.praxagora.com/lunde/cjkv-ip.html>.

i've read things hither and yon, but i remain in the dark.

when i pass textual input to sort, how does sort decide or infer
the encoding?

you seem to say that a locale is associated with a particular encoding.
well, hmm. on Red Hat 9, the locale definitions (e.g.
/usr/share/i18n/locales/en_IN) appear to be in unicode. i do not see
where a locale becomes associated with any particular encoding (such as
UTF-8 or ISO-8859-15).
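
For what it's worth, here is a rough sketch of how one might check this from the shell, assuming a glibc-style system. sort does not inspect its input to guess an encoding; it takes the encoding from the LC_CTYPE setting in its environment, and `locale charmap` reports which encoding a given locale name resolves to. The helper name charmap_of is made up for this example, and whether en_AU.UTF-8 is installed on a given machine is an assumption. A name that is not installed will typically report the same charmap as the C locale.

```shell
#!/bin/sh
# Ask the locale utility which character encoding (charmap) a given
# locale name resolves to. Prints "unavailable" if the locale utility
# cannot be run on this system.
charmap_of() {
    LC_ALL="$1" locale charmap 2>/dev/null || echo unavailable
}

# The C locale is always present, so this gives a baseline to compare with.
c_charmap=$(charmap_of C)
printf 'C           -> %s\n' "$c_charmap"

# en_AU.UTF-8 is the name used in the post; whether it is installed
# here is an assumption.
printf 'en_AU.UTF-8 -> %s\n' "$(charmap_of en_AU.UTF-8)"
```

On glibc the part of the name after the dot is the charmap the locale was compiled with, which would explain why the source files under /usr/share/i18n/locales name no encoding themselves.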
it seems i can "fix" the en_AU "failure" by specifying:
$ LC_CTYPE=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8 sort /tmp/sos
groan
grosr
groß
grost
red
résumé
resumed
but that approach doesn't seem to help my personal locale definition:
$ LC_CTYPE=g.UTF-8 LC_COLLATE=g.UTF-8 sort /tmp/sos
groan
grosr
grost
groß
red
resumed
résumé
i fail to understand. i've used the same stock definitions:
# ---> /usr/share/i18n/locales/g <---
# build with:
# localedef -i g -c g
LC_CTYPE
copy "i18n"
END LC_CTYPE
LC_COLLATE
copy "iso14651_t1"
END LC_COLLATE
LC_TIME
d_fmt "<U0025><U0059><U002F><U0025><U006D><U002F><U0025><U0064>"
END LC_TIME
can you (or anyone) give me a clue?
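
One check that might narrow this down, offered as a sketch and not a diagnosis: the g.UTF-8 listing above has the same order as a plain byte-wise sort, which is what the C locale produces. The script below recreates the seven words in a temporary file (a stand-in for /tmp/sos, assumed to be UTF-8) and sorts them with LC_ALL=C.

```shell
#!/bin/sh
# Stand-in for /tmp/sos: the seven words from the post, written as UTF-8.
sample=$(mktemp)
printf '%s\n' groan grosr 'groß' grost red 'résumé' resumed > "$sample"

# LC_ALL overrides LC_CTYPE and LC_COLLATE. In the C locale sort compares
# raw bytes, so the multibyte characters (ß, é) sort after every ASCII letter.
byte_order=$(LC_ALL=C sort "$sample")
printf '%s\n' "$byte_order"

# Remove the temporary file.
rm -f "$sample"
```

If the order printed here matches the g.UTF-8 run, one possibility is that the name g.UTF-8 was not found and sort quietly fell back to the C locale. `locale -a` lists the names that are actually installed; if g.UTF-8 is not among them, building the locale with an explicit charmap (the -f option of localedef) might be worth a try.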