bug-textutils

Re: sort folds case depending on LANG environment variable


From: Edward Avis
Subject: Re: sort folds case depending on LANG environment variable
Date: Mon, 16 Oct 2000 10:06:23 +0100 (BST)

On Sun, 15 Oct 2000, Bob Proulx wrote:

[LANG=en_US sort behaves differently to LANG=en sort]

>This has become a common problem with respect to Red Hat systems.  But
>GNU sort itself has never had this problem.

So it's a bug in Red Hat?  I'll report it to their bugzilla
database... hmm, there are quite a few sort bugs in there.  Red Hat
seem to be highly skilled at breaking this program :-(

>The documentation does imply
>that locale specific comparisons are used, however.  Here is what it
>says:

>I am not sure how someone makes the connection between LC_COLLATE and
>LC_ALL and LANG but there is the implication that locales are used
>when comparing lines.

That's fair enough for comparing accented characters and other non-ASCII
things, but surely if the input is 7-bit ASCII it should be sorted in
plain ASCII order, the way you'd expect.

But it looks like the problem is the locales.  I don't know who decided
that 'en' should have case-sensitive sorting while 'en_US' should
not.  I would expect any locale for a character set that is a superset
of ASCII (e.g. Latin-1, UTF-8) to have a sorting order that extends
ASCII order, leaving the ASCII characters in their usual sequence.
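
To make the two orderings concrete, here is a minimal sketch in Python.
It is an illustration only: Python stands in for sort, the word list is
made up, and the case-folding key is a rough approximation of a
case-insensitive collation, not the actual rules of the en_US locale.

```python
# Hypothetical sample input, 7-bit ASCII only.
words = ["banana", "Apple", "cherry", "Banana", "apple", "Cherry"]

# Code-point order. For ASCII input this matches the traditional
# byte-wise order: every uppercase letter sorts before any lowercase one.
ascii_order = sorted(words)

# Approximation of a case-folding order (an assumption for illustration):
# compare case-insensitively first, then break ties on the original string.
folded_order = sorted(words, key=lambda w: (w.casefold(), w))

print(ascii_order)
print(folded_order)
```

The first list groups all capitalised words ahead of the lowercase ones;
the second interleaves them, which is the kind of difference reported
between the two LANG settings.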

>>I'm not really sure why sort is looking at $LANG at all - it's not a
>>tool which users expect to be locale-aware.

>Actually users _are_ expecting it to be locale aware.  For one POSIX
>requires it.  For another the Internet is a global entity and has
>moved away from US-ASCII only software.

I quite agree, but I don't think that internationalization should break
compatibility with the traditional behaviour.  Adding support for
different character sets is fine, but don't break ASCII in the process.

>Non-English speakers expect sorting to use the rules for their
>language.

But people who use sort - anglophone or otherwise - are probably
outweighed by the shell scripts, Makefiles and so on which run sort and
expect the standard ordering.  Also it is easier for people to change
their expectations than it is to change the mass of software relying
on the traditional ASCII ordering.
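
For software that does depend on the traditional ordering, one possible
way to guard against the caller's locale is to set LC_ALL=C for the sort
process only. The sketch below assumes a POSIX-style sort on the PATH;
the helper name sort_lines_c_locale is made up for this example.

```python
import os
import subprocess


def sort_lines_c_locale(lines):
    """Sort lines with the system sort, forcing the C locale.

    Hypothetical helper for illustration; assumes `sort` is on the PATH.
    """
    # LC_ALL takes precedence over LC_COLLATE and LANG, so the caller's
    # locale settings cannot change the collation used by the child process.
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["sort"],
        input="".join(line + "\n" for line in lines),
        capture_output=True,
        text=True,
        env=env,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    print(sort_lines_c_locale(["b", "A", "a", "B"]))
```

Only the child process sees the override, so the rest of the program
keeps whatever locale the user has chosen.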

>Compile and installing the latest GNU textutils software and I am sure
>you will find the problem resolved.

Will do.

-- 
Ed Avis
address@hidden




