bug-textutils

Re: textutils-2.1: sort bug


From: Randall Hopper
Subject: Re: textutils-2.1: sort bug
Date: Sat, 31 Aug 2002 14:57:05 -0400
User-agent: Mutt/1.3.28i

Hi Bob,

Thanks for the reply.

Bob Proulx:
 |Randall Hopper <address@hidden> [2002-08-31 10:32:29 -0400]:
 |> The GNU sort command (textutils-2.1) by default does this:
 |> 
 |>   --ignore-case
 |>   --dictionary-order
 |
 |Thanks for the report.  But you are mistaken.  That is not the default
 |behavior for sort.  Sort only behaves that way if specifically told to
 |behave that way.

Ok, but I'm not specifying these options.  And that's the behavior I'm
getting.

 |    Unless otherwise specified, all comparisons use the character
 |    collating sequence specified by the `LC_COLLATE' locale.
...
 |if LC_COLLATE is telling sort to sort differently then it _must_
 |comply.
 |
 |Actually sort just passes the problem off to the C library.  The C
 |library routine strcoll() does all of the work.  It is useful to read

True, I see what you're saying.  If we presume that strcoll can fulfill the
interface requirements advertised by the sort man page, then this follows.
But the problem is that it doesn't.  Sort's interface implies that
ignore-case and dictionary-order are not the defaults, yet they effectively
become the defaults under certain settings of the locale environment
variables.
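
To make the difference concrete, here is a rough Python sketch that sorts
the directory listing from my report two ways: by raw byte value (what
strcmp would give) and by the collation rules of a named locale (via
locale.strxfrm, which wraps the C library's strxfrm).  The helper names
byte_order and collated are made up for this illustration.  What the
en_US ordering looks like depends entirely on the C library's locale data,
and the locale may not even be installed, in which case collated returns
None.

```python
import locale

paths = [
    "./.gnupg/random_seed",
    "./HOUSE/BUYSELL/mortgage",
    "./HOUSE/TODO/20020824",
    "./HUMOR",
    "./.ICEauthority",
]

def byte_order(items):
    # Compare the raw bytes of each string, as strcmp() would.
    return sorted(items, key=lambda s: s.encode("utf-8"))

def collated(items, locale_name):
    # Hypothetical helper: sort using the LC_COLLATE rules of locale_name.
    # Returns None if that locale is not available on this system.
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(items, key=locale.strxfrm)
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)

if __name__ == "__main__":
    print("byte order:", byte_order(paths))
    print("C locale:  ", collated(paths, "C"))
    print("en_US:     ", collated(paths, "en_US"))  # may be None
```

In the "C" locale the two orderings agree, and the dot files cluster
together at the top.  That is the ordering I was expecting from sort.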

Perhaps an --ignore-locale option might be helpful, so that GNU sort's
options actually mean something, and it can be made to work like other UNIX
sorts.

 |In which case it is not a bug in sort but behavior which is specifically
 |required.

Right, but this is a derived requirement, one that follows only from sort's
self-imposed requirement to use strcoll.

 |> For example, when sorting directory listings, you end up getting less
 |> useful orderings like this, which I don't want.  I want the dot files
 |> clustered together (sorting by ASCII value, not skipping any characters):
 |> 
 |>    ./.gnupg/random_seed
 |>    ./HOUSE/BUYSELL/mortgage
 |>    ./HOUSE/TODO/20020824
 |>    ./HUMOR
 |>    ./.ICEauthority
 |
 |What does your 'locale' output say that you have specified for sort
 |order?

> locale
LANG=C
LC_CTYPE=en_US
LC_NUMERIC=en_US
LC_TIME=en_US
LC_COLLATE=en_US
LC_MONETARY=en_US
LC_MESSAGES=en_US
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=

 |Use the 'locale' command to print out what sorting order you have
 |configured in your environment.  If it does not say "C" or "POSIX"
 |then you have configured a non-standard sorting order.  

Ok.  This is the default environment set up on Mandrake 8.2 for US users.

 |Most generally reported is that your vendor set LANG for you
 |to en_US because they think you like it that way.  If you disagree then
 |you might consider filing a bug report with them.
...
 |Here is a standard reply.
 |
 |Bob
 |
 |Please check out the FAQ section on sort.
 |
 |  http://www.gnu.org/software/fileutils/doc/faq/#Sort%20does%20not%20sorting%20in%20normal%20order!

Ok.  Thanks for filling me in on what's going on, Bob!  And thanks for the
FAQ pointer.  I can easily override my environment to make this happen.  I
did search Usenet news via google before mailing this, but didn't come up
with any hits.  This FAQ pointer is exactly what I was hoping to find.
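
For anyone else who lands here, the override amounts to running sort with
LC_ALL set to C, since LC_ALL takes precedence over both LC_COLLATE and
LANG.  From a shell that is just "LC_ALL=C sort".  Below is a minimal
sketch of the same thing driven from Python; it assumes a sort program is
on the PATH, and sort_in_c_locale is a name invented for this example.

```python
import os
import subprocess

def sort_in_c_locale(lines):
    # Run the external sort with LC_ALL=C so that the LC_COLLATE and LANG
    # settings inherited from the environment are ignored.
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["sort"],
        input="\n".join(lines) + "\n",
        capture_output=True,
        text=True,
        env=env,
        check=True,
    )
    return result.stdout.splitlines()

if __name__ == "__main__":
    listing = [
        "./.gnupg/random_seed",
        "./HOUSE/BUYSELL/mortgage",
        "./HOUSE/TODO/20020824",
        "./HUMOR",
        "./.ICEauthority",
    ]
    for line in sort_in_c_locale(listing):
        print(line)
```

Only the child process's environment is changed, so the rest of the
session keeps whatever locale the vendor configured.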

Randall



