coreutils

From: Bob Proulx
Subject: Re: What is necessary and sufficient to let 'sort' sort as if strcmp in C is used?
Date: Sat, 1 Feb 2014 15:03:27 -0700
User-agent: Mutt/1.5.21 (2010-09-15)

Peng Yu wrote:
> man sort says "Set LC_ALL=C to get the traditional sort order that
> uses native byte values."

Yes.

> man comm says "Note, comparisons honor the rules specified by 'LC_COLLATE'."

Yes.  No.  Almost.  It honors LANG if neither LC_COLLATE nor LC_ALL is
set.  It honors LC_COLLATE if LC_ALL is not set.  If LC_ALL is set
then it honors LC_ALL.  LC_ALL also overrides LC_CTYPE.

> My test shows that it seems LC_COLLATE=C is sufficient to make sort
> using native byte values. Is it so?

Yes.  No.  Almost.  LC_ALL overrides LC_COLLATE.  The three variables,
in increasing order of precedence, are LANG, then LC_COLLATE, then
LC_ALL.  LC_ALL also overrides LC_CTYPE.
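
That precedence can be sketched in a few lines of POSIX shell.  This is
only an illustration: effective_collate is a hypothetical helper, not
something provided by coreutils or libc, and the fallback to "C" when
nothing is set is an assumption about the default.

```shell
# Hypothetical helper: print the locale name that governs collation.
# Precedence, highest first: LC_ALL, then LC_COLLATE, then LANG.
# Falls back to "C" when none is set or all are empty (assumed default).
effective_collate() {
  printf '%s\n' "${LC_ALL:-${LC_COLLATE:-${LANG:-C}}}"
}

# Run in a subshell so the caller's environment is left untouched.
(
  unset LC_ALL
  LANG=en_US.UTF-8
  LC_COLLATE=C
  effective_collate     # prints: C
  LC_ALL=en_US.UTF-8
  effective_collate     # prints: en_US.UTF-8
)
```

The second call shows why LC_COLLATE=C alone is not a guarantee: any
LC_ALL already present in the environment wins.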

Setting LC_COLLATE mostly works fine.  I always set this in my environment.

  export LANG=en_US.UTF-8
  export LC_COLLATE=C

But while setting LC_COLLATE=C works for typical western locales there
is concern about others.  What will be the interaction with the Chinese
Big5 character encoding?  It probably won't behave in a desirable
way.  LC_ALL=C is probably required then to override LC_CTYPE.

Therefore while using LC_COLLATE alone works for some character
encodings it can't be definitively stated as working for all cases as
a general rule.  Setting LC_ALL can be stated as a general rule
because LC_ALL overrides LC_CTYPE while LC_COLLATE does not.
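
As a small illustration of the general rule, the following sets LC_ALL=C
for a single sort invocation only.  The four sample input lines are made
up for the example.

```shell
# LC_ALL=C overrides LC_COLLATE, LC_CTYPE and LANG for this one command,
# so sort compares raw byte values, as strcmp would.
printf '%s\n' b A a B | LC_ALL=C sort
# Output, one per line: A B a b  (uppercase bytes 65, 66 sort before 97, 98)
```

Setting the variable on the command line this way leaves the rest of the
session in its normal locale.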

The locale behavior is controlled by libc.  If you have GNU libc
installed then the installed manual will match your system.

  info -f libc 'Locale Categories'

The most recent version is available on the web at the project site:

  http://www.gnu.org/software/libc/manual/html_node/Locale-Categories.html#Locale-Categories

Bob


