locale problems in version 2.1

From: David Magagnosc
Subject: locale problems in version 2.1
Date: Fri, 2 May 2003 14:05:09 -0400


This note contains both bug reports and suggestions.

- It seems to me that sorting a file on a field and then joining on
  that field should always work.  It doesn't.  Here are a few lines
  from a file to illustrate:

        fPl|Femininum Plural|féminin pluriel
        f Pl|Femininum Plural|féminin pluriel
        fpl|Femininum Plural|féminin pluriel
        f pl|Femininum Plural|féminin pluriel

  This is allegedly sorted on the first column.  Cutting the first column
  and sorting interchanges lines 1 and 2 and lines 3 and 4, and then a
  join produces only lines 2 and 4.  Using the -s flag with the sort
  doesn't help.
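
  The steps can be sketched roughly as follows.  The file name and the
  helper function sort_then_join are made up for illustration, and the
  number of lines the join returns depends on the locale in effect and
  on the sort/join version, so no expected output is claimed here.

```shell
#!/bin/sh
# Hypothetical reproduction; abbrev.txt and sort_then_join are illustrative names.
workdir=$(mktemp -d)
cat > "$workdir/abbrev.txt" <<'EOF'
fPl|Femininum Plural|féminin pluriel
f Pl|Femininum Plural|féminin pluriel
fpl|Femininum Plural|féminin pluriel
f pl|Femininum Plural|féminin pluriel
EOF

# Sort the data on field 1, cut field 1 and sort it on its own,
# then join the two on that field, all in whatever locale is in effect.
sort_then_join() {
    sort -t'|' -k1,1 "$1" > "$workdir/data.sorted"
    cut -d'|' -f1 "$1" | sort > "$workdir/keys.sorted"
    join -t'|' "$workdir/keys.sorted" "$workdir/data.sorted"
}

# Count the joined lines; anything below 4 means some keys were dropped.
sort_then_join "$workdir/abbrev.txt" | wc -l
```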

- I understand that I can set the LC_ALL variable in the environment to C
  and that this will suppress the use of the locale.  However, this is
  too cumbersome, particularly in an installed base of makefiles and the
  like that already use sort and join.

- It is easier to change the name of the program being invoked (we use
  variables for them, anyway), so letting sort recognize that it was
  invoked as sort_no_locale (or something like it) would be relatively
  easy to use.
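
  Until something like that exists, the effect can be approximated
  outside of sort itself.  This is only a sketch: sort_no_locale is the
  name suggested above, join_no_locale is a hypothetical companion, and
  both are written as shell functions here rather than as installed
  programs.

```shell
#!/bin/sh
# sort_no_locale: run sort with locale-specific collation switched off.
# The LC_ALL=C assignment applies only to this one sort invocation.
sort_no_locale() {
    LC_ALL=C sort "$@"
}

# Hypothetical companion so that join compares keys the same way.
join_no_locale() {
    LC_ALL=C join "$@"
}

# Callers that already go through a variable only change its value.
SORT=sort_no_locale
printf 'b\nB\na\nA\n' | $SORT
```

  In the C locale the comparison is by byte value, so the upper-case
  letters come out before the lower-case ones.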

- Alternatively, it would be nice to have an option to suppress the use
  of locales.

- Now for a definite bug (whether the first is or not is open to
  interpretation).  I attempted to build a separate version that did
  not use locales by changing HAVE_SETLOCALE to 0 in the config.h
  file (I couldn't find a better way).  Then I discovered that the
  variable hard_LC_COLLATE (defined at line 100 in sort) is no longer
  defined, but it is still used (lines 1366, 1523, and 2181).
  If I truly had no setlocale, this wouldn't have built in the first
  place.

- More suggestions for join:  we have a local version with some
  useful enhancements.  They're currently in version 1.19, but I hope
  to apply them to version 2.1, and I could likely send them along
  to you.  They include:
     - failing by default when the files are not sorted.  Unsorted input
       usually represents an error, and one that may not be even
       remotely apparent, particularly if the join is buried in a
       makefile.
     - multiple column joins.  I don't use this often, but when I do,
       it's very, very useful.
     - use of all of the sort order modifiers that sort supports, so
       that I don't have to re-sort data to do a join if it happens to
       be sorted, say, numerically.
     - use of cut-style field specifiers with the -o flag.  We have
       files both with varying numbers of fields and with many fields.
       In the first case, specifying all of the fields individually does
       not work, and in the second, it is tedious.
  All of these involve changes to the options that are backwards
  compatible.  Valid existing invocations function precisely as before.
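
  To illustrate the first of these without the patch, a wrapper along
  the following lines gives a similar effect.  It is a sketch only and
  not the local implementation: checked_join is a hypothetical name, and
  the '|' delimiter, the first-field key, and the C locale are
  assumptions taken from the example earlier in this note.

```shell
#!/bin/sh
# checked_join FILE1 FILE2: hypothetical wrapper, not the local join patch.
# Refuses to join unless both inputs are sorted on the first
# '|'-delimited field when compared byte by byte (C locale).
checked_join() {
    for f in "$1" "$2"; do
        # sort -c only checks the order; it exits non-zero on disorder.
        if ! cut -d'|' -f1 "$f" | LC_ALL=C sort -c 2>/dev/null; then
            echo "checked_join: $f is not sorted on field 1" >&2
            return 1
        fi
    done
    LC_ALL=C join -t'|' "$1" "$2"
}
```

  Inside a makefile the non-zero return stops the build at the join
  instead of letting a short result pass unnoticed.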


David Magagnosc
Franklin Electronic Publishers
