locale problems in version 2.1
From: David Magagnosc
Subject: locale problems in version 2.1
Date: Fri, 2 May 2003 14:05:09 -0400
Greetings,
This note contains both bug reports and suggestions.
- It seems to me that sorting a file on a field and then joining on
that field should always work. It doesn't. Here are a few lines
from a file to illustrate:
fPl|Femininum Plural|féminin pluriel
f Pl|Femininum Plural|féminin pluriel
fpl|Femininum Plural|féminin pluriel
f pl|Femininum Plural|féminin pluriel
This is allegedly sorted on the first column. Cutting the first column
and sorting interchanges lines 1 and 2 and lines 3 and 4, and then a
join produces only lines 2 and 4. Using the -s flag with the sort
doesn't help.
- I understand that I can set the LC_ALL variable in the environment to C
and that this will suppress the use of the locale. However, this is
too cumbersome, particularly in an installed base of makefiles and the
like that already use sort and join.
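For reference, the LC_ALL workaround mentioned above might look like the following. This is only a sketch: the scratch directory and the file names (table.txt, table.sorted, keys.sorted) are made up for illustration, and it assumes a POSIX shell with sort, cut, and join on the PATH. It replays the steps from the first item with the locale pinned to C, so that sort and join both compare bytes.

```shell
# Scratch directory and file names are hypothetical, chosen for this sketch.
work=$(mktemp -d)

# Pin the locale for every command below so sort and join agree on byte order.
LC_ALL=C
export LC_ALL

# The four sample lines from the report.
printf '%s\n' \
    'fPl|Femininum Plural|féminin pluriel' \
    'f Pl|Femininum Plural|féminin pluriel' \
    'fpl|Femininum Plural|féminin pluriel' \
    'f pl|Femininum Plural|féminin pluriel' > "$work/table.txt"

# Sort the table on its first field, then cut that field out and sort it alone.
sort -t'|' -k1,1 "$work/table.txt" > "$work/table.sorted"
cut -d'|' -f1 "$work/table.sorted" | sort > "$work/keys.sorted"

# Join the keys back against the table on the first field.
joined=$(join -t'|' "$work/keys.sorted" "$work/table.sorted")
printf '%s\n' "$joined"
```

With the C locale in force for all three commands, the join should return every key rather than only some of them. The cost is exactly the one described above: each makefile or script has to set the variable itself.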
- It is easier to change the name of the program being invoked (we use
variables for them, anyway), so letting sort recognize that it was
invoked as sort_no_locale (or something like it) would be relatively
easy to use.
- Alternatively, it would be nice to have an option to suppress the use
of locales.
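Until sort offers something like this itself, the renaming idea can be approximated from outside the program. The sketch below assumes a POSIX shell; sort_no_locale and join_no_locale are hypothetical names, not part of any existing release, and each simply runs the real program with LC_ALL set to C for that one invocation.

```shell
# Hypothetical wrappers: run the real tools with the locale pinned to C.
# The assignment applies only to the single command it prefixes.
sort_no_locale() {
    LC_ALL=C sort "$@"
}

join_no_locale() {
    LC_ALL=C join "$@"
}

# Example: the four keys from the report come out in plain byte order.
wrapped=$(printf '%s\n' 'fPl' 'f Pl' 'fpl' 'f pl' | sort_no_locale)
printf '%s\n' "$wrapped"
```

In a makefile that names its programs through variables, the same one-line body could live in a small executable script of the same name, and the variable would point at that script instead of at sort.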
- Now for a definite bug (whether the first is or not is open to
interpretation). I attempted to build a separate version that did
not use locales by changing HAVE_SETLOCALE to 0 in the config.h
file (I couldn't find a better way). Then I discovered that the
variable hard_LC_COLLATE (normally defined at line 100 in sort) is
then no longer defined, but it is still used (lines 1366, 1523, and 2181).
If I truly had no setlocale, this wouldn't have built in the first
place.
- More suggestions for join: we have a local version with some
useful enhancements. They're currently in version 1.19, but I hope
to apply them to version 2.1, and I could likely send them along
to you. They include:
- failing by default when the files are not sorted. Unsorted input
usually represents an error, and, particularly if the join is buried
in a makefile, the problem may not be even remotely apparent.
- multiple column joins. I don't use this often, but when I do,
it's very, very useful.
- use of all of the sort order modifiers that sort supports, so
that I don't have to re-sort data to do a join if it happens to
be sorted, say, numerically.
- use of cut-style field specifiers with the -o flag. We have
files both with varying numbers of fields and with many fields.
In the first case, specifying all of the fields individually does
not work, and in the second, it is tedious.
All of these involve changes to the options that are backwards
compatible. Valid existing invocations function precisely as before.
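As a rough illustration of the first enhancement, the fail-when-unsorted behaviour can be approximated outside join with sort -c, which checks order without sorting. This is a sketch, not the local patch: checked_join is a hypothetical helper, the '|' delimiter and first-field key are assumptions taken from the sample data above, and the check is stricter than join needs when keys repeat, since sort also compares the rest of the line.

```shell
# checked_join FILE1 FILE2 (hypothetical helper): refuse to join unless both
# inputs are already sorted on field 1, compared as bytes in the C locale.
checked_join() {
    for f in "$1" "$2"; do
        if ! LC_ALL=C sort -c -t'|' -k1,1 "$f" 2>/dev/null; then
            echo "checked_join: $f is not sorted on field 1" >&2
            return 1
        fi
    done
    LC_ALL=C join -t'|' "$1" "$2"
}

# Small demonstration with made-up data in a scratch directory.
demo=$(mktemp -d)
printf 'a|1\nb|2\n' > "$demo/left"
printf 'a|X\nb|Y\n' > "$demo/right"
printf 'b|2\na|1\n' > "$demo/unsorted"

checked_join "$demo/left" "$demo/right"
checked_join "$demo/unsorted" "$demo/right" || echo "join refused"
```

The second call fails with a nonzero status instead of silently producing a partial result, which is the behaviour that matters when the join is buried in a larger build.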
Thanks,
David Magagnosc
Franklin Electronic Publishers