Re: [lmi] Locales
From: Vadim Zeitlin
Subject: Re: [lmi] Locales
Date: Fri, 16 Jan 2015 15:50:27 +0100
On Thu, 15 Jan 2015 04:31:19 +0000 Greg Chicares <address@hidden> wrote:
GC> Back in 2010, cygwin adopted upstream changes that caused
GC> 'grep' behavior to diverge from ASCII, so I followed the
GC> recommendation here:
GC> https://www.cygwin.com/ml/cygwin/2010-12/msg00088.html
GC> and set
GC> LC_COLLATE=C.UTF-8
GC> so that [a-z] means {abcdefghijklmnopqrstuvwxyz}.
I'm not sure what the reason is for setting just LC_COLLATE rather than
LC_ALL here?
GC> Now I'm using a very recent version of cygwin that has this
GC> further upstream change:
GC> https://www.cygwin.com/ml/cygwin/2014-12/msg00356.html
GC> due to which 'grep' now considers some lmi source files to be
GC> binary. For example:
GC> grep -w SpecAmtLoad *.?pp
GC> ...
GC> Binary file ledger_xml_io.cpp matches
Ah, interesting, I didn't see this yet.
GC> That 2014 message recommends:
GC> LC_ALL=C grep
GC> which would override my LC_COLLATE workaround. Apparently I
GC> could just set
GC> export LC_CTYPE=C
GC> instead, and keep my LC_COLLATE setting; does that seem silly?
Again, maybe I'm just not being sophisticated enough here, but what's
wrong with "export LC_ALL=C" if you want to work with ASCII only? Of
course, "LC_ALL=C.UTF-8" is even better, but this does rely on having
everything in UTF-8.
IMHO the best solution would be to just convert the few[*] lmi files using
non-ASCII characters to UTF-8 and set LC_ALL=C.UTF-8.
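A minimal sketch of such a conversion with iconv follows. The helper name
convert_to_utf8 is hypothetical, and the default source encoding
(ISO-8859-1) is only an assumption: the actual encoding of each file would
have to be checked before running anything like this.

```shell
# Hypothetical helper: convert one file to UTF-8 in place.
# The source encoding defaults to ISO-8859-1, which is an assumption and
# must be verified for each file; pass another encoding as the 2nd argument.
convert_to_utf8() {
    file=$1
    from=${2:-ISO-8859-1}
    tmp=$(mktemp) || return 1
    if iconv -f "$from" -t UTF-8 "$file" >"$tmp"; then
        # Overwrite the contents rather than the file itself, so that the
        # original permissions are kept.
        cat "$tmp" >"$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

This should only be run on files that are not valid UTF-8 already, as
converting a UTF-8 file from ISO-8859-1 would double-encode its non-ASCII
characters.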
GC> This also works:
GC> export LANG=C
GC> Is that sillier, or less silly?
I wouldn't call this silly, but using C as a fallback (which is what LANG
does) doesn't make much sense to me. AFAICS you want to force the use of
the C locale, and from this point of view setting either LC_ALL (once and
for all) or the individual LC_XXX variables would seem to make more sense.
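To illustrate the precedence, here is a small sketch that resolves the
effective setting of one category using the POSIX rules: a non-empty LC_ALL
wins, then the category's own LC_XXX variable, then LANG. The function name
effective_locale is hypothetical, and the "default" string merely stands
for the implementation-defined locale used when nothing is set.

```shell
# Hypothetical helper: print the locale in effect for one category
# (e.g. LC_COLLATE), following the POSIX precedence
# LC_ALL > LC_XXX > LANG. Empty values are treated as unset.
effective_locale() {
    category=$1
    eval "specific=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$specific" ]; then
        printf '%s\n' "$specific"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Placeholder for the implementation-defined default locale.
        printf '%s\n' "default"
    fi
}
```

With LANG=C and LC_COLLATE=C.UTF-8, "effective_locale LC_COLLATE" prints
C.UTF-8 while "effective_locale LC_CTYPE" falls back to C; once LC_ALL=C
is set, both print C.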
GC> tasteful practice, and don't want to adopt something that turns
GC> out to be outlandish.
I am not aware of any standard/best practice concerning setting LC_ALL vs
LANG. The only thing I know is that Linux distributions are supposed to set
LANG by default in their standard rc files and leave LC_XXX for the user to
override if he wishes, but this isn't really relevant here.
To summarize, I think you should just set LC_ALL=C. But converting
everything to UTF-8 and using LC_ALL=C.UTF-8 would be the best IMHO.
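If exporting LC_ALL for the whole session seems too heavy-handed, the
setting can also be limited to grep alone. This is just a sketch, and the
wrapper name cgrep is made up:

```shell
# Hypothetical wrapper: run grep in the C locale without changing the
# locale of the calling shell. In the C locale bytes above 127 are not
# encoding errors, so they alone do not make grep treat a file as binary.
cgrep() {
    LC_ALL=C grep "$@"
}
```

It would be used exactly like grep, e.g. "cgrep -w SpecAmtLoad *.?pp".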
Regards,
VZ
[*] The few:
% for f in *.?pp; (iconv -f utf8 -t utf8 $f &>/dev/null || echo $f)
config.hpp
global_settings.hpp
interest_rates.cpp
ledger_xml_io.cpp
main_cli.cpp
null_stream.cpp
platform_dependent.hpp
progress_meter.hpp
round_test.cpp
skeleton.cpp
snprintf_test.cpp
tn_range.hpp
value_cast_test.cpp
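The one-liner above uses the short form of the zsh "for" loop. A portable
sh version of the same check could look like the following sketch, where
the function name list_non_utf8 is hypothetical:

```shell
# Hypothetical helper: print the names of the given files that are not
# valid UTF-8, relying on iconv failing when it meets an invalid sequence.
list_non_utf8() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1 || printf '%s\n' "$f"
    done
}
# Example: list_non_utf8 *.?pp
```

Note that this only detects files that are not valid UTF-8; a file using
non-ASCII characters which is already encoded in UTF-8 is not listed.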