lmi
Re: [lmi] Locales


From: Vadim Zeitlin
Subject: Re: [lmi] Locales
Date: Fri, 16 Jan 2015 15:50:27 +0100

On Thu, 15 Jan 2015 04:31:19 +0000 Greg Chicares <address@hidden> wrote:

GC> Back in 2010, cygwin adopted upstream changes that caused
GC> 'grep' behavior to diverge from ASCII, so I followed the
GC> recommendation here:
GC>   https://www.cygwin.com/ml/cygwin/2010-12/msg00088.html
GC> and set
GC>   LC_COLLATE=C.UTF-8
GC> so that [a-z] means {abcdefghijklmnopqrstuvwxyz}.

 I'm not sure what the reason is for setting just LC_COLLATE rather than
LC_ALL here.

GC> Now I'm using a very recent version of cygwin that has this
GC> further upstream change:
GC>   https://www.cygwin.com/ml/cygwin/2014-12/msg00356.html
GC> due to which 'grep' now considers some lmi source files to be
GC> binary. For example:
GC>   grep -w SpecAmtLoad *.?pp
GC>   ...
GC>   Binary file ledger_xml_io.cpp matches

 Ah, interesting, I didn't see this yet.

GC> That 2014 message recommends:
GC>   LC_ALL=C grep
GC> which would override my LC_COLLATE workaround. Apparently I
GC> could just set
GC>   export LC_CTYPE=C
GC> instead, and keep my LC_COLLATE setting; does that seem silly?

 Again, maybe I'm just not being sophisticated enough here, but what's
wrong with "export LC_ALL=C" if you want to work with ASCII only? Of
course, "LC_ALL=C.UTF-8" is even better, but this does rely on having
everything in UTF-8.

 IMHO the best solution would be to just convert the few[*] lmi files using
non-ASCII characters to UTF-8 and set LC_ALL=C.UTF-8.
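
 A minimal sketch of such a conversion follows. The helper name
convert_to_utf8 is hypothetical, and the default source encoding
(ISO-8859-1) is purely an assumption: the thread does not say which legacy
encoding the affected files use, so it would have to be checked first.

```shell
# Convert one file to UTF-8 in place, leaving it alone if it is already valid.
# ASSUMPTION: the legacy encoding defaults to ISO-8859-1; the thread does not
# say which encoding the lmi files actually use.
convert_to_utf8() {
    file=$1
    from=${2:-ISO-8859-1}
    # Already valid UTF-8 (which includes pure ASCII): nothing to do.
    if iconv -f UTF-8 -t UTF-8 "$file" >/dev/null 2>&1; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    if iconv -f "$from" -t UTF-8 "$file" >"$tmp"; then
        # Rewrite the contents rather than replacing the file, so that its
        # permissions are preserved.
        cat "$tmp" >"$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

 It could then be called as "convert_to_utf8 ledger_xml_io.cpp", passing a
second argument to name a different source encoding. Checking validity first
keeps the function safe to run twice on the same file.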

GC> This also works:
GC>   export LANG=C
GC> Is that sillier, or less silly?

 I wouldn't call this silly, but to me it doesn't make much sense to use C
as a fallback (which is what LANG does). AFAICS you want to force the use of
the C locale, and from this point of view setting either LC_ALL (once and
for all) or the individual LC_XXX variables would seem to make more sense.
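
 The precedence behind this reasoning can be sketched in plain shell. The
function effective_locale below is a hypothetical helper, not an existing
tool; it only mirrors the POSIX lookup order (LC_ALL first, then the
category's own variable, then LANG) and ignores variables that are set but
empty.

```shell
# Hypothetical helper: resolve the effective value of one locale category
# following the POSIX precedence LC_ALL > LC_<category> > LANG.
effective_locale() {
    category=$1                        # e.g. LC_CTYPE or LC_COLLATE
    eval "specific=\${$category:-}"    # value of the category variable itself
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"        # LC_ALL overrides everything
    elif [ -n "$specific" ]; then
        printf '%s\n' "$specific"      # then the individual category
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"          # LANG is only the fallback
    else
        printf '%s\n' "(implementation default)"
    fi
}
```

 With LANG=C and LC_COLLATE=C.UTF-8 set, "effective_locale LC_COLLATE" would
report C.UTF-8 and "effective_locale LC_CTYPE" would report C; once LC_ALL=C
is set, both report C, which is why LC_ALL overrides the earlier LC_COLLATE
workaround.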

GC> tasteful practice, and don't want to adopt something that turns
GC> out to be outlandish.

 I am not aware of any standard/best practice concerning setting LC_ALL vs
LANG. The only thing I know is that Linux distributions are supposed to set
LANG by default in their standard rc files and leave LC_XXX for the user to
override if he wishes, but this isn't really relevant here.

 To summarize, I think you should just set LC_ALL=C. But converting
everything to UTF-8 and using LC_ALL=C.UTF-8 would be the best IMHO.

 Regards,
VZ

[*] The few:

        % for f in *.?pp; (iconv -f utf8 -t utf8 $f &>/dev/null || echo $f)
        config.hpp
        global_settings.hpp
        interest_rates.cpp
        ledger_xml_io.cpp
        main_cli.cpp
        null_stream.cpp
        platform_dependent.hpp
        progress_meter.hpp
        round_test.cpp
        skeleton.cpp
        snprintf_test.cpp
        tn_range.hpp
        value_cast_test.cpp
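
The one-liner above uses zsh's short loop syntax. A more portable sketch of
the same check, written for any POSIX shell, could look like this; the
function name list_non_utf8 is hypothetical, and the encoding is spelled
UTF-8 because not every iconv accepts the "utf8" alias.

```shell
# List files that are not valid UTF-8.
# 'list_non_utf8' is a hypothetical helper name, not part of lmi.
list_non_utf8() {
    for f in "$@"; do
        [ -f "$f" ] || continue     # skip unmatched globs and directories
        # iconv exits non-zero when it meets an invalid input sequence
        iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1 || printf '%s\n' "$f"
    done
}
```

It would be invoked as "list_non_utf8 *.?pp". Pure ASCII files pass the
check, since ASCII is a subset of UTF-8, so only files containing other byte
sequences are listed.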
