Re: seq - Suggestion: Define dots as standard decimal separator, using locale as optional
From: Pádraig Brady
Subject: Re: seq - Suggestion: Define dots as standard decimal separator, using locale as optional
Date: Sat, 2 Feb 2019 19:27:00 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
On 02/02/19 17:32, Pádraig Brady wrote:
> On 01/02/19 16:03, Felix Neuper wrote:
>> Hi,
>>
>>
>> Recently I stumbled upon seq's behaviour of using the floating point
>> separator as defined in the current locale.
>>
>> Regarding portability of scripts and standard practice in most data
>> processing environments, I would kindly suggest to define usage of dots
>> as standard behaviour and loading locale settings only when requested
>> via an option (e.g. -l, --locale).
>>
>> Alternatively one could allow the -f option to define the separator ( -f
>> %1.2f still gave commas for a German locale) or base the output on the
>> input format ( the input issue has been addressed before:
>> https://lists.gnu.org/archive/html/bug-coreutils/2014-02/msg00044.html ).
>>
>> Unfortunately, the locale dependency of seq's behaviour is also not
>> mentioned in any manual, which makes such errors hard to track down.
>
> There are many aspects of these utilities that are dependent on locale
> settings.
> Adding another way to control the locale would just confuse things at
> this stage IMHO. What you want is to set LC_NUMERIC=C when your script
> is dependent on that format.
>
>> Apart from that I also noticed odd behaviour with bad locale settings:
>> With LANG=en_US (erroneous) and LC_NUMERIC=de_DE.UTF-8, output format is
>> mixed in specific cases
>>
>> seq 0.1 0.2 1.3
>> 0.1
>> 0.3
>> 0.5
>> 0.7
>> 0.9
>> 1.1
>> 1,3
>>
>> (note the comma in the last line)
>
> Well that's a bug.
> The first set of numbers are output by printf(3) after:
> setlocale (LC_ALL, "")
> and the last one after
> setlocale (LC_NUMERIC, "")
>
> Now your first set of numbers should be outputting ',' as the decimal point.
> My glibc-2.24 system does at least. Can you give the output from the locale
> command so that we can double check the values of all env vars that might
> be significant here. Also it would be useful to show the specific values for
> these env vars:
> LANGUAGE, LC_ALL, LC_NUMERIC, LANG
>
> It sounds like on your system that LANG takes precedence in the first case,
> but not in the second. That's a bug (that we might be able to work around
> if deemed widespread enough). I know also that OpenBSD can only set some
> locales from LC_ALL, so perhaps doing an explicit setlocale (LC_NUMERIC, "")
> at startup is appropriate to handle these systems.
>
> For the record, here's the setlocale output on my system:
>
> $ LANG=en_US LC_NUMERIC=de_DE.UTF-8 ltrace -a40 -e setlocale src/seq 0.1 0.2 1.3 >/dev/null
> seq->setlocale(LC_ALL, "") = "LC_CTYPE=en_US;LC_NUMERIC=de_DE."...
> seq->setlocale(LC_NUMERIC, "C") = "C"
> seq->setlocale(LC_NUMERIC, "") = "de_DE.UTF-8"

Ah, I see: en_US isn't a valid locale on your system at all.
By setting an invalid LANG I was able to reproduce the mixed output,
and the attached patch should address this inconsistency.
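
For illustration only (this is not the attached patch), here is a minimal C sketch of the setlocale calls discussed above. It assumes that setlocale (LC_ALL, "") fails as a whole when LANG names a locale that is not installed, leaving the "C" locale in effect, while a later setlocale (LC_NUMERIC, "") can still succeed because LC_NUMERIC on its own is valid. All three helper names are hypothetical and do not come from seq's source.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: does the locale currently in effect use '.' as
   its decimal point?  localeconv () reflects the active LC_NUMERIC.  */
static int decimal_point_is_dot (void)
{
  return strcmp (localeconv ()->decimal_point, ".") == 0;
}

/* Hypothetical helper: initialise the locale from the environment.
   Returns 1 if setlocale (LC_ALL, "") succeeded and 0 if it failed
   (assumed to happen when LANG names an unavailable locale).
   LC_NUMERIC is then requested explicitly at startup, so that the
   decimal point does not change part-way through the output when
   that category can be resolved on its own.  */
static int init_locale_from_env (void)
{
  int all_ok = setlocale (LC_ALL, "") != NULL;
  setlocale (LC_NUMERIC, "");
  return all_ok;
}

/* Hypothetical helper: format V with one fractional digit under the
   LC_NUMERIC locale NAME.  Returns the length reported by snprintf,
   or -1 if the locale is unavailable.  */
static int format_in_numeric_locale (const char *name, double v,
                                     char *buf, size_t n)
{
  assert (name != NULL && buf != NULL);
  if (setlocale (LC_NUMERIC, name) == NULL)
    return -1;
  return snprintf (buf, n, "%.1f", v);
}
```

With these helpers, format_in_numeric_locale ("C", 1.3, buf, sizeof buf) writes "1.3", whereas passing "de_DE.UTF-8" should write "1,3" where that locale is installed and return -1 where it is not. A script that depends on the dot can get the same effect from outside by running seq with LC_NUMERIC=C, as suggested earlier in the thread.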
cheers,
Pádraig
seq-locale-point.patch
Description: Text Data