From: Pádraig Brady
Subject: bug#51011: [PATCH] sort: --debug: add warnings about radix and grouping chars
Date: Sun, 10 Oct 2021 18:57:57 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0

On 09/10/2021 23:29, Paul Eggert wrote:
> On 10/9/21 5:00 AM, Pádraig Brady wrote:
>> On 09/10/2021 04:48, Paul Eggert wrote:
>>> 'sort' could determine the group sizes from the locale, and reject
>>> digit strings that are formatted improperly according to the
>>> group-size rules. (Not that I plan to write the code to do that....)
>>
>> Yes I agree that would be better, but not worth it I think
>> as there would still be ambiguity in what was a grouping char
>> and what was a field separator. Also that ambiguity would now
>> vary across locales.
>
> I don't see the ambiguity problem. The field separator is used to
> identify fields; once the fields are identified, the thousands
> separator, decimal point, etc. contribute to numeric comparison in the
> usual way. So it's OK (albeit confusing) for the field separator to be
> '.' or ',' or '-' or '0' or any other character that could be part of
> a number. For example, with 'sort -t 0 -k 2,2n', the digit 0 is not
> part of the numeric field that is compared, and there's no ambiguity
> about that even though 0 is allowed in numbers. The same idea applies
> to 'sort -t , -k 2,2n'.
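
The point about fields being split before any numeric parsing can be checked with a small sketch. The input lines below are made up for illustration, and LC_ALL=C is an assumption added here to pin the locale so the result does not depend on the environment.

```shell
# Separator is the digit 0: 'a05' splits into fields 'a' and '5',
# so only '5' (not '05') reaches the numeric comparison.
out_zero=$(printf 'a05\nb03\n' | LC_ALL=C sort -t 0 -k 2,2n)
printf '%s\n' "$out_zero"

# Separator is ',': 'x,10' splits into 'x' and '10', and the comma
# never takes part in the numeric comparison of field 2.
out_comma=$(printf 'x,10\ny,9\n' | LC_ALL=C sort -t , -k 2,2n)
printf '%s\n' "$out_comma"
```

In both cases the key ends at the field boundary (-k 2,2), which is what keeps the separator out of the compared number; a key such as -k1 that runs to end of line is the case the new warnings are about.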

Indeed. I dropped -t, from my later examples and confused myself.

Attached is the proposed change to add appropriate warnings in this area.
Examples now diagnosed are:

$ printf '0,9\n1,a\n' | sort -nk1 --debug -t, -s
sort: key 1 is numeric and spans multiple fields
sort: field separator ‘,’ is treated as a group separator in numbers
1,a
_
0,9
___

$ printf '1,a\n0,9\n' | LC_ALL=fr_FR.utf8 sort -gk1 --debug -t, -s
sort: key 1 is numeric and spans multiple fields
sort: field separator ‘,’ is treated as a decimal point in numbers
0,9
___
1,a
__

$ printf '1.0\n0.9\n' | sort -s -k1,1g --debug
sort: numbers use ‘.’ as a decimal point in this locale
0.9
___
1.0
___

$ printf '1.0\n0.9\n' | LC_ALL=fr_FR.utf8 sort -s -k1,1g --debug
sort: numbers use ‘,’ as a decimal point in this locale
0.9
_
1.0
_

cheers,
Pádraig

sort--debug-radix.patch
Description: Text Data