From: | Paul Eggert |
Subject: | bug#51011: [GNU sort] Numerical sort with delimiter may be broken (bug) |
Date: | Sat, 9 Oct 2021 15:29:02 -0700 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 |
On 10/9/21 5:00 AM, Pádraig Brady wrote:
> On 09/10/2021 04:48, Paul Eggert wrote:
>> 'sort' could determine the group sizes from the locale, and reject digit strings that are formatted improperly according to the group-size rules. (Not that I plan to write the code to do that....)
>
> Yes I agree that would be better, but not worth it I think as there would still be ambiguity in what was a grouping char and what was a field separator. Also that ambiguity would now vary across locales.

I don't see the ambiguity problem. The field separator is used to identify fields; once the fields are identified, the thousands separator, decimal point, etc. contribute to numeric comparison in the usual way. So it's OK (albeit confusing) for the field separator to be '.' or ',' or '-' or '0' or any other character that could be part of a number.
For example, with 'sort -t 0 -k 2,2n', the digit 0 is not part of the numeric field that is compared, and there's no ambiguity about that even though 0 is allowed in numbers. The same idea applies to 'sort -t , -k 2,2n'.
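To make the two cases above concrete, here is a small sketch. The input lines are made up for illustration, and LC_ALL=C is set only so that the result does not depend on the locale's grouping rules; the variable names are arbitrary.

```shell
# Field separator is the digit 0, so "x0901" splits into "x", "9", "1".
# Only field 2 ("9", "5", "7") takes part in the numeric comparison.
out_zero=$(printf 'x0901\ny0502\nz0703\n' | LC_ALL=C sort -t 0 -k 2,2n)
printf '%s\n' "$out_zero"

# Same idea with a comma separator: field 2 is "30", "4", "200".
out_comma=$(printf 'a,30,x\nb,4,y\nc,200,z\n' | LC_ALL=C sort -t , -k 2,2n)
printf '%s\n' "$out_comma"
```

In both runs the lines should come out ordered by the second field alone (y0502, z0703, x0901 and then b,4,y, a,30,x, c,200,z), since the separator is consumed when the fields are identified and never reaches the numeric comparison.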