bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#51011: [PATCH] sort: --debug: add warnings about radix and grouping


From: Paul Eggert
Subject: bug#51011: [PATCH] sort: --debug: add warnings about radix and grouping chars
Date: Sun, 10 Oct 2021 16:34:06 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0

The warnings look good, except that this one:

   $ printf '1.0\n0.9\n' | sort -s -k1,1g --debug
   sort: numbers use ‘.’ as a decimal point in this locale
   0.9
   ___
   1.0
   ___

seems overkill if we're in the C locale.

Also, shouldn't similar diagnostics be generated if the field separator is '-', or '+', or a digit in the current locale?


+  if (numeric_field_span)
+    {
+      char sep_string[2] = { 0, };
+      sep_string[0] = thousands_sep;
+      if ((tab == TAB_DEFAULT
+           && (isblank (to_uchar (thousands_sep))))
+          || tab == thousands_sep)
+        {
+          error (0, 0,
+                 _("field separator %s is treated as a "
+                   "group separator in numbers"),
+                 quote (sep_string));
+          number_locale_warned = true;
+        }
+    }

This code brought it to my attention that the GNU 'sort' has had a longstanding bug (in code that I wrote long ago - sorry!) in that thousands_sep is either -1 or an unsigned char converted to int, and this doesn't work in some unusual cases. I installed the attached patch to fix that bug, and I vaguely suspect that it fixes similar bugs in GNU 'test' and GNU 'expr'. Good thing you brought it to my attention. (Sorry, I'm too lazy and/or time-pressed and/or overconfident to write test cases....)

Anyway, with that patch installed, TAB and THOUSANDS_SEP can both be CHAR_MAX + 1 so the above code needs to be twiddled. Also, we can assume C99. So, something like following (pardon Thunderbird's line wrap):

  if (numeric_field_span
      && (tab == TAB_DEFAULT
          ? thousands_char != NON_CHAR && isblank (to_uchar (thousands_sep))
          : tab == thousands_sep))
    {
      error (0, 0,
             _("field separator %s is treated as a group separator in numbers"),
             quote (((char []) {thousands_sep, 0})));
      number_locale_warned = true;
    }

with a similar replacement to the decimal_point code.

Attachment: 0001-sort-fix-unlikely-bug-when-377-0.patch
Description: Text Data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]