bug-gawk

Re: is my loop issue a data conversion bug?


From: Wolfgang Laun
Subject: Re: is my loop issue a data conversion bug?
Date: Thu, 16 Jul 2020 12:44:29 +0200

A for statement

   for( init; condition; increment ) body;

should be considered as syntactic sugar for

   { init;
      while( condition ){
         body;
         increment;
      }
   }

This is the same with all languages where the syntax is derived from C; in
a lifetime of programming in C, awk, Perl 5 and Java I have used for loops
to make while loops more readable, i.e., with all of the control within
for(...) rather than splattered all around, resulting in for loops that do
not just count up or down.
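
As a rough illustration of such a non-counting loop, here is a minimal sketch (the variable s and the sample string "abc" are made up for this example, and plain awk is assumed to behave like gawk here). It runs the for form and its while expansion and checks that both produce the same lines:

```shell
# for form: initialisation, test and step all sit inside for(...)
for_out=$(awk 'BEGIN {
    for (s = "abc"; s != ""; s = substr(s, 2))
        print s
}')

# while form: the step runs at the end of every iteration
while_out=$(awk 'BEGIN {
    s = "abc"
    while (s != "") {
        print s
        s = substr(s, 2)
    }
}')

echo "$for_out"
[ "$for_out" = "$while_out" ] && echo "same output"
```

The two forms only differ when body contains a continue: in the for form, continue still runs the step, while in the hand-written while form it would skip it.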


Wolfgang

On Thu, 16 Jul 2020 at 06:43, Peter Lindgren <ogswd-awk@yahoo.com> wrote:

> I've been looking at various awk references, and I do see clear statements
> that comparisons are only done numerically if both sides are numeric, as in
> TAPL page 44:
>
> "In a comparison expression like
>     x == y
> if both operands have a numeric type, the comparison is numeric;
> otherwise, any numeric operand is coerced to string and the comparison is
> made on the string values."
>
> Or, per TAPL page 45:
>
> "Thus, to force a string comparison between two fields, coerce one field
> to string:
>     $1 "" == $2
>
> To force a numeric comparison, coerce BOTH fields to numeric:
>     $1 + 0 == $2 + 0"
>
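
A small sketch of the two coercions quoted above, using "<" instead of "==" so the difference is visible. The input line "10 9" and the variable names a, b, c are assumptions made up for this example:

```shell
# Fields that look like numbers compare numerically by default;
# appending "" forces a string comparison, adding 0 a numeric one.
cmp_out=$(printf '10 9\n' | awk '{
    a = ($1 < $2)            # both fields look numeric: 10 < 9 is false
    b = ($1 "" < $2 "")      # string comparison: "10" sorts before "9"
    c = ($1 + 0 < $2 + 0)    # forced numeric: 10 < 9 is false
    print a, b, c
}')
echo "$cmp_out"    # prints: 0 1 0
```
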
> However, I also see various more general statements about conversion being
> done "in context", as in "Effective awk Programming", page 84:
>
> "Strings are converted to numbers and numbers are converted to strings, if
> the context of the awk program demands it."
>
> or in TAPL, page 35:
>
>  "A variable has a value that is a string or a number or both. Since the
> type of a variable is not declared, awk infers the type from context. When
> necessary, awk will convert a string value into a numeric one, or vice
> versa."
>
> (Brief pause as correspondent hesitates before contending with people much
> more likely to have served on POSIX committees than himself...)
>
> If a classic for loop isn't a context demanding numeric conversion, what
> is?
>
> See "Effective awk Programming", page 113, where the for statement is
> first described:
>
>     for (initialisation; condition; increment)
>         body
>
> The succeeding text hints that you might possibly do something other than
> numeric operations here, but acknowledges that you wouldn't typically do
> that. I (just now) made up the following for loop using strings:
>
>     for (x="a"; length(x)<35; x = x "a")
>         body
>
> But that's the first time in a lifetime of awk programming that I even
> imagined doing so.
>
> In the overwhelming majority of cases, as in my demo program, where loop
> initialisations and increments are clearly numeric, why not coerce both
> sides of the comparison to numeric as well? Numbers and strings are
> supposed to be so mutable, and that seems like the behavior that most users
> would expect.
>
> So, why not do it that way? (Pesky rules and standards and consistency and
> tradition aside... ;-)
>
>
> I've set myself up for this, take your best shot...
>
>
> On Monday, July 13, 2020, 03:09:17 PM CDT, Davide Brini <dave_br@gmx.com>
> wrote:
>
> On Mon, 13 Jul 2020 18:11:37 +0000 (UTC), Peter Lindgren
> <ogswd-awk@yahoo.com> wrote:
>
> > I hesitate to report this as a bug - maybe it's just some expected
> > behavior I don't understand ("That's not a bug, it's a feature!"). But
> > here goes anyway...
> >
> >[ snip]
>
> >
> > Run them both on the supplied test data file "lendemo.dat" and observe
> > the differences in the outputs. There are comments in the programs
> > highlighting the interesting bits.
>
>
> I wouldn't go so far as saying that it's a feature, but it's not a bug,
> just expected behavior.
>
> A simpler reproducer is:
>
> gawk 'BEGIN{for(i = 1; i <= "9"; i++) print i}'
>
> As explained in the documentation
> (https://www.gnu.org/software/gawk/manual/gawk.html#Variable-Typing), when
> a number (i in the above example, i and j in your code) and a string ("9"
> in the above example, len in your code) are compared, the comparison is a
> string comparison. Every number from 1 to 89, once converted to a string,
> compares less than or equal to "9" under string comparison, whereas "90"
> compares greater, so the loop prints 1 through 89.
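
To make the effect concrete, here is a minimal sketch that counts the iterations of the reproducer and of a variant whose limit is forced numeric with "+ 0". The counter n and the shell variable names are made up for this example, and plain awk is assumed to behave like gawk here:

```shell
# Limit is the string constant "9": the test i <= "9" is a string comparison.
string_count=$(awk 'BEGIN { for (i = 1; i <= "9"; i++) n++; print n }')

# Limit forced numeric with + 0: the test is an ordinary numeric comparison.
numeric_count=$(awk 'BEGIN { for (i = 1; i <= "9" + 0; i++) n++; print n }')

echo "$string_count $numeric_count"    # prints: 89 9
```
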
>
> If you're wondering why "len" is a string in your code, remember that it's
> an array key, and array keys are always strings by definition (see
> https://www.gnu.org/software/gawk/manual/gawk.html#Numeric-Array-Subscripts
> ).
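
A short sketch of the array-key point, assuming a made-up array arr with the single subscript 10. The key comes back from the for-in loop as a string, so comparing it with a number is a string comparison unless it is coerced with "+ 0":

```shell
key_out=$(awk 'BEGIN {
    arr[10] = "x"                 # subscript 10 is stored as the string "10"
    for (k in arr) {
        as_string = (k < 9)       # string comparison: "10" sorts before "9"
        as_number = (k + 0 < 9)   # numeric comparison: 10 < 9 is false
        print as_string, as_number
    }
}')
echo "$key_out"    # prints: 1 0
```
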
>
> HTH
>
> --
> D.
>
>

