bug-gawk

Re: behaviour in regex comparison


From: arnold
Subject: Re: behaviour in regex comparison
Date: Thu, 16 Nov 2023 04:55:44 -0700
User-agent: Heirloom mailx 12.5 7/5/10

Hi.

A patch is attached. Thanks for the report.  Apologies if this
gets sent out twice.

Arnold

"*" <cl2ap0101@gmail.com> wrote:

> Configuration Information [Automatically generated, do not change]:
> Machine: x86_64
> OS: linux-gnu
> Compiler: gcc
> Compilation CFLAGS: -g -O2 -DNDEBUG
> uname output: Linux orange 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4
> 18:03:25 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
> Machine Type: x86_64-pc-linux-gnu
>
> Gawk Version: 5.3.0
>
> Attestation 1:
> I have read https://www.gnu.org/software/gawk/manual/html_node/Bugs.html.
> Yes
>
> Attestation 2:
> I have not modified the sources before building gawk.
> True
>
> Description:
> I noticed a change in behaviour between the default system gawk (5.1.0)
> and 5.3.0 when comparing regexes. In previous versions, if I understand
> correctly, regexes were compared as strings, following the rules described
> in the manual for string comparison. With version 5.3.0, it seems that
> variables of regex type always compare as equal (using the operators `==`,
> `<=` or `>=`) _but_ always compare as false when using `<` or `>`.
> I know that using these operators with variables typed as regex may be
> inappropriate and, indeed, I can't find any reference to it in the
> manual. Still, I noticed this change and wondered whether it was made
> on purpose (which would make perfect sense, btw).
> Repeat-By:
> crap0101@orange:~/test$ cat awk_re.awk
> BEGIN {
>     # x is a strongly typed regexp; y mixes regexp and string values
>     x = @/bar/
>     y[0] = @/bar/
>     y[1] = @/baz/
>     y[2] = "bar"
>     y[3] = "baz"
>     printf("* set: x=@/%s/\n", x)
>     for (i in y) {
>         # Show regexp values as @/.../ and plain strings as-is
>         yfmt = typeof(y[i]) == "regexp" ? sprintf("@/%s/", y[i]) : "" y[i]
>         printf("* set: y=%s\n", yfmt)
>         printf("@/%s/ == %s --> %d\n", x, yfmt, x == y[i])
>         printf("@/%s/ ~  %s --> %d\n", x, yfmt, x ~ y[i])
>         printf("@/%s/ <= %s --> %d\n", x, yfmt, x <= y[i])
>         printf("@/%s/ <  %s --> %d\n", x, yfmt, x < y[i])
>     }
> }
> crap0101@orange:~/test$ awk -f awk_re.awk
> * set: x=@/bar/
> * set: y=@/bar/
> @/bar/ == @/bar/ --> 1
> @/bar/ ~  @/bar/ --> 1
> @/bar/ <= @/bar/ --> 1
> @/bar/ <  @/bar/ --> 0
> * set: y=@/baz/
> @/bar/ == @/baz/ --> 0
> @/bar/ ~  @/baz/ --> 0
> @/bar/ <= @/baz/ --> 1
> @/bar/ <  @/baz/ --> 1
> * set: y=bar
> @/bar/ == bar --> 1
> @/bar/ ~  bar --> 1
> @/bar/ <= bar --> 1
> @/bar/ <  bar --> 0
> * set: y=baz
> @/bar/ == baz --> 0
> @/bar/ ~  baz --> 0
> @/bar/ <= baz --> 1
> @/bar/ <  baz --> 1
> crap0101@orange:~/test$ AWK/gawk/gawk -f awk_re.awk
> * set: x=@/bar/
> * set: y=@/bar/
> @/bar/ == @/bar/ --> 1
> @/bar/ ~  @/bar/ --> 1
> @/bar/ <= @/bar/ --> 1
> @/bar/ <  @/bar/ --> 0
> * set: y=@/baz/
> @/bar/ == @/baz/ --> 1
> @/bar/ ~  @/baz/ --> 0
> @/bar/ <= @/baz/ --> 1
> @/bar/ <  @/baz/ --> 0
> * set: y=bar
> @/bar/ == bar --> 1
> @/bar/ ~  bar --> 1
> @/bar/ <= bar --> 1
> @/bar/ <  bar --> 0
> * set: y=baz
> @/bar/ == baz --> 0
> @/bar/ ~  baz --> 0
> @/bar/ <= baz --> 1
> @/bar/ <  baz --> 1
> crap0101@orange:~/test$ awk -f awk_re.awk > /tmp/a1
> crap0101@orange:~/test$ AWK/gawk/gawk -f awk_re.awk > /tmp/a2
> crap0101@orange:~/test$ diff -Naur /tmp/a1 /tmp/a2
> --- /tmp/a1 2023-11-15 22:33:06.863658041 +0100
> +++ /tmp/a2 2023-11-15 22:33:14.399652253 +0100
> @@ -5,10 +5,10 @@
> @/bar/ <= @/bar/ --> 1
> @/bar/ <  @/bar/ --> 0
> * set: y=@/baz/
> -@/bar/ == @/baz/ --> 0
> +@/bar/ == @/baz/ --> 1
> @/bar/ ~  @/baz/ --> 0
> @/bar/ <= @/baz/ --> 1
> -@/bar/ <  @/baz/ --> 1
> +@/bar/ <  @/baz/ --> 0
> * set: y=bar
> @/bar/ == bar --> 1
> @/bar/ ~  bar --> 1
> crap0101@orange:~/test$ AWK/gawk/gawk --version | head -1
> GNU Awk 5.3.0, API 4.0, PMA Avon 8-g1
> crap0101@orange:~/test$ awk --version | head -1
> GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0, GNU MP 6.2.1)
>
>
> Fix:
> As said, I'm not sure it's a bug, so I'm not sure it needs a fix.
> What I found a bit confusing is the `<=` vs `<` behaviour, but I
> don't know if there is an easy fix (or whether one is needed).

Attachment: regex-fix.diff
Description: Text document

