bug-gawk

Re: [Question] Is this a bug?


From: Sedapnya Tidur
Subject: Re: [Question] Is this a bug?
Date: Sun, 9 Jul 2023 20:15:20 +0800

Well, if I could just edit the initial message I would. So let me make it
clear here (hopefully).

[Question 1]
How do I use "\x5B" in this case? Is it even possible? Do I have no other
choice than using "\["?

$ gawk 'BEGIN { print match("a[", /^[^[]\x5B/) }'
gawk: cmd. line:1: error: Invalid regular expression: /^[^[]\/
$ gawk 'BEGIN { print match("a[", /^[^[]\\x5B/) }'
0
$ gawk 'BEGIN { print match("a[", /^[^[]\\\x5B/) }'
gawk: cmd. line:1: error: Invalid regular expression: /^[^[]\\\/
$ gawk 'BEGIN { print match("a[", /^[^[]\\\\x5B/) }'
0
$ gawk 'BEGIN { print match("a[", /^[^[]\\\\\x5B/) }'
gawk: cmd. line:1: error: Invalid regular expression: /^[^[]\\\\\/
$ gawk 'BEGIN { print match("a[", /^[^[]\\\\\\x5B/) }'
0
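
As the reply quoted below explains, gawk converts "\x5B" to "[" before the regexp is compiled, so the pattern ends in an unterminated bracket expression. A minimal sketch of two spellings that avoid this, assuming either gawk or another POSIX awk is on PATH (the AWK variable and the two result names are only illustrative):

```shell
# Assumption: use gawk when installed, otherwise whatever awk is on PATH.
AWK=$(command -v gawk || command -v awk)

# Escaped bracket: the backslash keeps "[" from opening a bracket expression.
escaped=$("$AWK" 'BEGIN { print match("a[", /^[^[]\[/) }')

# Bracket expression that contains only "[".
bracketed=$("$AWK" 'BEGIN { print match("a[", /^[^[][[]/) }')

echo "$escaped $bracketed"
```

Both forms should report a match at position 1. The second avoids backslashes entirely, which can help when the regexp has to pass through several layers of quoting.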

[Question 2]
Which one is correct?
$ gawk 'BEGIN { print match("", "\\\x27") }'
1
$ (b|m|n)awk 'BEGIN { print match("", "\\\x27") }'
0

*b for busybox awk.

[Question 3]
I am not sure whether this one is a bug or just my own stupidity.
$ gawk 'BEGIN { str=".\"a\\\"\".\"b\""; match(str, /^.("(\\.|[^"])*"|[^".])*/, map); print substr(str,RSTART,RLENGTH); print ""; for(key in map) print key, map[key] }'
."a\""."b

0start 1
0length 9
1start 9
1length 1
2start 7
2length 1

I would expect something like this:
."a\""

0start 1
0length 6
1start 2
1length 5
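
One possible explanation, offered as a sketch rather than a verdict: [^"] also matches a backslash, so the regexp can read `"a\"` as a complete quoted string, then `"."` as a second one, then `b`, and the longest overall match is 9 characters. That reading is consistent with the reported 1start/2start values. The variant below is an assumption, not something from the thread: it excludes the backslash from the inner bracket expression so that a backslash can only be consumed by the `\\.` alternative.

```shell
# Assumption: use gawk when installed, otherwise whatever awk is on PATH.
AWK=$(command -v gawk || command -v awk)

# Regexp from Question 3, unchanged: [^"] is free to match a backslash.
original=$("$AWK" 'BEGIN { str=".\"a\\\"\".\"b\""; match(str, /^.("(\\.|[^"])*"|[^".])*/); print substr(str, RSTART, RLENGTH) }')

# Hypothetical variant: [^"\\] excludes the backslash as well.
variant=$("$AWK" 'BEGIN { str=".\"a\\\"\".\"b\""; match(str, /^.("(\\.|[^"\\])*"|[^".])*/); print substr(str, RSTART, RLENGTH) }')

printf '%s\n' "$original" "$variant"
```

With the variant, the match should stop after the first quoted string, which is the result expected above.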

Thanks.

On Sun, Jul 9, 2023, 3:47 PM <arnold@skeeve.com> wrote:

> This is exactly the right answer.
>
> Much thanks,
>
> Arnold
>
> Wolfgang Laun <wolfgang.laun@gmail.com> wrote:
>
> > grep with -P mimics Perl down to the least detail, i.e., the way Perl
> > parses any input text. Thus, '\x5B' is not the same as '[' but is treated
> > as '\[', an escaped bracket. Deep in the Perl 5 documentation on backslash
> > in regular expressions you can find this paragraph: *Note that a character
> > expressed as one of these* [hexadecimal] *escapes is considered a
> > character without special meaning by the regex engine, and will match "as
> > is".* (There is a similar paragraph on octal escapes.)
> >
> > (g)awk processes string literals and literal regular expressions as most
> > compilers do, converting hexadecimal escapes to characters. Therefore,
> > "\x5B" becomes "[" and is indistinguishable from a "[" in the input.
> >
> > Wolfgang
> >
> >
> > On Fri, 7 Jul 2023 at 22:37, Sedapnya Tidur <sedapnyatidur@gmail.com> wrote:
> >
> > > $ gawk 'BEGIN { print match("a[", /^[^[]\x5B/) }'
> > > gawk: cmd. line:1: error: Invalid regular expression: /^[^[]\/
> > >
> > > $ gawk -V
> > > GNU Awk 5.2.2, API 3.2, (GNU MPFR 4.2.0-p9, GNU MP 6.2.1)
> > >
> > > $ grep -Po --color '^[^[]\x5B' <<< 'a[xxx'
> > > a[
> > >
> >
> >
> > --
> > Wolfgang Laun
>

