Re: [Question] Is this a bug?
From: Sedapnya Tidur
Subject: Re: [Question] Is this a bug?
Date: Sun, 9 Jul 2023 20:15:20 +0800
If I could edit my initial message, I would. Since I cannot, let me restate
my questions more clearly here (hopefully).
[Question 1]
How can I use "\x5B" in this case? Is it even possible, or is "\[" my only
option?
$ gawk 'BEGIN { print match("a[", /^[^[]\x5B/) }'
gawk: cmd. line:1: error: Invalid regular expression: /^[^[]\/
$ gawk 'BEGIN { print match("a[", /^[^[]\\x5B/) }'
0
$ gawk 'BEGIN { print match("a[", /^[^[]\\\x5B/) }'
gawk: cmd. line:1: error: Invalid regular expression: /^[^[]\\\/
$ gawk 'BEGIN { print match("a[", /^[^[]\\\\x5B/) }'
0
$ gawk 'BEGIN { print match("a[", /^[^[]\\\\\x5B/) }'
gawk: cmd. line:1: error: Invalid regular expression: /^[^[]\\\\\/
$ gawk 'BEGIN { print match("a[", /^[^[]\\\\\\x5B/) }'
0
[Question 2]
Which result is correct?
$ gawk 'BEGIN { print match("", "\\\x27") }'
1
$ (b|m|n)awk 'BEGIN { print match("", "\\\x27") }'
0
*b for busybox awk.
[Question 3]
I am not sure whether this one is a bug or just a mistake on my part.
$ gawk 'BEGIN {
    str = ".\"a\\\"\".\"b\""
    match(str, /^.("(\\.|[^"])*"|[^".])*/, map)
    print substr(str, RSTART, RLENGTH)
    print ""
    for (key in map) print key, map[key]
}'
."a\""."b
0start 1
0length 9
1start 9
1length 1
2start 7
2length 1
I would expect something like this:
."a\""
0start 1
0length 6
1start 2
1length 5
Thanks.
On Sun, Jul 9, 2023, 3:47 PM <arnold@skeeve.com> wrote:
> This is exactly the right answer.
>
> Much thanks,
>
> Arnold
>
> Wolfgang Laun <wolfgang.laun@gmail.com> wrote:
>
> > grep with -P mimics Perl down to the least detail, i.e., the way Perl
> > parses any input text. Thus, '\x5B' is not the same as '[' but is treated
> > as '\[', an escaped bracket. Deep in the Perl 5 documentation on backslash
> > in regular expressions you can find this paragraph: *Note that a character
> > expressed as one of these* [hexadecimal] *escapes is considered a
> > character without special meaning by the regex engine, and will match "as
> > is".* (There is a similar paragraph on octal escapes.)
> >
> > (g)awk processes string literals and literal regular expressions as most
> > compilers do, converting hexadecimal escapes to characters. Therefore,
> > "\x5B" becomes "[" and is indistinguishable from a "[" in the input.
> >
> > Wolfgang
> >
> >
> > On Fri, 7 Jul 2023 at 22:37, Sedapnya Tidur <sedapnyatidur@gmail.com> wrote:
> >
> > > $ gawk 'BEGIN { print match("a[", /^[^[]\x5B/) }'
> > > gawk: cmd. line:1: error: Invalid regular expression: /^[^[]\/
> > >
> > > $ gawk -V
> > > GNU Awk 5.2.2, API 3.2, (GNU MPFR 4.2.0-p9, GNU MP 6.2.1)
> > >
> > > $ grep -Po --color '^[^[]\x5B' <<< 'a[xxx'
> > > a[
> > >
> >
> >
> > --
> > Wolfgang Laun
>
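
A small sketch of what Wolfgang's explanation quoted above implies in
practice: since the hexadecimal escape is turned into a bare "[" before the
regular expression is compiled, a literal "[" has to be written either as
"\[" or as the bracket expression "[[]". The second spelling, the use of
plain awk (on the assumption that any POSIX awk behaves the same here), and
the variable names are illustrative and not taken from the thread.

```shell
# Illustrative check, not from the thread: two spellings of a literal "[".
# Assumption: a POSIX-compatible awk (gawk, mawk, busybox awk, ...) is on PATH.

# Spelling 1: backslash-escaped bracket.
escaped=$(awk 'BEGIN { match("a[", /^[^[]\[/); print RSTART, RLENGTH }')

# Spelling 2: bracket expression containing only "[".
bracket=$(awk 'BEGIN { match("a[", /^[^[][[]/); print RSTART, RLENGTH }')

echo "$escaped"
echo "$bracket"
```

Both commands should print "1 2", that is, a match starting at position 1
with length 2.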