Re: documentation around RE repetition metachars may need clarification
From: Andrew J. Schorr
Subject: Re: documentation around RE repetition metachars may need clarification
Date: Mon, 22 May 2023 11:06:35 -0400
User-agent: Mutt/1.5.21 (2010-09-15)
Hi,
On Sun, May 21, 2023 at 09:46:50AM -0500, Ed Morton wrote:
> In the gawk manual under
> https://www.gnu.org/software/gawk/manual/html_node/Regexp-Operator-Details.html
> we have this statement:
>
> >In POSIX awk and gawk, the ‘*’, ‘+’, and ‘?’ operators stand
> >for themselves when there is nothing in the regexp that precedes
> >them.
>
> while in the POSIX spec under
> https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_03
> we have this statement:
>
> >*+?{
> > The <asterisk>, <plus-sign>, <question-mark>, and <left-brace>
> > shall be special except when used in a bracket expression (see RE
> > Bracket Expression
> > <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05>).
> > Any of the following uses produce undefined results:
> >
> > * If these characters appear first in an ERE
>
> So the gawk manual statement says that /+foo/ in any POSIX awk will
> match the literal string "+foo" while the POSIX spec statement says
> it's undefined behavior.
>
> Should the gawk manual be tweaked to clarify/explain what it
> currently says about POSIX awk since it apparently contradicts the
> POSIX spec?
Stupid question: when something says that the behavior is undefined, is
it not the case that a given implementation is entitled to make its
own choice about how to handle that situation? If so, why is gawk's
choosing to match "+foo" at odds with POSIX? If it's "undefined", do
you instead expect it to throw an error?
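For what it's worth, a quick way to see what a particular awk does with a leading '+' is something like the sketch below. It assumes a POSIX shell and some awk on the PATH; the "result" variable and the label strings are just illustrative names, not anything from the manual or the spec. Since POSIX leaves this case undefined, I would not assume the outcome is the same across implementations.

```sh
# Feed the literal line "+foo" to awk and see how /+foo/ is treated.
# A leading '+' in an ERE is undefined per POSIX, so an implementation
# may match it literally, fail to match, or reject the regexp outright.
if out=$(printf '%s\n' '+foo' | awk '/+foo/ { print "matched: " $0 }' 2>/dev/null); then
    if [ -n "$out" ]; then
        result=$out                      # awk took '+' as a literal character
    else
        result="no match"                # awk accepted the regexp but did not match
    fi
else
    result="awk rejected the regexp"     # awk exited non-zero (e.g. a syntax error)
fi
printf '%s\n' "$result"
```

Any of the three outcomes would be consistent with "undefined" as I understand it, which is really the point of my question.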
Regards,
Andy