Re: Regexp Provoking an Error Message Contains \y but the Error Message Displays \b
From: Neil R. Ormos
Subject: Re: Regexp Provoking an Error Message Contains \y but the Error Message Displays \b
Date: Fri, 30 Oct 2020 11:25:56 -0500 (CDT)
arnold@skeeve.com wrote:
> "Neil R. Ormos" wrote:
>> The regexp in this short example lacks a closing right parenthesis.
>>> ~/.local/bin/gawk-5.1.0 'BEGIN{ if (a ~ /( \y / ) print "test"}'
>> gawk-5.1.0: cmd. line:1: error: Unmatched ( or \(: /( \b /
>> I have read the manual's explanation of why Gawk requires \y and not \b
>> in regexps.
>> Is it intended that Gawk display "\b" in the error message?
> This is sort of amusing, at least if you're a geek like me. :-)
> You've hit a dark corner in the implementation. When parsing a regexp
> and getting ready to compile the regexp, gawk turns \b into a literal
> backspace character, and it turns \y into the two characters \b so that
> the regexp routines will match a word boundary. (It has worked this
> way for decades.)
> The error message comes from the regexp routines, which actually
> saw '\' and 'b'.
> There's not a lot I can do about this...
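The rewriting described in the quoted explanation can be approximated with a small shell sketch. This is not gawk's actual code: the function name translate_escapes is hypothetical, and the sketch assumes the regexp contains no escaped backslashes (such as \\y) and ignores every other escape sequence gawk handles. It only shows why the regexp routines end up seeing \b where the source had \y.

```shell
# Hypothetical illustration only; NOT how gawk is implemented.
# Assumption: the input contains no escaped backslashes (e.g. "\\y").
translate_escapes() {
    bs=$(printf '\b')    # a literal backspace character
    # Order matters: rewrite \b first, so that the \b produced
    # from \y is not turned into a backspace afterwards.
    printf '%s\n' "$1" | sed -e "s/\\\\b/$bs/g" -e 's/\\y/\\b/g'
}

translate_escapes '( \y '    # prints: ( \b
```

The output matches the regexp text shown in the error message quoted earlier in the thread.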
Fair enough. Thank you for the explanation, which
confirmed my suspicion.
I was not aware of how much additional processing
of the error message was intended--e.g., whether
there was supposed to be a reciprocal translation
of \b back to the original \y, or perhaps
replacement of the "offending regexp" part of the
error message with a transcription of the original
regexp.
I am so used to seeing gawk's \y as equivalent to
grep's \b that I probably would not
have noticed this at all had I not encountered it
with a more elaborate regexp and tried
unsuccessfully to find the offending line of code
by searching for what was quoted in the error
message.
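
One possible workaround for the search problem described above is to translate the quoted regexp back before searching the source. The following is a minimal sketch, assuming every two-character \b in the error message was originally \y (which follows from the explanation quoted earlier, but is not otherwise verified here). The function name untranslate is hypothetical.

```shell
# Hypothetical helper: undo the \y -> \b rewrite in a regexp copied
# from a gawk error message, so the original text can be searched for.
# Assumption: every "\b" in the message was "\y" in the source.
untranslate() {
    printf '%s\n' "$1" | sed 's/\\b/\\y/g'
}

untranslate '( \b '    # prints: ( \y
```

The result could then be passed to a fixed-string search such as grep -n -F -- "$(untranslate '( \b ')" prog.awk, where prog.awk is a placeholder file name. Using -F keeps grep from interpreting the regexp text itself.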
Best regards,
--Neil