bug-gawk

Re: [bug-gawk] Does gawk character classes follow this?


From: Wolfgang Laun
Subject: Re: [bug-gawk] Does gawk character classes follow this?
Date: Thu, 14 Feb 2019 13:55:04 +0100

On Thu, 14 Feb 2019 at 12:27, Peng Yu <address@hidden> wrote:

> I don't think the gawk manual is clear.
>
>
> https://www.gnu.org/software/gawk/manual/html_node/Bracket-Expressions.html#Bracket-Expressions
>
> Why not include the definition in the format similar to the following
> to make it unambiguous?
>
> [:alnum:]       [a-zA-Z0-9]
> [:alpha:]       [a-zA-Z]
> [:ascii:]       [\x00-\x7F]
> [:cntrl:]  [\x00-\x1F\x7F]
>

See the cited section in *The GNU Awk User's Guide* for an excellent reason
why this would not be a good idea. Quote: "*Character classes* are a
feature introduced in the POSIX standard. A character class is a special
notation for describing lists of characters that have a specific attribute,
but the actual characters can vary from country to country and/or from
character set to character set. For example, the notion of what is an
alphabetic character differs between the United States and France."

>
> > Why don't you test it and see?
>
> Why don't you make the manual easy to read by making a table similar
> to the table in the "Character Classes" section in
> https://www.regular-expressions.info/posixbrackets.html


Why don't you provide a draft for this table, and we'll see whether it is
not only "easy to read" but also valid for any character set and locale
combination where gawk is supposed to be installable and runnable?


>
>
> What is the difference between gawk character classes and those
> mentioned in that table?
>

Are there any "gawk character classes"?


>
> On 2/14/19, address@hidden <address@hidden> wrote:
> > Hello.
> >
> > Peng Yu <address@hidden> wrote:
> >
> >> Hi,
> >>
> > >> I'd like to confirm the definition of character classes in gawk. Can
> >> I use the following link as a reference for the definition?
> >>
> >> https://www.regular-expressions.info/posixbrackets.html
> >
> > No. Read The Fine Manual.
> >
> >> For example, does [:cntrl:] include \x7F?
> >
> > Why don't you test it and see?
> >
> >       gawk 'BEGIN { print("\x7F" ~ /[[:cntrl:]]/) }'
> >
> > Arnold
> >
>
>
> --
> Regards,
> Peng
>
>

