From: Mark D. Baushke
Subject: Re: Subject header matching--once again
Date: Sun, 02 Mar 2003 12:46:47 -0800

Andrew J. Gray <address@hidden> writes:
> > In the meantime the existing Subject match code should be fixed to reflect
> > the agreement reached a year back, see my first mail in this thread:
> > http://mail.gnu.org/pipermail/help-gnats/2002-November/003185.html
> >
> > A patch follows that includes an update to the documentation. The
> > feature is mentioned a couple of times in passing in 'Keeping
> > Track'. I think it deserves a (sub)section of its own and have
> > inserted one called 'Following up via direct email' in the 'Editing
> > existing Problem Reports' section of 'The GNATS User Tools' chapter.
> > I have also corrected a couple of minor errors that I ran across.
>
> Thanks for that patch, I am sorry it has taken me so long to get to
> it.
>
> > The regular expression used for matching the Subject line appears in
> > the code as
> >
> > \\<(PR[ \t#/]?|([-A-Za-z0-9_+.]+)/)([0-9]+)
> >
> > whereas the documentation has
> >
> > \<(PR[ \t#/]?|[-\w+.]+/)[0-9]+
> >
> > I couldn't get the GNU match-word-constituent operator (\w) to work inside
> > the bracket expression and am uncertain as to whether it is allowed there.
> > Perl has it. The parentheses which are in the code, but missing from the
> > manual, do not affect the matching; they are there only to capture Category
> > and Number.
>
> As I understand it the match-word-constituent operator (\w) is not
> meant to work inside matching lists. I am looking at the "info"
> documentation included with the regex 0.12 (available from
> http://ftp.gnu.org/pub/gnu/regex/regex-0.12.tar.gz). In the "List
> Operators" node it says most characters lose any special meaning
> inside a list.
>
> I think the closest equivalent that works in a list is the alnum
> character class. Using this the regular expression would become:
>
> \\<(PR[ \t#/]?|([-[:alnum:]_+.]+)/)([0-9]+)
>
> Do you think this is a satisfactory replacement for \w?

\w is the same as [:alnum:]_; it does not include "-", "." or "+".
That said, using ([-:[:alnum:]_+.]+) in the above would seem to match
a category name properly.

> > I haven't aligned the regular expression syntax with the rest of
> > GNATS as suggested by Milan. This is a non-issue as long as the
> > regular expression is hard-coded and not exposed for users to
> > modify. The regex searching is also case sensitive now.
>
> OK.
>
> > The patch is in production use in the GNATS installation that I am
> > responsible for. I hope it can make it into GNATS 4.0-beta2?
>
> Sorry that the patch missed the beta 2. Once we have decided whether
> or not to use the alnum character class I will commit the patch.
>
> --
> Andrew J. Gray
Enjoy!
-- Mark
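
For readers who want to try the Subject pattern discussed above, here is a
minimal sketch in Python. It is not the GNATS code (which uses GNU regex in
C); it is only an approximation under these assumptions: Python's `re` has
neither the GNU word-start operator `\<` nor POSIX classes such as
`[:alnum:]`, so `\b` and the explicit range `A-Za-z0-9` stand in for them.
The names `SUBJECT_RE` and `parse_subject` are hypothetical and chosen for
illustration.

```python
import re

# Approximation of the pattern quoted in the thread.
# Assumptions: \b replaces GNU \<, and A-Za-z0-9 replaces [:alnum:].
# Group 2 captures the category (if any), group 3 the PR number.
SUBJECT_RE = re.compile(r"\b(PR[ \t#/]?|([-A-Za-z0-9_+.]+)/)([0-9]+)")


def parse_subject(subject):
    """Return (category, number) for the first PR reference, or None.

    category is None when the reference uses the "PR" form, which
    carries no category name.
    """
    m = SUBJECT_RE.search(subject)
    if m is None:
        return None
    return m.group(2), int(m.group(3))


if __name__ == "__main__":
    for line in ("Re: gnats/1234: header matching",
                 "Re: PR#567 foo",
                 "Re: pr#567 foo"):
        print(repr(line), "->", parse_subject(line))
```

The extra parentheses only capture Category and Number, as noted in the
thread; they do not change what matches. Because the search is case
sensitive, the lower-case "pr#567" form is not recognized in this sketch.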