help-gnats

Re: 4.0 beta - question on parsing of subject line in PR header


From: Mel Hatzis
Subject: Re: 4.0 beta - question on parsing of subject line in PR header
Date: Sat, 11 May 2002 10:42:46 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020313


Lars Henriksen wrote:

> On Fri, May 10, 2002 at 04:20:31PM -0700, Mel Hatzis wrote:
>
> > Can anyone tell me why the regular expression match for
> > the subject header was changed so that it no longer
> > supports subjects such as 'Re: category/num'?
>
> Take a look at the list archives. There was a thorough discussion
> of this in December, Subject: Subject line processing in Gnats 4.0.
> A couple of your colleagues participated.

Yes...I consulted with a couple of my colleagues after sending out the
help request. Thank you.

Once we understood this a little better, we determined that there
is definitely a bug here. The regular expression used is incorrect:
for one, it requires a '\<' at the start of the subject line in order to
match. It is also missing an escape character before the '|' and is
incorrectly anchored to the beginning of the subject line.

I have attached a patch. For reference, here's the regular expression
from the patch:

(.*[ \t:])?((PR[ \t/])\\|([-a-z0-9_+.]+/))([0-9]+)

The patch allows for the following subject lines:

 Fwd: Re: category/50
 Re:<tab>category/50
 Re: PR 50
 Re PR/50
 Re:PR 50 (note that there's no space after the colon)
 category/50
 PR<tab>50
 PR/50
 PR 50

with any amount of preceding or trailing gunk. Note that trailing
gunk need not be separated from the PR number by white space.

I added the colon to the preceding text separator as an afterthought,
thinking it might be useful. Since the existing regex didn't allow for
categories with colons in them, this seemed like a safe addition.
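
The accepted forms listed above can be checked with a small standalone
sketch. It is not the GNATS code: it uses POSIX regcomp()/regexec() with
extended syntax rather than the GNU re_compile_pattern()/re_match() calls
in file-pr.c, so alternation is written as a bare '|' and a leading '^'
stands in for the fact that re_match() only tries a match at the offset it
is given. REG_ICASE is an assumption that approximates what the case_fold
translate table presumably does. The helper name parse_subject and its
buffer parameters are made up for illustration.

```c
#include <regex.h>
#include <string.h>

/* POSIX extended-syntax rendering of the pattern from the patch.
   Assumption: '^' replaces the implicit anchoring of re_match(). */
static const char *const PAT =
    "^(.*[ \t:])?((PR[ \t/])|([-a-z0-9_+.]+/))([0-9]+)";

/* Hypothetical helper: copy the category (or "PR") and the PR number
   found in `subject` into the caller's buffers.
   Returns 1 on a match, 0 on no match or if a buffer is too small. */
static int parse_subject(const char *subject,
                         char *category, size_t catlen,
                         char *number, size_t numlen)
{
    regex_t re;
    regmatch_t m[6];   /* whole match plus the five groups */
    int found = 0;

    if (regcomp(&re, PAT, REG_EXTENDED | REG_ICASE) != 0)
        return 0;

    if (regexec(&re, subject, 6, m, 0) == 0) {
        /* Group 2 is "category/" or "PR<sep>"; drop its last character,
           as the patch does by overwriting it with '\0'. */
        size_t clen = (size_t)(m[2].rm_eo - m[2].rm_so) - 1;
        /* Group 5 is the run of digits. */
        size_t nlen = (size_t)(m[5].rm_eo - m[5].rm_so);

        if (clen < catlen && nlen < numlen) {
            memcpy(category, subject + m[2].rm_so, clen);
            category[clen] = '\0';
            memcpy(number, subject + m[5].rm_so, nlen);
            number[nlen] = '\0';
            found = 1;
        }
    }

    regfree(&re);
    return found;
}
```

Called as parse_subject("Fwd: Re: category/50", cat, sizeof cat, num,
sizeof num), the sketch fills cat with "category" and num with "50"; a
subject with no PR reference returns 0.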

There are a few corner cases where this may not result in the most
desirable behaviour, such as:

  Re: PR 50 (was: Re: PR/75)

which matches PR/75, because the optional leading group is greedy and
so the last reference on the line wins.

However, for each such case, there's generally a counter-argument:

  Re: closed PR 50 (fix documented in PR/75)
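
To make the trade-off in these two subjects concrete, here is a rough
sketch that runs the patch-style pattern next to a hypothetical
first-reference variant. As an assumption for illustration it uses POSIX
regcomp()/regexec() in extended syntax rather than the GNU re_match()
call in file-pr.c. FIRST_REF, LAST_REF and pr_number are invented names,
and FIRST_REF is not part of the patch.

```c
#include <regex.h>
#include <string.h>

/* Patch-style pattern in POSIX extended form: the greedy prefix takes
   as much as it can, so the last PR reference in the subject wins. */
static const char *const LAST_REF =
    "^(.*[ \t:])?((PR[ \t/])|([-a-z0-9_+.]+/))([0-9]+)";

/* Hypothetical alternative, not from the patch: no greedy prefix, so
   an unanchored search stops at the leftmost reference. */
static const char *const FIRST_REF =
    "(^|[ \t:])((PR[ \t/])|([-a-z0-9_+.]+/))([0-9]+)";

/* Copy the digits captured by group 5 into `out`.
   Returns 1 on a match, 0 on no match or if `out` is too small. */
static int pr_number(const char *pattern, const char *subject,
                     char *out, size_t outlen)
{
    regex_t re;
    regmatch_t m[6];
    int found = 0;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE) != 0)
        return 0;

    if (regexec(&re, subject, 6, m, 0) == 0) {
        size_t len = (size_t)(m[5].rm_eo - m[5].rm_so);
        if (len < outlen) {
            memcpy(out, subject + m[5].rm_so, len);
            out[len] = '\0';
            found = 1;
        }
    }

    regfree(&re);
    return found;
}
```

With LAST_REF both subjects above yield 75; with FIRST_REF both yield 50.
Neither choice suits every subject line, which is the point of the
counter-argument.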

--
Mel Hatzis
Juniper Networks, Inc.
Index: file-pr.c
===================================================================
RCS file: /cvsroot/gnats/gnats/gnats/file-pr.c,v
retrieving revision 1.45
diff -b -u -p -r1.45 file-pr.c
--- file-pr.c   10 Feb 2002 18:23:42 -0000      1.45
+++ file-pr.c   11 May 2002 08:33:40 -0000
@@ -572,7 +572,7 @@ checkIfReply (PR *pr, ErrorDesc *err)
   const char *headerValue;
   struct re_pattern_buffer regex;
   struct re_registers regs;
-  int i, start, end, idstart;
+  int i, start, end, idstart, idend;
   char case_fold[256];
   char *possiblePrNum;
   reg_syntax_t old_syntax;
@@ -594,7 +594,7 @@ checkIfReply (PR *pr, ErrorDesc *err)
   regex.translate = case_fold;
   
   {
-    const char *const PAT = "\\<((PR[ \t/])|([-a-z0-9_+.]+)/)([0-9]+)";
+    const char *const PAT = "(.*[ \t:])?((PR[ \t/])\\|([-a-z0-9_+.]+/))([0-9]+)";
     re_compile_pattern (PAT, strlen (PAT), &regex);
   }
   i = re_match (&regex, headerValue, strlen (headerValue), 0, &regs);
@@ -607,9 +607,10 @@ checkIfReply (PR *pr, ErrorDesc *err)
       return NULL;
     }
 
-  start = regs.start[0];
-  end = regs.end[0];
-  idstart = regs.start[4] - start;
+  start = regs.start[2];
+  end = regs.end[2];
+  idstart = regs.start[5];
+  idend = regs.end[5];
 
   free (regs.start);
   free (regs.end);
@@ -618,7 +619,7 @@ checkIfReply (PR *pr, ErrorDesc *err)
   memcpy (possiblePrNum, headerValue + start, end - start);
   possiblePrNum[end - start] = '\0';
 
-  *(possiblePrNum + idstart - 1) = '\0';
+  *(possiblePrNum + end - start - 1) = '\0';
 
   /* See if the category exists: */
   cat = get_adm_record (CATEGORY (pr->database), possiblePrNum);
@@ -632,7 +633,9 @@ checkIfReply (PR *pr, ErrorDesc *err)
     {
       /* We only needed res, never cat, so free cat. */
       free_adm_entry (cat);
-      prID = xstrdup (possiblePrNum + idstart);
+      prID = xmalloc(idend - idstart + 1);
+      memcpy(prID, headerValue + idstart, idend - idstart);
+      *(prID + idend - idstart) = '\0';
     }
 
   free (possiblePrNum);
