emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Scan of regexp mistakes


From: Mattias Engdegård
Subject: Re: Scan of regexp mistakes
Date: Tue, 5 Mar 2019 16:06:16 +0100

5 mars 2019 kl. 03.04 skrev Paul Eggert <address@hidden>:
> 
> Thanks for reporting that. I fixed the glitches as best I could by
> applying the attached patch. I didn't see any false alarms, which is good.

Good work!

> It'd be nice if we could catch such typos on a regular basis. Is there
> some easy way to do that? A simple way might be for you to run your
> trawler once a month (say) and report back here. A nicer way would be
> for "make check" to run the trawler.

I can run it periodically but would surely forget. Should I put the trawler in 
the Emacs source tree (if so, where?), in ELPA, or elsewhere?

As a temporary measure, it now resides at https://github.com/mattiase/trawl, 
and it has some improvements which uncovered a few more nits. The error 
locations are now more precise, too.

About that massive change of yours: most of it were obvious, of course, but 
perhaps you could satisfy my curiosity:

diff --git a/lisp/arc-mode.el b/lisp/arc-mode.el
index 8de0103019..2afde7ee75 100644
--- a/lisp/arc-mode.el
+++ b/lisp/arc-mode.el
@@ -2016,7 +2016,7 @@ This doesn't recover lost files, it just undoes changes 
in the buffer itself."
       (call-process "lsar" nil t nil "-l" (or file copy))
       (if copy (delete-file copy))
       (goto-char (point-min))
-      (re-search-forward "^\\(\s+=+\s?+\\)+\n")
+      (re-search-forward "^\\(\s+=+\s+\\)+\n")
                                    ^^^
Are you sure this shouldn't be `\s*', which was the previous semantics, or 
`\s+?', in case it was a transposition mistake?
I suppose the \n at the end makes a non-greedy repetition unlikely.

diff --git a/lisp/gnus/gnus-art.el b/lisp/gnus/gnus-art.el
index 06f7be3da7..fa3abfac58 100644
--- a/lisp/gnus/gnus-art.el
+++ b/lisp/gnus/gnus-art.el
@@ -7378,7 +7378,7 @@ groups."
 
 ;; Regexp suggested by Felix Wiemann in <address@hidden>
 (defcustom gnus-button-valid-localpart-regexp
-  "[a-z0-9$%(*-=?[_][^<>\")!;:,{}\n\t @]*"
+  "[a-z$%(*-=?[_][^<>\")!;:,{}\n\t @]*"

You kept the rather odd range `*-=' which comprises `*+,-./0123456789:;<='. Is 
it supposed to be that way?

diff --git a/lisp/language/ethio-util.el b/lisp/language/ethio-util.el
index afc2239fbf..512d49b9c5 100644
--- a/lisp/language/ethio-util.el
+++ b/lisp/language/ethio-util.el
@@ -804,7 +804,7 @@ The 2nd and 3rd arguments BEGIN and END specify the region."
 
     ;; Special Ethiopic punctuation.
     (goto-char (point-min))
-    (while (re-search-forward "\\ce[»\\.\\?]\\|«\\ce" nil t)
+    (while (re-search-forward "\\ce[»\\.?]\\|«\\ce" nil t)

Should `\' really be kept in the set of characters? It looks like it was only 
included as an attempt to escape `.' and `?'.

diff --git a/lisp/net/goto-addr.el b/lisp/net/goto-addr.el
index 43659d2820..c25d787391 100644
--- a/lisp/net/goto-addr.el
+++ b/lisp/net/goto-addr.el
@@ -246,7 +246,7 @@ there, then load the URL at or before point."
   "Find e-mail address around or before point.
 Then search backwards to beginning of line for the start of an e-mail
 address.  If no e-mail address found, return nil."
-  (re-search-backward "address@hidden" (line-beginning-position) 'lim)
+  (re-search-backward "address@hidden" (line-beginning-position) 'lim)

This is good, but I should just point out that searching for A-z uncovers more 
suspect regexps, some of which aren't found by the trawler.

diff --git a/lisp/nxml/rng-uri.el b/lisp/nxml/rng-uri.el
index 0e458cfd2f..d8f2884f5e 100644
--- a/lisp/nxml/rng-uri.el
+++ b/lisp/nxml/rng-uri.el
@@ -42,7 +42,7 @@ escape them using %HH."
 
 (defun rng-uri-escape-multibyte (uri)
   "Escape multibyte characters in URI."
-  (replace-regexp-in-string "[:nonascii:]"
+  (replace-regexp-in-string "[[:nonascii:]]"

Lovely one! Here is another one in the same file (line 33), but that wasn't 
found by the trawler:

         (replace-regexp-in-string "[\000-\032\177<>#%\"{}|\\^[]`%?;]"

That \032 doesn't look right (number base confusion?), and it looks like it's 
meant as a single character alternative but it isn't, given the misplaced `]'.

diff --git a/lisp/org/org-mobile.el b/lisp/org/org-mobile.el
index 1ff6358403..83dcc7b0d1 100644
--- a/lisp/org/org-mobile.el
+++ b/lisp/org/org-mobile.el
@@ -845,11 +845,11 @@ If BEG and END are given, only do this in that region."
            (cl-incf cnt-error)
            (throw 'next t))
          (move-marker bos-marker (point))
-         (if (re-search-forward "^** Old value[ \t]*$" eos t)
+         (if (re-search-forward "^\\*\\* Old value[ \t]*$" eos t)

Shouldn't this start with "^\\**", or does it have to be exactly two asterisks?

              (setq old (buffer-substring
                         (1+ (match-end 0))
                         (progn (outline-next-heading) (point)))))
-         (if (re-search-forward "^** New value[ \t]*$" eos t)
+         (if (re-search-forward "^\\*\\* New value[ \t]*$" eos t)

Idem.

--- a/lisp/org/org.el
+++ b/lisp/org/org.el
@@ -10467,7 +10467,7 @@ This is still an experimental function, your mileage 
may vary."
    ((and (equal type "lisp") (string-match "^/" path))
     ;; Planner has a slash, we do not.
     (setq type "elisp" path (substring path 1)))
-   ((string-match "^//\\(.?*\\)/\\(<.*>\\)$" path)
+   ((string-match "^//\\(.*\\)/\\(<.*>\\)$" path)

Another repetition-of-repetition. Sure it shouldn't be `*?' instead? It looks 
likely, since there is a `/' following that would be eaten by the `.*' given 
half a chance.

diff --git a/lisp/progmodes/fortran.el b/lisp/progmodes/fortran.el
index be272c0922..c1a267f4c5 100644
--- a/lisp/progmodes/fortran.el
+++ b/lisp/progmodes/fortran.el
@@ -2052,7 +2052,7 @@ If ALL is nil, only match comments that start in column > 
0."
                 (when (<= (point) bos)
                   (move-to-column (1+ fill-column))
                   ;; What is this doing???
-                  (or (re-search-forward "[\t\n,'+-/*)=]" eol t)
+                  (or (re-search-forward "[-\t\n,'+./*)=]" eol t)

Where did the . come from? Don't you think that `+-/*' were meant to include 
those four symbols only?

diff --git a/lisp/progmodes/mixal-mode.el b/lisp/progmodes/mixal-mode.el
index 1ea4b33093..a759709b5c 100644
--- a/lisp/progmodes/mixal-mode.el
+++ b/lisp/progmodes/mixal-mode.el
@@ -1044,7 +1044,7 @@ EXECUTION-TIME holds info about the time it takes, number 
or string.")
      . mixal-font-lock-operation-code-face)
     (,(regexp-opt mixal-assembly-pseudoinstructions 'words)
      . mixal-font-lock-assembly-pseudoinstruction-face)
-    ("^[A-Z0-9a-z]*[ \t]+[A-ZO-9a-z]+[ \t]+\\(=.*=\\)"
+    ("^[A-Z0-9a-z]*[ \t]+[A-Z0-9a-z]+[ \t]+\\(=.*=\\)"

Another glorious regexp!

diff --git a/lisp/progmodes/verilog-mode.el b/lisp/progmodes/verilog-mode.el
index a949a461c1..e1003378b2 100644
--- a/lisp/progmodes/verilog-mode.el
+++ b/lisp/progmodes/verilog-mode.el
@@ -9357,7 +9357,7 @@ Returns REGEXP and list of ( (signal_name 
connection_name)... )."
              ;; Regexp form??
              ((looking-at
                ;; Regexp bug in XEmacs disallows ][ inside [], and wants + last
-               
"\\s-*\\.\\(\\(address@hidden|---]\\|[][]\\|\\\\[()|]\\)+\\)\\s-*(\\(.*\\))\\s-*\\(,\\|)\\s-*;\\)")
+               
"\\s-*\\.\\(\\(address@hidden|[][]\\|\\\\[()|]\\)+\\)\\s-*(\\(.*\\))\\s-*\\(,\\|)\\s-*;\\)")
               (setq rep (match-string-no-properties 3))
               (goto-char (match-end 0))
               (setq tpl-wild-list

Are you sure that | shouldn't be there too? Or is this some kind of XEmacs 
idiom?




reply via email to

[Prev in Thread] Current Thread [Next in Thread]