bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses sa


From: Joseph Turner
Subject: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
Date: Sat, 09 Dec 2023 10:50:16 -0800

Hi João!  Thanks for your patience - preparing for EmacsConf was a blast,
and now I'm on a plane to go visit my grandmother!

João Távora <joaotavora@gmail.com> writes:

> Hi all,
>
> I've been working on all these shorthand-related issues over the last
> two days and I have reasonably short fixes for all of them.
>
> For this particular issue (bug#67309), I've opted to
> use Joseph's patch with very slight adjustments, as it's the
> only one that guarantees correct behaviour and doesn't seem
> to impact performance.
>
> The other issues are:
>
> bug#63480 (loaddefs-gen.el doesn't know about shorthands)
> bug#67325 (prefix discovery i.e. register-definition-prefixes)
> bug#67523 (check-declare.el doesn't know about shorthands)
>
> I have all this in 6 commits in the bugfix/shorthand-fixes branch.
>
> Here's the full patch minus whitespace changes.  If there are
> no comments I'll push in a few days' time.
>
> João
>
> diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
> index 1f3b677d7fb..18e80311177 100644
> --- a/doc/lispref/symbols.texi
> +++ b/doc/lispref/symbols.texi
> @@ -761,6 +761,23 @@ Shorthands
>  ;; End:
>  @end example
>
> +Note that if you have two shorthands in the same file where one is the
> +prefix of the other, the longer shorthand will be attempted first.
> +This happens regardless of the order you specify shorthands in the
> +local variables section of your file.
> +
> +@example
> +'(
> +  t//foo ; reads to 'my-tricks--foo', not 'my-tricks-/foo'
> +  t/foo  ; reads to 'my-tricks-foo'
> +  )
> +
> +;; Local Variables:
> +;; read-symbol-shorthands: (("t/" . "my-tricks-")
> +;;                          ("t//" . "my-tricks--")
> +;; End:
> +@end example
> +
>  @subsection Exceptions

Clear and concise.

>  There are two exceptions to rules governing Shorthand transformations:
> diff --git a/lisp/emacs-lisp/check-declare.el 
> b/lisp/emacs-lisp/check-declare.el
> index c887d95210c..b19aedf314d 100644
> --- a/lisp/emacs-lisp/check-declare.el
> +++ b/lisp/emacs-lisp/check-declare.el
> @@ -145,21 +145,26 @@ check-declare-verify
>      (if (file-regular-p fnfile)
>          (with-temp-buffer
>            (insert-file-contents fnfile)
> +          (unless cflag
> +            ;; If in Elisp, ensure syntax and shorthands available
> +            (set-syntax-table emacs-lisp-mode-syntax-table)
> +            (let (enable-local-variables) (hack-local-variables)))
>            ;; defsubst's don't _have_ to be known at compile time.
> -          (setq re (format (if cflag
> -                               "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
> +          (setq re (if cflag
> +                       (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
> +                               (regexp-opt (mapcar 'cadr fnlist) t))
>                       "^[ \t]*(\\(fset[ \t]+'\\|\
>  cl-def\\(?:generic\\|method\\|un\\)\\|\
>  def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
>  ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
>  \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
>  ine-overloadable-function\\)\\)\
> -[ \t]*%s\\([ \t;]+\\|$\\)")
> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))

Would you explain what this regexp is intended to match?

>            (while (re-search-forward re nil t)
>              (skip-chars-forward " \t\n")
> -            (setq fn (match-string 2)
> -                  type (match-string 1)
> +            (setq fn (symbol-name (car (read-from-string (match-string 2)))))
> +            (when (member fn (mapcar 'cadr fnlist))
> +              (setq type (match-string 1)
>                      ;; (min . max) for a fixed number of arguments, or
>                      ;; arglists with optional elements.
>                      ;; (min) for arglists with &rest.
> @@ -202,7 +207,7 @@ check-declare-verify
>                                (t
>                                 'err))
>                      ;; alist of functions and arglist signatures.
> -                  siglist (cons (cons fn sig) siglist)))))
> +                    siglist (cons (cons fn sig) siglist))))))
>      (dolist (e fnlist)
>        (setq arglist (nth 2 e)
>              type

On my machine, this patch removes some of the check-declare "function
not found" errors, but not all.  For example, with hyperdrive-lib.el:

(check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")

Before this patch, the "*Check Declarations Warnings*" buffer shows:

--8<---------------cut here---------------start------------->8---
 ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
    ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
 ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
    in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
    found
 ■ hyperdrive-lib.el:1283:Warning (check-declare): said ‘h/org--link-goto’ was
    defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-org.el: function
    not found
 ■ hyperdrive-lib.el:45:Warning (check-declare): said ‘h/dir-mode’ was defined
    in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
 ■ hyperdrive-lib.el:1069:Warning (check-declare): said
    ‘h/dir--entry-at-point’ was defined in
    ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
 ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
    defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
    not found
--8<---------------cut here---------------end--------------->8---

and after your patch:

--8<---------------cut here---------------start------------->8---
 ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
    ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
 ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
    in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
    found
 ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
    defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
    not found
--8<---------------cut here---------------end--------------->8---

Are you able to reproduce this on your machine?

> diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
> index 04bea4723a2..e8093200bec 100644
> --- a/lisp/emacs-lisp/loaddefs-gen.el
> +++ b/lisp/emacs-lisp/loaddefs-gen.el
> @@ -378,6 +378,7 @@ loaddefs-generate--parse-file
>    (let ((defs nil)
>          (load-name (loaddefs-generate--file-load-name file main-outfile))
>          (compute-prefixes t)
> +        read-symbol-shorthands
>          local-outfile inhibit-autoloads)
>      (with-temp-buffer
>        (insert-file-contents file)
> @@ -399,7 +400,19 @@ loaddefs-generate--parse-file
>              (setq inhibit-autoloads (read (current-buffer)))))
>          (save-excursion
>            (when (re-search-forward "autoload-compute-prefixes: *" nil t)
> -            (setq compute-prefixes (read (current-buffer))))))
> +            (setq compute-prefixes (read (current-buffer)))))
> +        (save-excursion
> +          ;; since we're "open-coding" we have to repeat more
> +          ;; complicated logic in `hack-local-variables'.
> +          (when (re-search-forward "read-symbol-shorthands: *" nil t)
> +            (let* ((commentless (replace-regexp-in-string
> +                                 "\n\\s-*;+" ""
> +                                 (buffer-substring (point) (point-max))))
> +                   (unsorted-shorthands (car (read-from-string 
> commentless))))
> +              (setq read-symbol-shorthands
> +                    (sort unsorted-shorthands
> +                          (lambda (sh1 sh2)
> +                            (> (length (car sh1)) (length (car sh2))))))))))

IIUC, the intention here is to jump to a final "Local Variables"
declaration at the end of the file, then remove ";;", then read in the
uncommented value of `read-symbol-shorthands'.

Since `read-from-string' just reads one expression, the above hunk works
when there are more local variables after read-symbol-shorthands:

;; Local Variables:
;; read-symbol-shorthands: (("bc-" . "breadcrumb-"))
;; autoload-compute-prefixes: nil
;; End:

But if the read-symbol-shorthands declaration comes at the top, as in...

-*- read-symbol-shorthands: (("bc-" . "breadcrumb-")); -*-

...then this form will allocate two strings almost as long as the file.

Here's an alternative hack attempting to uncomment and read the minimum:

diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
index e8093200bec..406e4b28f1f 100644
--- a/lisp/emacs-lisp/loaddefs-gen.el
+++ b/lisp/emacs-lisp/loaddefs-gen.el
@@ -404,10 +404,13 @@ don't include."
         (save-excursion
           ;; since we're "open-coding" we have to repeat more
           ;; complicated logic in `hack-local-variables'.
-          (when (re-search-forward "read-symbol-shorthands: *" nil t)
-            (let* ((commentless (replace-regexp-in-string
+          (when-let ((beg
+                      (re-search-forward "read-symbol-shorthands: *" nil t)))
+            ;; `read-symbol-shorthands' alist ends with two parens.
+            (let* ((end (re-search-forward ")[;\n\s]*)"))
+                   (commentless (replace-regexp-in-string
                                  "\n\\s-*;+" ""
-                                 (buffer-substring (point) (point-max))))
+                                 (buffer-substring beg end)))
                    (unsorted-shorthands (car (read-from-string commentless))))
               (setq read-symbol-shorthands
                     (sort unsorted-shorthands

>        ;; We always return the package version (even for pre-dumped
>        ;; files).
> @@ -486,7 +499,11 @@ loaddefs-generate--compute-prefixes
>      (while (re-search-forward
>              "^(\\(def[^ \t\n]+\\)[ \t\n]+['(]*\\([^' ()\"\n]+\\)[\n \t]" nil 
> t)
>        (unless (member (match-string 1) autoload-ignored-definitions)
> -        (let ((name (match-string-no-properties 2)))
> +        (let* ((name (match-string-no-properties 2))
> +               ;; Consider `read-symbol-shorthands'.
> +               (probe (let ((obarray (obarray-make)))
> +                        (car (read-from-string name)))))
> +          (setq name (symbol-name probe))
>            (when (save-excursion
>                    (goto-char (match-beginning 0))
>                    (or (bobp)
> diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
> index b0665a55695..69b562e3c7e 100644
> --- a/lisp/emacs-lisp/shorthands.el
> +++ b/lisp/emacs-lisp/shorthands.el
> @@ -52,38 +52,26 @@ elisp-shorthand-font-lock-face
>    :version "28.1"
>    :group 'font-lock-faces)
>
> -(defun shorthands--mismatch-from-end (str1 str2)
> -  "Tell index of first mismatch in STR1 and STR2, from end.
> -The index is a valid 0-based index on STR1.  Returns nil if STR1
> -equals STR2.  Return 0 if STR1 is a suffix of STR2."
> -  (cl-loop with l1 = (length str1) with l2 = (length str2)
> -           for i from 1
> -           for i1 = (- l1 i) for i2 = (- l2 i)
> -           while (eq (aref str1 i1) (aref str2 i2))
> -           if (zerop i2) return (if (zerop i1) nil i1)
> -           if (zerop i1) return 0
> -           finally (return i1)))
> -
>  (defun shorthands-font-lock-shorthands (limit)
> +  "Font lock until LIMIT considering `read-symbol-shorthands'."
>    (when read-symbol-shorthands
>      (while (re-search-forward
>              (concat "\\_<\\(" (rx lisp-mode-symbol) "\\)\\_>")
>              limit t)
>        (let* ((existing (get-text-property (match-beginning 1) 'face))
> +             (print-name (match-string 1))
>               (probe (and (not (memq existing '(font-lock-comment-face
>                                                 font-lock-string-face)))
> -                         (intern-soft (match-string 1))))
> -             (sname (and probe (symbol-name probe)))
> -             (mismatch (and sname (shorthands--mismatch-from-end
> -                                   (match-string 1) sname)))
> -             (guess (and mismatch (1+ mismatch))))
> -        (when guess
> -          (when (and (< guess (1- (length (match-string 1))))
> -                     ;; In bug#67390 we allow other separators
> -                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
> -            (setq guess (1+ guess)))
> +                         (intern-soft print-name)))
> +             (symbol-name (and probe (symbol-name probe)))
> +             (prefix (and symbol-name
> +                          (not (string-equal print-name symbol-name))
> +                          (car (assoc print-name
> +                                      read-symbol-shorthands
> +                                      #'string-prefix-p)))))
> +        (when prefix
>            (add-face-text-property (match-beginning 1)
> -                                  (+ (match-beginning 1) guess)
> +                                  (+ (match-beginning 1) (length prefix))
>                                    'elisp-shorthand-font-lock-face))))))

Works well.  let-binding `symbol-name' and `print-name' is good improvement.

>  (font-lock-add-keywords 'emacs-lisp-mode
> '((shorthands-font-lock-shorthands)) t)
> diff --git a/lisp/files.el b/lisp/files.el
> index 1cdcec23b11..b266d0727ec 100644
> --- a/lisp/files.el
> +++ b/lisp/files.el
> @@ -3735,7 +3735,8 @@ before-hack-local-variables-hook
>  This hook is called only if there is at least one file-local
>  variable to set.")
>
> -(defvar permanently-enabled-local-variables '(lexical-binding)
> +(defvar permanently-enabled-local-variables
> +  '(lexical-binding read-symbol-shorthands)
>    "A list of file-local variables that are always enabled.
>  This overrides any `enable-local-variables' setting.")
>
> @@ -4171,6 +4172,13 @@ hack-local-variables--find-variables
>                                  ;; to use 'thisbuf's name in the
>                                  ;; warning message.
>                                  (or (buffer-file-name thisbuf) ""))))))
> +                          ((eq var 'read-symbol-shorthands)
> +                           ;; Sort automatically by shorthand length
> +                           ;; descending
> +                           (setq val (sort val
> +                                           (lambda (sh1 sh2) (>
> (length (car sh1))
> +
> (length (car sh2))))))
> +                           (push (cons 'read-symbol-shorthands val) result))
>                            ((and (eq var 'mode) handle-mode))
>     (t
>      (ignore-errors

Good catch.  I agree that longer shorthands should be applied first.

-----

A couple typo nits on the commit message of "Improve
shorthands-font-lock-shorthands (bug#67390)":

-  h//thingy     ; hilits "//"       reads to 'hyperdrive--thingy'
+  h//thingy     ; hilits "h//"       reads to 'hyperdrive--thingy'

-  Co-authored-by: João Távora <joaotavora@gmail.com>
+  Co-authored-by: Joseph Turner <joseph@breatheoutbreathe.in>


Thank you!

Joseph





reply via email to

[Prev in Thread] Current Thread [Next in Thread]