bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses sa


From: Joseph Turner
Subject: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
Date: Fri, 02 Feb 2024 23:10:00 -0800

Joseph Turner <joseph@ushin.org> writes:

> Hi João!  Thanks for your patience - preparing for EmacsConf was a blast,
> and now I'm on a plane to go visit my grandmother!
>
> João Távora <joaotavora@gmail.com> writes:
>
>> Hi all,
>>
>> I've been working on all these shorthand-related issues over the last
>> two days and I have reasonably short fixes for all of them.
>>
>> For this particular issue (bug#67309), I've opted to
>> use Joseph's patch with very slight adjustments, as it's the
>> only one that guarantees correct behaviour and doesn't seem
>> to impact performance.
>>
>> The other issues are:
>>
>> bug#63480 (loaddefs-gen.el doesn't know about shorthands)
>> bug#67325 (prefix discovery i.e. register-definition-prefixes)
>> bug#67523 (check-declare.el doesn't know about shorthands)
>>
>> I have all this in 6 commits in the bugfix/shorthand-fixes branch.
>>
>> Here's the full patch minus whitespace changes.  If there are
>> no comments I'll push in a few days' time.
>>
>> João
>>
>> diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
>> index 1f3b677d7fb..18e80311177 100644
>> --- a/doc/lispref/symbols.texi
>> +++ b/doc/lispref/symbols.texi
>> @@ -761,6 +761,23 @@ Shorthands
>>  ;; End:
>>  @end example
>>
>> +Note that if you have two shorthands in the same file where one is the
>> +prefix of the other, the longer shorthand will be attempted first.
>> +This happens regardless of the order you specify shorthands in the
>> +local variables section of your file.
>> +
>> +@example
>> +'(
>> +  t//foo ; reads to 'my-tricks--foo', not 'my-tricks-/foo'
>> +  t/foo  ; reads to 'my-tricks-foo'
>> +  )
>> +
>> +;; Local Variables:
>> +;; read-symbol-shorthands: (("t/" . "my-tricks-")
>> +;;                          ("t//" . "my-tricks--")
>> +;; End:
>> +@end example
>> +
>>  @subsection Exceptions
>
> Clear and concise.
>
>>  There are two exceptions to rules governing Shorthand transformations:
>> diff --git a/lisp/emacs-lisp/check-declare.el 
>> b/lisp/emacs-lisp/check-declare.el
>> index c887d95210c..b19aedf314d 100644
>> --- a/lisp/emacs-lisp/check-declare.el
>> +++ b/lisp/emacs-lisp/check-declare.el
>> @@ -145,21 +145,26 @@ check-declare-verify
>>      (if (file-regular-p fnfile)
>>          (with-temp-buffer
>>            (insert-file-contents fnfile)
>> +          (unless cflag
>> +            ;; If in Elisp, ensure syntax and shorthands available
>> +            (set-syntax-table emacs-lisp-mode-syntax-table)
>> +            (let (enable-local-variables) (hack-local-variables)))
>>            ;; defsubst's don't _have_ to be known at compile time.
>> -          (setq re (format (if cflag
>> -                               "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
>> +          (setq re (if cflag
>> +                       (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
>> +                               (regexp-opt (mapcar 'cadr fnlist) t))
>>                       "^[ \t]*(\\(fset[ \t]+'\\|\
>>  cl-def\\(?:generic\\|method\\|un\\)\\|\
>>  def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
>>  ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
>>  \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
>>  ine-overloadable-function\\)\\)\
>> -[ \t]*%s\\([ \t;]+\\|$\\)")
>> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
>> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
>
> Would you explain what this regexp is intended to match?
>
>>            (while (re-search-forward re nil t)
>>              (skip-chars-forward " \t\n")
>> -            (setq fn (match-string 2)
>> -                  type (match-string 1)
>> +            (setq fn (symbol-name (car (read-from-string (match-string 
>> 2)))))
>> +            (when (member fn (mapcar 'cadr fnlist))
>> +              (setq type (match-string 1)
>>                      ;; (min . max) for a fixed number of arguments, or
>>                      ;; arglists with optional elements.
>>                      ;; (min) for arglists with &rest.
>> @@ -202,7 +207,7 @@ check-declare-verify
>>                                (t
>>                                 'err))
>>                      ;; alist of functions and arglist signatures.
>> -                  siglist (cons (cons fn sig) siglist)))))
>> +                    siglist (cons (cons fn sig) siglist))))))
>>      (dolist (e fnlist)
>>        (setq arglist (nth 2 e)
>>              type
>
> On my machine, this patch removes some of the check-declare "function
> not found" errors, but not all.  For example, with hyperdrive-lib.el:
>
> (check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")
>
> Before this patch, the "*Check Declarations Warnings*" buffer shows:
>
> --8<---------------cut here---------------start------------->8---
>  ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
>     ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
>  ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
>     in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
>     found
>  ■ hyperdrive-lib.el:1283:Warning (check-declare): said ‘h/org--link-goto’ was
>     defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-org.el: function
>     not found
>  ■ hyperdrive-lib.el:45:Warning (check-declare): said ‘h/dir-mode’ was defined
>     in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
>  ■ hyperdrive-lib.el:1069:Warning (check-declare): said
>     ‘h/dir--entry-at-point’ was defined in
>     ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
>  ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
>     defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
>     not found
> --8<---------------cut here---------------end--------------->8---
>
>
> and after your patch:
>
> --8<---------------cut here---------------start------------->8---
>  ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
>     ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
>  ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
>     in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
>     found
>  ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
>     defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
>     not found
> --8<---------------cut here---------------end--------------->8---
>
> Are you able to reproduce this on your machine?
>
>> diff --git a/lisp/emacs-lisp/loaddefs-gen.el 
>> b/lisp/emacs-lisp/loaddefs-gen.el
>> index 04bea4723a2..e8093200bec 100644
>> --- a/lisp/emacs-lisp/loaddefs-gen.el
>> +++ b/lisp/emacs-lisp/loaddefs-gen.el
>> @@ -378,6 +378,7 @@ loaddefs-generate--parse-file
>>    (let ((defs nil)
>>          (load-name (loaddefs-generate--file-load-name file main-outfile))
>>          (compute-prefixes t)
>> +        read-symbol-shorthands
>>          local-outfile inhibit-autoloads)
>>      (with-temp-buffer
>>        (insert-file-contents file)
>> @@ -399,7 +400,19 @@ loaddefs-generate--parse-file
>>              (setq inhibit-autoloads (read (current-buffer)))))
>>          (save-excursion
>>            (when (re-search-forward "autoload-compute-prefixes: *" nil t)
>> -            (setq compute-prefixes (read (current-buffer))))))
>> +            (setq compute-prefixes (read (current-buffer)))))
>> +        (save-excursion
>> +          ;; since we're "open-coding" we have to repeat more
>> +          ;; complicated logic in `hack-local-variables'.
>> +          (when (re-search-forward "read-symbol-shorthands: *" nil t)
>> +            (let* ((commentless (replace-regexp-in-string
>> +                                 "\n\\s-*;+" ""
>> +                                 (buffer-substring (point) (point-max))))
>> +                   (unsorted-shorthands (car (read-from-string 
>> commentless))))
>> +              (setq read-symbol-shorthands
>> +                    (sort unsorted-shorthands
>> +                          (lambda (sh1 sh2)
>> +                            (> (length (car sh1)) (length (car sh2))))))))))
>
> IIUC, the intention here is to jump to a final "Local Variables"
> declaration at the end of the file, then remove ";;", then read in the
> uncommented value of `read-symbol-shorthands'.
>
> Since `read-from-string' just reads one expression, the above hunk works
> when there are more local variables after read-symbol-shorthands:
>
> ;; Local Variables:
> ;; read-symbol-shorthands: (("bc-" . "breadcrumb-"))
> ;; autoload-compute-prefixes: nil
> ;; End:
>
> But if the read-symbol-shorthands declaration comes at the top, as in...
>
> -*- read-symbol-shorthands: (("bc-" . "breadcrumb-")); -*-
>
> ...then this form will allocate two strings almost as long as the file.
>
> Here's an alternative hack attempting to uncomment and read the minimum:
>
> diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
> index e8093200bec..406e4b28f1f 100644
> --- a/lisp/emacs-lisp/loaddefs-gen.el
> +++ b/lisp/emacs-lisp/loaddefs-gen.el
> @@ -404,10 +404,13 @@ don't include."
>          (save-excursion
>            ;; since we're "open-coding" we have to repeat more
>            ;; complicated logic in `hack-local-variables'.
> -          (when (re-search-forward "read-symbol-shorthands: *" nil t)
> -            (let* ((commentless (replace-regexp-in-string
> +          (when-let ((beg
> +                      (re-search-forward "read-symbol-shorthands: *" nil t)))
> +            ;; `read-symbol-shorthands' alist ends with two parens.
> +            (let* ((end (re-search-forward ")[;\n\s]*)"))
> +                   (commentless (replace-regexp-in-string
>                                   "\n\\s-*;+" ""
> -                                 (buffer-substring (point) (point-max))))
> +                                 (buffer-substring beg end)))
>                     (unsorted-shorthands (car (read-from-string 
> commentless))))
>                (setq read-symbol-shorthands
>                      (sort unsorted-shorthands
>
>>        ;; We always return the package version (even for pre-dumped
>>        ;; files).
>> @@ -486,7 +499,11 @@ loaddefs-generate--compute-prefixes
>>      (while (re-search-forward
>>              "^(\\(def[^ \t\n]+\\)[ \t\n]+['(]*\\([^' ()\"\n]+\\)[\n \t]" 
>> nil t)
>>        (unless (member (match-string 1) autoload-ignored-definitions)
>> -        (let ((name (match-string-no-properties 2)))
>> +        (let* ((name (match-string-no-properties 2))
>> +               ;; Consider `read-symbol-shorthands'.
>> +               (probe (let ((obarray (obarray-make)))
>> +                        (car (read-from-string name)))))
>> +          (setq name (symbol-name probe))
>>            (when (save-excursion
>>                    (goto-char (match-beginning 0))
>>                    (or (bobp)
>> diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
>> index b0665a55695..69b562e3c7e 100644
>> --- a/lisp/emacs-lisp/shorthands.el
>> +++ b/lisp/emacs-lisp/shorthands.el
>> @@ -52,38 +52,26 @@ elisp-shorthand-font-lock-face
>>    :version "28.1"
>>    :group 'font-lock-faces)
>>
>> -(defun shorthands--mismatch-from-end (str1 str2)
>> -  "Tell index of first mismatch in STR1 and STR2, from end.
>> -The index is a valid 0-based index on STR1.  Returns nil if STR1
>> -equals STR2.  Return 0 if STR1 is a suffix of STR2."
>> -  (cl-loop with l1 = (length str1) with l2 = (length str2)
>> -           for i from 1
>> -           for i1 = (- l1 i) for i2 = (- l2 i)
>> -           while (eq (aref str1 i1) (aref str2 i2))
>> -           if (zerop i2) return (if (zerop i1) nil i1)
>> -           if (zerop i1) return 0
>> -           finally (return i1)))
>> -
>>  (defun shorthands-font-lock-shorthands (limit)
>> +  "Font lock until LIMIT considering `read-symbol-shorthands'."
>>    (when read-symbol-shorthands
>>      (while (re-search-forward
>>              (concat "\\_<\\(" (rx lisp-mode-symbol) "\\)\\_>")
>>              limit t)
>>        (let* ((existing (get-text-property (match-beginning 1) 'face))
>> +             (print-name (match-string 1))
>>               (probe (and (not (memq existing '(font-lock-comment-face
>>                                                 font-lock-string-face)))
>> -                         (intern-soft (match-string 1))))
>> -             (sname (and probe (symbol-name probe)))
>> -             (mismatch (and sname (shorthands--mismatch-from-end
>> -                                   (match-string 1) sname)))
>> -             (guess (and mismatch (1+ mismatch))))
>> -        (when guess
>> -          (when (and (< guess (1- (length (match-string 1))))
>> -                     ;; In bug#67390 we allow other separators
>> -                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
>> -            (setq guess (1+ guess)))
>> +                         (intern-soft print-name)))
>> +             (symbol-name (and probe (symbol-name probe)))
>> +             (prefix (and symbol-name
>> +                          (not (string-equal print-name symbol-name))
>> +                          (car (assoc print-name
>> +                                      read-symbol-shorthands
>> +                                      #'string-prefix-p)))))
>> +        (when prefix
>>            (add-face-text-property (match-beginning 1)
>> -                                  (+ (match-beginning 1) guess)
>> +                                  (+ (match-beginning 1) (length prefix))
>>                                    'elisp-shorthand-font-lock-face))))))
>
> Works well.  let-binding `symbol-name' and `print-name' is good improvement.
>
>>  (font-lock-add-keywords 'emacs-lisp-mode
>> '((shorthands-font-lock-shorthands)) t)
>> diff --git a/lisp/files.el b/lisp/files.el
>> index 1cdcec23b11..b266d0727ec 100644
>> --- a/lisp/files.el
>> +++ b/lisp/files.el
>> @@ -3735,7 +3735,8 @@ before-hack-local-variables-hook
>>  This hook is called only if there is at least one file-local
>>  variable to set.")
>>
>> -(defvar permanently-enabled-local-variables '(lexical-binding)
>> +(defvar permanently-enabled-local-variables
>> +  '(lexical-binding read-symbol-shorthands)
>>    "A list of file-local variables that are always enabled.
>>  This overrides any `enable-local-variables' setting.")
>>
>> @@ -4171,6 +4172,13 @@ hack-local-variables--find-variables
>>                                  ;; to use 'thisbuf's name in the
>>                                  ;; warning message.
>>                                  (or (buffer-file-name thisbuf) ""))))))
>> +                          ((eq var 'read-symbol-shorthands)
>> +                           ;; Sort automatically by shorthand length
>> +                           ;; descending
>> +                           (setq val (sort val
>> +                                           (lambda (sh1 sh2) (>
>> (length (car sh1))
>> +
>> (length (car sh2))))))
>> +                           (push (cons 'read-symbol-shorthands val) result))
>>                            ((and (eq var 'mode) handle-mode))
>>     (t
>>      (ignore-errors
>
> Good catch.  I agree that longer shorthands should be applied first.
>
> -----
>
> A couple typo nits on the commit message of "Improve
> shorthands-font-lock-shorthands (bug#67390)":
>
> -  h//thingy     ; hilits "//"       reads to 'hyperdrive--thingy'
> +  h//thingy     ; hilits "h//"       reads to 'hyperdrive--thingy'
>
> -  Co-authored-by: João Távora <joaotavora@gmail.com>
> +  Co-authored-by: Joseph Turner <joseph@breatheoutbreathe.in>
>
>
> Thank you!
>
> Joseph

Ping!





reply via email to

[Prev in Thread] Current Thread [Next in Thread]