[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses sa
From: |
Joseph Turner |
Subject: |
bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator |
Date: |
Fri, 02 Feb 2024 23:10:00 -0800 |
Joseph Turner <joseph@ushin.org> writes:
> Hi João! Thanks for your patience - preparing for EmacsConf was a blast,
> and now I'm on a plane to go visit my grandmother!
>
> João Távora <joaotavora@gmail.com> writes:
>
>> Hi all,
>>
>> I've been working on all these shorthand-related issues over the last
>> two days and I have reasonably short fixes for all of them.
>>
>> For this particular issue (bug#67309), I've opted to
>> use Joseph's patch with very slight adjustments, as it's the
>> only one that guarantees correct behaviour and doesn't seem
>> to impact performance.
>>
>> The other issues are:
>>
>> bug#63480 (loaddefs-gen.el doesn't know about shorthands)
>> bug#67325 (prefix discovery i.e. register-definition-prefixes)
>> bug#67523 (check-declare.el doesn't know about shorthands)
>>
>> I have all this in 6 commits in the bugfix/shorthand-fixes branch.
>>
>> Here's the full patch minus whitespace changes. If there are
>> no comments I'll push in a few days' time.
>>
>> João
>>
>> diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
>> index 1f3b677d7fb..18e80311177 100644
>> --- a/doc/lispref/symbols.texi
>> +++ b/doc/lispref/symbols.texi
>> @@ -761,6 +761,23 @@ Shorthands
>> ;; End:
>> @end example
>>
>> +Note that if you have two shorthands in the same file where one is the
>> +prefix of the other, the longer shorthand will be attempted first.
>> +This happens regardless of the order you specify shorthands in the
>> +local variables section of your file.
>> +
>> +@example
>> +'(
>> + t//foo ; reads to 'my-tricks--foo', not 'my-tricks-/foo'
>> + t/foo ; reads to 'my-tricks-foo'
>> + )
>> +
>> +;; Local Variables:
>> +;; read-symbol-shorthands: (("t/" . "my-tricks-")
>> +;; ("t//" . "my-tricks--")
>> +;; End:
>> +@end example
>> +
>> @subsection Exceptions
>
> Clear and concise.
>
>> There are two exceptions to rules governing Shorthand transformations:
>> diff --git a/lisp/emacs-lisp/check-declare.el
>> b/lisp/emacs-lisp/check-declare.el
>> index c887d95210c..b19aedf314d 100644
>> --- a/lisp/emacs-lisp/check-declare.el
>> +++ b/lisp/emacs-lisp/check-declare.el
>> @@ -145,21 +145,26 @@ check-declare-verify
>> (if (file-regular-p fnfile)
>> (with-temp-buffer
>> (insert-file-contents fnfile)
>> + (unless cflag
>> + ;; If in Elisp, ensure syntax and shorthands available
>> + (set-syntax-table emacs-lisp-mode-syntax-table)
>> + (let (enable-local-variables) (hack-local-variables)))
>> ;; defsubst's don't _have_ to be known at compile time.
>> - (setq re (format (if cflag
>> - "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
>> + (setq re (if cflag
>> + (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
>> + (regexp-opt (mapcar 'cadr fnlist) t))
>> "^[ \t]*(\\(fset[ \t]+'\\|\
>> cl-def\\(?:generic\\|method\\|un\\)\\|\
>> def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
>> ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
>> \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
>> ine-overloadable-function\\)\\)\
>> -[ \t]*%s\\([ \t;]+\\|$\\)")
>> - (regexp-opt (mapcar 'cadr fnlist) t)))
>> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
>
> Would you explain what this regexp is intended to match?
>
>> (while (re-search-forward re nil t)
>> (skip-chars-forward " \t\n")
>> - (setq fn (match-string 2)
>> - type (match-string 1)
>> + (setq fn (symbol-name (car (read-from-string (match-string
>> 2)))))
>> + (when (member fn (mapcar 'cadr fnlist))
>> + (setq type (match-string 1)
>> ;; (min . max) for a fixed number of arguments, or
>> ;; arglists with optional elements.
>> ;; (min) for arglists with &rest.
>> @@ -202,7 +207,7 @@ check-declare-verify
>> (t
>> 'err))
>> ;; alist of functions and arglist signatures.
>> - siglist (cons (cons fn sig) siglist)))))
>> + siglist (cons (cons fn sig) siglist))))))
>> (dolist (e fnlist)
>> (setq arglist (nth 2 e)
>> type
>
> On my machine, this patch removes some of the check-declare "function
> not found" errors, but not all. For example, with hyperdrive-lib.el:
>
> (check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")
>
> Before this patch, the "*Check Declarations Warnings*" buffer shows:
>
> --8<---------------cut here---------------start------------->8---
> ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
> ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
> ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
> in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
> found
> ■ hyperdrive-lib.el:1283:Warning (check-declare): said ‘h/org--link-goto’ was
> defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-org.el: function
> not found
> ■ hyperdrive-lib.el:45:Warning (check-declare): said ‘h/dir-mode’ was defined
> in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
> ■ hyperdrive-lib.el:1069:Warning (check-declare): said
> ‘h/dir--entry-at-point’ was defined in
> ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
> ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
> defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
> not found
> --8<---------------cut here---------------end--------------->8---
>
>
> and after your patch:
>
> --8<---------------cut here---------------start------------->8---
> ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
> ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
> ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
> in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
> found
> ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
> defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
> not found
> --8<---------------cut here---------------end--------------->8---
>
> Are you able to reproduce this on your machine?
>
>> diff --git a/lisp/emacs-lisp/loaddefs-gen.el
>> b/lisp/emacs-lisp/loaddefs-gen.el
>> index 04bea4723a2..e8093200bec 100644
>> --- a/lisp/emacs-lisp/loaddefs-gen.el
>> +++ b/lisp/emacs-lisp/loaddefs-gen.el
>> @@ -378,6 +378,7 @@ loaddefs-generate--parse-file
>> (let ((defs nil)
>> (load-name (loaddefs-generate--file-load-name file main-outfile))
>> (compute-prefixes t)
>> + read-symbol-shorthands
>> local-outfile inhibit-autoloads)
>> (with-temp-buffer
>> (insert-file-contents file)
>> @@ -399,7 +400,19 @@ loaddefs-generate--parse-file
>> (setq inhibit-autoloads (read (current-buffer)))))
>> (save-excursion
>> (when (re-search-forward "autoload-compute-prefixes: *" nil t)
>> - (setq compute-prefixes (read (current-buffer))))))
>> + (setq compute-prefixes (read (current-buffer)))))
>> + (save-excursion
>> + ;; since we're "open-coding" we have to repeat more
>> + ;; complicated logic in `hack-local-variables'.
>> + (when (re-search-forward "read-symbol-shorthands: *" nil t)
>> + (let* ((commentless (replace-regexp-in-string
>> + "\n\\s-*;+" ""
>> + (buffer-substring (point) (point-max))))
>> + (unsorted-shorthands (car (read-from-string
>> commentless))))
>> + (setq read-symbol-shorthands
>> + (sort unsorted-shorthands
>> + (lambda (sh1 sh2)
>> + (> (length (car sh1)) (length (car sh2))))))))))
>
> IIUC, the intention here is to jump to a final "Local Variables"
> declaration at the end of the file, then remove ";;", then read in the
> uncommented value of `read-symbol-shorthands'.
>
> Since `read-from-string' just reads one expression, the above hunk works
> when there are more local variables after read-symbol-shorthands:
>
> ;; Local Variables:
> ;; read-symbol-shorthands: (("bc-" . "breadcrumb-"))
> ;; autoload-compute-prefixes: nil
> ;; End:
>
> But if the read-symbol-shorthands declaration comes at the top, as in...
>
> -*- read-symbol-shorthands: (("bc-" . "breadcrumb-")); -*-
>
> ...then this form will allocate two strings almost as long as the file.
>
> Here's an alternative hack attempting to uncomment and read the minimum:
>
> diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
> index e8093200bec..406e4b28f1f 100644
> --- a/lisp/emacs-lisp/loaddefs-gen.el
> +++ b/lisp/emacs-lisp/loaddefs-gen.el
> @@ -404,10 +404,13 @@ don't include."
> (save-excursion
> ;; since we're "open-coding" we have to repeat more
> ;; complicated logic in `hack-local-variables'.
> - (when (re-search-forward "read-symbol-shorthands: *" nil t)
> - (let* ((commentless (replace-regexp-in-string
> + (when-let ((beg
> + (re-search-forward "read-symbol-shorthands: *" nil t)))
> + ;; `read-symbol-shorthands' alist ends with two parens.
> + (let* ((end (re-search-forward ")[;\n\s]*)"))
> + (commentless (replace-regexp-in-string
> "\n\\s-*;+" ""
> - (buffer-substring (point) (point-max))))
> + (buffer-substring beg end)))
> (unsorted-shorthands (car (read-from-string
> commentless))))
> (setq read-symbol-shorthands
> (sort unsorted-shorthands
>
>> ;; We always return the package version (even for pre-dumped
>> ;; files).
>> @@ -486,7 +499,11 @@ loaddefs-generate--compute-prefixes
>> (while (re-search-forward
>> "^(\\(def[^ \t\n]+\\)[ \t\n]+['(]*\\([^' ()\"\n]+\\)[\n \t]"
>> nil t)
>> (unless (member (match-string 1) autoload-ignored-definitions)
>> - (let ((name (match-string-no-properties 2)))
>> + (let* ((name (match-string-no-properties 2))
>> + ;; Consider `read-symbol-shorthands'.
>> + (probe (let ((obarray (obarray-make)))
>> + (car (read-from-string name)))))
>> + (setq name (symbol-name probe))
>> (when (save-excursion
>> (goto-char (match-beginning 0))
>> (or (bobp)
>> diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
>> index b0665a55695..69b562e3c7e 100644
>> --- a/lisp/emacs-lisp/shorthands.el
>> +++ b/lisp/emacs-lisp/shorthands.el
>> @@ -52,38 +52,26 @@ elisp-shorthand-font-lock-face
>> :version "28.1"
>> :group 'font-lock-faces)
>>
>> -(defun shorthands--mismatch-from-end (str1 str2)
>> - "Tell index of first mismatch in STR1 and STR2, from end.
>> -The index is a valid 0-based index on STR1. Returns nil if STR1
>> -equals STR2. Return 0 if STR1 is a suffix of STR2."
>> - (cl-loop with l1 = (length str1) with l2 = (length str2)
>> - for i from 1
>> - for i1 = (- l1 i) for i2 = (- l2 i)
>> - while (eq (aref str1 i1) (aref str2 i2))
>> - if (zerop i2) return (if (zerop i1) nil i1)
>> - if (zerop i1) return 0
>> - finally (return i1)))
>> -
>> (defun shorthands-font-lock-shorthands (limit)
>> + "Font lock until LIMIT considering `read-symbol-shorthands'."
>> (when read-symbol-shorthands
>> (while (re-search-forward
>> (concat "\\_<\\(" (rx lisp-mode-symbol) "\\)\\_>")
>> limit t)
>> (let* ((existing (get-text-property (match-beginning 1) 'face))
>> + (print-name (match-string 1))
>> (probe (and (not (memq existing '(font-lock-comment-face
>> font-lock-string-face)))
>> - (intern-soft (match-string 1))))
>> - (sname (and probe (symbol-name probe)))
>> - (mismatch (and sname (shorthands--mismatch-from-end
>> - (match-string 1) sname)))
>> - (guess (and mismatch (1+ mismatch))))
>> - (when guess
>> - (when (and (< guess (1- (length (match-string 1))))
>> - ;; In bug#67390 we allow other separators
>> - (eq (char-syntax (aref (match-string 1) guess)) ?_))
>> - (setq guess (1+ guess)))
>> + (intern-soft print-name)))
>> + (symbol-name (and probe (symbol-name probe)))
>> + (prefix (and symbol-name
>> + (not (string-equal print-name symbol-name))
>> + (car (assoc print-name
>> + read-symbol-shorthands
>> + #'string-prefix-p)))))
>> + (when prefix
>> (add-face-text-property (match-beginning 1)
>> - (+ (match-beginning 1) guess)
>> + (+ (match-beginning 1) (length prefix))
>> 'elisp-shorthand-font-lock-face))))))
>
> Works well. let-binding `symbol-name' and `print-name' is good improvement.
>
>> (font-lock-add-keywords 'emacs-lisp-mode
>> '((shorthands-font-lock-shorthands)) t)
>> diff --git a/lisp/files.el b/lisp/files.el
>> index 1cdcec23b11..b266d0727ec 100644
>> --- a/lisp/files.el
>> +++ b/lisp/files.el
>> @@ -3735,7 +3735,8 @@ before-hack-local-variables-hook
>> This hook is called only if there is at least one file-local
>> variable to set.")
>>
>> -(defvar permanently-enabled-local-variables '(lexical-binding)
>> +(defvar permanently-enabled-local-variables
>> + '(lexical-binding read-symbol-shorthands)
>> "A list of file-local variables that are always enabled.
>> This overrides any `enable-local-variables' setting.")
>>
>> @@ -4171,6 +4172,13 @@ hack-local-variables--find-variables
>> ;; to use 'thisbuf's name in the
>> ;; warning message.
>> (or (buffer-file-name thisbuf) ""))))))
>> + ((eq var 'read-symbol-shorthands)
>> + ;; Sort automatically by shorthand length
>> + ;; descending
>> + (setq val (sort val
>> + (lambda (sh1 sh2) (>
>> (length (car sh1))
>> +
>> (length (car sh2))))))
>> + (push (cons 'read-symbol-shorthands val) result))
>> ((and (eq var 'mode) handle-mode))
>> (t
>> (ignore-errors
>
> Good catch. I agree that longer shorthands should be applied first.
>
> -----
>
> A couple typo nits on the commit message of "Improve
> shorthands-font-lock-shorthands (bug#67390)":
>
> - h//thingy ; hilits "//" reads to 'hyperdrive--thingy'
> + h//thingy ; hilits "h//" reads to 'hyperdrive--thingy'
>
> - Co-authored-by: João Távora <joaotavora@gmail.com>
> + Co-authored-by: Joseph Turner <joseph@breatheoutbreathe.in>
>
>
> Thank you!
>
> Joseph
Ping!
- bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator,
Joseph Turner <=