bug#65997: 29.1; ?\N{char_name} reference is wrong

bug-gnu-emacs

From:	Robert Pluim
Subject:	bug#65997: 29.1; ?\N{char_name} reference is wrong
Date:	Fri, 15 Sep 2023 17:33:41 +0200

>>>>> On Fri, 15 Sep 2023 22:02:37 +0900, awrhygty@outlook.com said:

    awrhygty> S-exps in the form of ?\N{char_name} return wrong values for some
    awrhygty> characters.
    awrhygty> The S-exp below inserts a whole list of such characters.

    awrhygty> (dotimes (u (1+ (max-char 'ucs)))
    awrhygty>   (let* ((name (get-char-code-property u 'name)))
    awrhygty>     (when (and name (not (<= #xD800 u #xDFFF)))
    awrhygty>       (let ((u2 (condition-case err
    awrhygty>                     (read (format "?\\N{%s}" name))
    awrhygty>                   (error 0))))
    awrhygty>         (unless (eq u u2)
    awrhygty>           (insert (format "%X\t%s\t%X\t%s\n" u name u2
    awrhygty>                           (if (= 0 u2)
    awrhygty>                               "error"
    awrhygty>                             (get-char-code-property u2 'name)))))))))

For a minute there I thought our hash tables were broken :-). Stefan,
it only took 9 years, but this is no longer true:

lisp/international/mule-cmds.el:

                ;; In theory this code could end up pushing an "old-name" that
                ;; shadows a "new-name" but in practice every time an
                ;; `old-name' conflicts with a `new-name', the newer one has a
                ;; higher code, so it gets pushed later!

The patch below fixes that issue.
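
As a minimal sketch of the shadowing (made-up names and codes, and a
hypothetical helper `demo-build-names', not the real `ucs-names' loop),
assuming entries are visited in ascending code order:

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical miniature of the name table build; not the real code.
(defun demo-build-names (entries guard)
  "Build a name->code hash table from ENTRIES.
ENTRIES is a list of (CODE NEW-NAME OLD-NAME) in ascending code order.
With GUARD non-nil, an old-name never overwrites an existing entry."
  (let ((names (make-hash-table :test #'equal)))
    (dolist (entry entries names)
      (let ((c (nth 0 entry))
            (new-name (nth 1 entry))
            (old-name (nth 2 entry)))
        (if new-name (puthash new-name c names))
        (when (and old-name
                   (or (not guard)
                       (not (gethash old-name names))))
          (puthash old-name c names))))))

;; Made-up data: the character at #x20 has an old-name equal to the
;; new-name of the lower character at #x10.
(defvar demo-entries '((#x10 "ALPHA" nil)
                       (#x20 "BETA" "ALPHA")))

(gethash "ALPHA" (demo-build-names demo-entries nil)) ; => 32, shadowed
(gethash "ALPHA" (demo-build-names demo-entries t))   ; => 16, guarded
```

With the guard, a later `old-name' can no longer replace an entry that
is already in the table, while a `new-name' still overwrites an earlier
`old-name' as before.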

    awrhygty> output(TANGUT COMPONENTs are omitted):

I donʼt know why the ranges in `ucs-names' donʼt cover these
code-points. Itʼs easy enough to change them, but theyʼre
explicitly commented out.

    awrhygty> 16FE4     KHITAN SMALL SCRIPT FILLER      0       error
    awrhygty> 16FF0     VIETNAMESE ALTERNATE READING MARK CA    0       error
    awrhygty> 16FF1     VIETNAMESE ALTERNATE READING MARK NHAY  0       error
    awrhygty> 1B132     HIRAGANA LETTER SMALL KO        0       error

And similarly for these 4.
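
For checking a single code point without scanning the whole range,
something like the following may help. It is only a sketch:
`demo-name-round-trips-p' is a hypothetical helper, and it assumes
`char-from-name' consults the same table the reader uses for ?\N{...}.

```elisp
;; Hypothetical helper, for illustration only.
(defun demo-name-round-trips-p (code)
  "Return non-nil if CODE has a name that `char-from-name' maps back to CODE."
  (let ((name (get-char-code-property code 'name)))
    (and name (eq code (char-from-name name)))))

;; The result depends on the Emacs version and its `ucs-names' ranges.
(demo-name-round-trips-p #x1B132)  ; HIRAGANA LETTER SMALL KO
```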

Robert
-- 

diff --git a/lisp/international/mule-cmds.el b/lisp/international/mule-cmds.el
index c26898f7649..254ecae5bd5 100644
--- a/lisp/international/mule-cmds.el
+++ b/lisp/international/mule-cmds.el
@@ -3135,7 +3135,9 @@ ucs-names
                ;; `old-name' conflicts with a `new-name', the newer one has a
                ;; higher code, so it gets pushed later!
                (if new-name (puthash new-name c names))
-               (if old-name (puthash old-name c names))
+                (when (and old-name
+                           (not (gethash old-name names)))
+                  (puthash old-name c names))
                 ;; Unicode uses the spelling "lamda" in character
                 ;; names, instead of "lambda", due to "preferences
                 ;; expressed by the Greek National Body" (Bug#30513).
