bug-gnu-emacs


From: Mattias Engdegård
Subject: bug#65051: internal_equal manipulates symbols with position without checking symbols-with-pos-enabled.
Date: Sun, 6 Aug 2023 15:37:24 +0200

On 5 Aug 2023, at 23:07, Alan Mackenzie <acm@muc.de> wrote:

> diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
> index 34db0caf3a8..a828d303c04 100644
> --- a/doc/lispref/symbols.texi
> +++ b/doc/lispref/symbols.texi
> @@ -784,9 +784,15 @@ Symbols with Position
> @cindex bare symbol
> A @dfn{symbol with position} is a symbol, the @dfn{bare symbol},
> together with an unsigned integer called the @dfn{position}.  These
> -objects are intended for use by the byte compiler, which records in
> -them the position of each symbol occurrence and uses those positions
> -in warning and error messages.
> +objects are stored internally much like vectors

Not sure why we want to say how they are stored here. They can be stored in 
bubble memory for all the user cares.

> , and don't themselves
> +have entries in the obarray (though their bare symbols do;
> +@pxref{Creating Symbols}).
> +
> +Symbols with position are for the use of the byte compiler, which
> +records in them the position of each symbol occurrence and uses those
> +positions in warning and error messages.  They shouldn't normally be
> +used otherwise.  Doing so can cause unexpected results with basic
> +Emacs functions such as @code{eq} and @code{equal}.
> 
> The printed representation of a symbol with position uses the hash
> notation outlined in @ref{Printed Representation}.  It looks like
> @@ -798,11 +804,20 @@ Symbols with Position
> 
> For most purposes, when the flag variable
> @code{symbols-with-pos-enabled} is non-@code{nil}, symbols with
> -positions behave just as bare symbols do.  For example, @samp{(eq
> -#<symbol foo at 12345> foo)} has a value @code{t} when that variable
> -is set (but @code{nil} when it isn't set).  Most of the time in Emacs this
> -variable is @code{nil}, but the byte compiler binds it to @code{t}
> -when it runs.
> +positions behave just as their bare symbols would.  For example,
> +@samp{(eq #<symbol foo at 12345> foo)} has a value @code{t} when the
> +variable is set; likewise, @code{equal} will treat a symbol with
> +position argument as its bare symbol.
> +
> +When @code{symbols-with-pos-enabled} is @code{nil}, any symbols with
> +position continue to exist, but do not behave as symbols, or have the
> +other useful properties outlined in the previous paragraph.  @code{eq}
> +returns @code{t} when given identical arguments, and @code{equal}
> +returns @code{t} when given arguments with @code{equal} components.

Since the components are bare symbols and fixnums, equality and identity for 
them are equivalent, right?

> +
> +Most of the time in Emacs @code{symbols-with-pos-enabled} is
> +@code{nil}, but the byte compiler and the native compiler bind it to
> +@code{t} when they run.
> 
> Typically, symbols with position are created by the byte compiler
> calling the reader function @code{read-positioning-symbols}
> @@ -820,7 +835,7 @@ Symbols with Position
> a symbol with position, ignoring the position.
> @end defvar
> 
> -@defun symbol-with-pos-p symbol.
> +@defun symbol-with-pos-p symbol
> This function returns @code{t} if @var{symbol} is a symbol with
> position, @code{nil} otherwise.
> @end defun
> diff --git a/src/fns.c b/src/fns.c
> index bfd19e8c8f2..d47098c8791 100644
> --- a/src/fns.c
> +++ b/src/fns.c
> @@ -2773,10 +2773,13 @@ internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
> 
>   /* A symbol with position compares the contained symbol, and is
>      `equal' to the corresponding ordinary symbol.  */
> -  if (SYMBOL_WITH_POS_P (o1))
> -    o1 = SYMBOL_WITH_POS_SYM (o1);
> -  if (SYMBOL_WITH_POS_P (o2))
> -    o2 = SYMBOL_WITH_POS_SYM (o2);
> +  if (symbols_with_pos_enabled)
> +    {
> +      if (SYMBOL_WITH_POS_P (o1))
> +     o1 = SYMBOL_WITH_POS_SYM (o1);
> +      if (SYMBOL_WITH_POS_P (o2))
> +     o2 = SYMBOL_WITH_POS_SYM (o2);
> +    }

OK. This reduces the number of branches in the hot path for ordinary 
(non-sympos) code by one while adding one to sym-pos code, and that should be a 
fair trade-off. The new branch should be well-predicted but is still consuming 
resources.
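
To make the trade-off concrete, here is a rough, self-contained model of the patched fast path. Everything prefixed `toy_` is a hypothetical stand-in invented for this illustration, not an Emacs definition, and the model stops at the identity check that BASE_EQ performs in the real function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; none of these are Emacs
   definitions.  */
struct toy_obj
{
  bool is_sympos;             /* plays the role of SYMBOL_WITH_POS_P */
  const struct toy_obj *sym;  /* the bare symbol when is_sympos is true */
  long pos;                   /* the position when is_sympos is true */
};

/* Plays the role of symbols_with_pos_enabled.  */
static bool toy_symbols_with_pos_enabled = false;

/* Models only the top of the patched function: optional unwrapping,
   then the identity test.  */
static bool
toy_fast_path_eq (const struct toy_obj *o1, const struct toy_obj *o2)
{
  assert (o1 && o2);
  if (toy_symbols_with_pos_enabled)  /* the one test ordinary code pays */
    {
      if (o1->is_sympos)
        o1 = o1->sym;
      if (o2->is_sympos)
        o2 = o2->sym;
    }
  return o1 == o2;  /* stands in for BASE_EQ */
}
```

With the flag off, ordinary code executes one test on the flag instead of two type tests; with it on, sym-pos code executes all three.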

>   if (BASE_EQ (o1, o2))
>     return true;
> @@ -2824,8 +2827,8 @@ internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
>       if (ASIZE (o2) != size)
>         return false;
> 
> -     /* Compare bignums, overlays, markers, and boolvectors
> -        specially, by comparing their values.  */
> +     /* Compare bignums, overlays, markers, boolvectors, and
> +        symbols with position specially, by comparing their values.  */
>       if (BIGNUMP (o1))
>         return mpz_cmp (*xbignum_val (o1), *xbignum_val (o2)) == 0;
>       if (OVERLAYP (o1))
> @@ -2857,6 +2860,13 @@ internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
>       if (TS_NODEP (o1))
>         return treesit_node_eq (o1, o2);
> #endif
> +     if (SYMBOL_WITH_POS_P(o1)) /* symbols_with_pos_enabled is false.  */
> +       return (internal_equal (XSYMBOL_WITH_POS (o1)->sym,
> +                               XSYMBOL_WITH_POS (o2)->sym,
> +                               equal_kind, depth + 1, ht)
> +               && internal_equal (XSYMBOL_WITH_POS (o1)->pos,
> +                                  XSYMBOL_WITH_POS (o2)->pos,
> +                                  equal_kind, depth + 1, ht));

Why recurse here if the components are a bare symbol and a fixnum, respectively?
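
What I have in mind is a plain word comparison of the two fields. Below is a minimal sketch of that idea as a self-contained toy model. It assumes, as the question does, that `sym` always holds a bare symbol and `pos` always holds a fixnum; `toy_word`, `toy_sympos`, `toy_base_eq` and `toy_sympos_equal` are hypothetical names made up for the illustration, not Emacs definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a Lisp_Object word; not the Emacs type.  */
typedef uintptr_t toy_word;

/* Toy stand-in for the two payload fields of a symbol with position.  */
struct toy_sympos
{
  toy_word sym;  /* assumed to always be a bare symbol */
  toy_word pos;  /* assumed to always be a fixnum */
};

/* Stands in for BASE_EQ: compares the two words directly.  */
static bool
toy_base_eq (toy_word a, toy_word b)
{
  return a == b;
}

/* Field-by-field comparison with no recursion: under the stated
   assumption, word identity and structural equality coincide for
   both fields.  */
static bool
toy_sympos_equal (const struct toy_sympos *a, const struct toy_sympos *b)
{
  assert (a && b);
  return toy_base_eq (a->sym, b->sym) && toy_base_eq (a->pos, b->pos);
}
```

If either field could ever hold something other than a bare symbol or a fixnum, the assumption fails and the recursive call would be needed after all.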

>       /* Aside from them, only true vectors, char-tables, compiled
>          functions, and fonts (font-spec, font-entity, font-object)
> diff --git a/test/src/fns-tests.el b/test/src/fns-tests.el
> index 79ae4393f40..9c09e4f0c33 100644
> --- a/test/src/fns-tests.el
> +++ b/test/src/fns-tests.el
> @@ -98,6 +98,26 @@
>   (should-not (equal-including-properties #("a" 0 1 (k "v"))
>                                           #("b" 0 1 (k "v")))))
> 
> +(ert-deftest fns-tests-equal-symbols-with-position ()
> +  "Test `eq' and `equal' on symbols with position."
> +  (let ((foo1 (position-symbol 'foo 42))
> +        (foo2 (position-symbol 'foo 666))
> +        (foo3 (position-symbol 'foo 42)))
> +    (let (symbols-with-pos-enabled)
> +      (should (eq foo1 foo1))

Thank you! There is nothing wrong with the coverage of these tests with respect 
to your changes.

However, we should make an effort to prevent the compiler from optimising
(eq X X) -> t etc., which it is completely entitled to do, and also test both
the interpreted and compiled versions of `eq` and `equal`.

The test bytecomp--eq-symbols-with-pos-enabled already does most of this for a 
different reason. Perhaps it can be extended to cover `equal` as well?





