bug-gnu-emacs
bug#36447: 27.0.50; New "Unknown keyword" errors


From: Pip Cet
Subject: bug#36447: 27.0.50; New "Unknown keyword" errors
Date: Fri, 5 Jul 2019 09:09:13 +0000

On Fri, Jul 5, 2019 at 8:41 AM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Fri, 5 Jul 2019 08:36:57 +0000
> > Cc: michael_heerdegen@web.de, npostavs@gmail.com, 36447@debbugs.gnu.org
> >
> > > > I don't think we can sensibly add tests for this bug, but the fix I
> > > > posted earlier still seems valid to me.
> > >
> > > Sorry, I'm not tracking this part of the discussion, as it lost me
> > > long ago.
> >
> > What's the best way of getting this fixed?
>
> Sorry, I don't think I know what "this bug" is about,

The bug:
Building Emacs with "-O0 -g3 -ggdb" on current Linux produces
binaries that, depending on the precise compiler version used,
sometimes fail weirdly when you evaluate this emacs -Q recipe line by
line:

(custom-handle-keyword nil :group nil nil)
(y-or-n-p "prompt")
(custom-handle-keyword nil :group nil nil)

The error produced will be "Unknown keyword :group", which is
nonsensical, since :group is indeed a valid keyword.


The analysis:
It's not the byte code, which is fine and looks like this:

byte code for custom-handle-keyword:
  doc:  For customization option SYMBOL, handle KEYWORD with VALUE. ...
  args: (arg1 arg2 arg3 arg4)
0    varref      purify-flag
1    goto-if-nil 1
4    constant  purecopy
5    stack-ref 2
6    call      1
7    stack-set 2
9:1    stack-ref 2
10    constant  <jump-table-eq (:group 2 :version 3 :package-version 4
:link 5 :load 6 :tag 7 :set-after 8)>
11    switch
12    goto      9
...
52:9    constant  error
53    constant  "Unknown keyword %s"
54    stack-ref 4
55    call      2
56    return

Note that the code uses a jump table, which is a hash table mapping
keys to integers for the "switch" op.
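
To illustrate why a failed lookup ends in the error, here is a rough C
model of what the "switch" op does with that table. It is only a
sketch: the names jump_entry, jump_table and switch_target are made
up, strcmp stands in for the `eq' hash lookup, and label 9 is the
"goto 9" fall-through from the disassembly above.

```c
#include <string.h>

/* Hypothetical model of the jump table constant: keyword -> label.  */
struct jump_entry { const char *key; int label; };

static const struct jump_entry jump_table[] = {
  { ":group", 2 }, { ":version", 3 }, { ":package-version", 4 },
  { ":link", 5 }, { ":load", 6 }, { ":tag", 7 }, { ":set-after", 8 },
};

/* Label reached by the "goto 9" that follows the switch op.  */
enum { DEFAULT_LABEL = 9 };

/* Return the label to jump to for KEYWORD.  A miss falls through to
   DEFAULT_LABEL, which is the "Unknown keyword %s" error path.  */
static int
switch_target (const char *keyword)
{
  for (size_t i = 0; i < sizeof jump_table / sizeof jump_table[0]; i++)
    if (strcmp (jump_table[i].key, keyword) == 0)
      return jump_table[i].label;
  return DEFAULT_LABEL;
}
```

So if the lookup for :group misses, for whatever reason, control
lands on label 9 and the error is signalled.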

This is where hash tables come in.

We can inspect the hash table `custom-handle-keyword' uses by evaluating

(aref (aref (symbol-function #'custom-handle-keyword) 2) 2)

The hash table prints fine.

But investigating its C in-memory representation, we find that the
hash collision chains, stored in the ->next vector, are corrupted.

It turns out that this is because hash_table_rehash was called on a
different hash table which had the same ->next vector, but different
contents.

That's the problem I fixed: for reasons explained below, we sometimes
see two hash tables with the same ->next vector, then try to rehash
both of them, obtaining different results. Last caller wins, first
caller gets the corruption (each hash table is rehashed at most once).
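
A toy C model of that failure, in case it helps. This is not the
Emacs code: toy_table, toy_rehash and toy_lookup are invented names,
the hash function is just "key modulo 2", and the layout (an index
array of chain heads plus a next array of chain links) is a
simplification.

```c
#define NENTRIES 4
#define NBUCKETS 2

/* index[b] is the first entry in bucket b; next[i] is the entry after
   entry i in its chain; -1 ends a chain.  next is held by pointer so
   that two tables can end up sharing one array.  */
struct toy_table
{
  int keys[NENTRIES];
  int index[NBUCKETS];
  int *next;
};

static int
toy_hash (int key)
{
  return key % NBUCKETS;
}

/* Rebuild the chains from the keys, overwriting index and next.  */
static void
toy_rehash (struct toy_table *t)
{
  for (int b = 0; b < NBUCKETS; b++)
    t->index[b] = -1;
  for (int i = 0; i < NENTRIES; i++)
    {
      int b = toy_hash (t->keys[i]);
      t->next[i] = t->index[b];
      t->index[b] = i;
    }
}

/* Return the entry number holding KEY, or -1 if the chain walk
   does not find it.  */
static int
toy_lookup (const struct toy_table *t, int key)
{
  for (int i = t->index[toy_hash (key)]; i >= 0; i = t->next[i])
    if (t->keys[i] == key)
      return i;
  return -1;
}

/* Rehash table a, then table b, then look up key 10 in a.  */
static int
lookup_after_both_rehashes (int share_next)
{
  int next_a[NENTRIES], next_b[NENTRIES];
  struct toy_table a = { {10, 11, 12, 13}, {-1, -1}, next_a };
  struct toy_table b = { {20, 22, 21, 23}, {-1, -1},
                         share_next ? next_a : next_b };
  toy_rehash (&a);
  toy_rehash (&b);   /* last caller wins */
  return toy_lookup (&a, 10);
}
```

With separate next arrays the lookup finds key 10 at entry 0. With a
shared array, b's rehash overwrites the links a depends on, and the
same lookup in a returns -1 even though the key is still there.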

The reason is this: when a hash table is purecopied, its ->next
vector is purecopied too. If purify-flag is a (third) hash table,
that merges the vector with the ->next vector of any other, similar,
hash table: the originals only need to be `equal' to each other, but
the pure copies that come back are actually `eq'.
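
As a sketch of that merging, assuming nothing about the real
purecopy implementation beyond what is described above: toy_purecopy
and pure_storage are invented, and a linear scan with memcmp stands
in for the `equal' hash table lookup.

```c
#include <string.h>

#define VEC_LEN 4
#define MAX_PURE 8

/* Stand-in for pure storage: a pool of already-copied vectors.  */
static int pure_storage[MAX_PURE][VEC_LEN];
static int pure_count;

/* Return a pure copy of VEC.  If a vector with the same contents was
   copied before, return that earlier copy instead of making a new
   one.  Returns NULL when the pool is full.  */
static int *
toy_purecopy (const int *vec)
{
  for (int i = 0; i < pure_count; i++)
    if (memcmp (pure_storage[i], vec, sizeof pure_storage[i]) == 0)
      return pure_storage[i];
  if (pure_count == MAX_PURE)
    return NULL;
  memcpy (pure_storage[pure_count], vec, sizeof pure_storage[0]);
  return pure_storage[pure_count++];
}
```

Two distinct arrays with the same contents come back as the same
pointer, which is harmless only as long as nobody writes to the
result.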

This worked fine with the old dumper, because we never modified pure
storage. However, with the current pdumper code, we have to do that.

The (disappointingly trivial) fix:
Call copy-sequence on h->next before rehashing the table. This will
make h->next impure, which is good since we're going to modify it.
While we're there, do the same for the other vectors used in the hash
table representation, except for h->key_and_value, which we need not
touch.
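
The idea, reduced to a toy in C (the attached patch is the real
thing; toy_table, toy_unshare_next and toy_fix_demo are hypothetical
names, and malloc plus memcpy stands in for copy-sequence):

```c
#include <stdlib.h>
#include <string.h>

struct toy_table
{
  size_t size;
  int *next;   /* may point into shared, read-only-by-convention storage */
};

/* Give T a private copy of its next vector before anything writes to
   it.  Returns 0 on success, -1 if the allocation fails.  The old
   vector is left alone, since other tables may still point at it.  */
static int
toy_unshare_next (struct toy_table *t)
{
  int *copy = malloc (t->size * sizeof *copy);
  if (!copy)
    return -1;
  memcpy (copy, t->next, t->size * sizeof *copy);
  t->next = copy;
  return 0;
}

/* Two tables start out sharing one next vector.  After unsharing a,
   a write through a.next must not be visible through b.next.
   Returns 1 if b is untouched, 0 if not, -1 on allocation failure.  */
static int
toy_fix_demo (void)
{
  int shared[] = { -1, -1, 0, 1 };
  struct toy_table a = { 4, shared }, b = { 4, shared };
  if (toy_unshare_next (&a) != 0)
    return -1;
  a.next[0] = 3;   /* what a rehash of a would do */
  int b_untouched = b.next[0] == -1;
  free (a.next);
  return b_untouched;
}
```

The copy costs one allocation per rehashed table, and each table is
rehashed at most once.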

> and how is the issue with hash tables relevant.

The bytecode is executing incorrectly because it relies on a
purecopied hash table, which is effectively part of the compiled
function. The hash table has become corrupted.

Attachment: 0002-Don-t-alter-shared-structure-in-dumped-purecopied-ha.patch
Description: Text Data

