bug#59574: 29.0.50; Emacs crashes when using tree-sitter-based mode in a
From: Yuan Fu
Subject: bug#59574: 29.0.50; Emacs crashes when using tree-sitter-based mode in an empty buffer
Date: Fri, 25 Nov 2022 19:18:09 -0800
> On Nov 25, 2022, at 7:04 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>
> To reproduce:
>
> emacs -Q
> C-x C-f foo.c RET
> M-x c-ts-mode RET
> Type "in"
Thanks for finding this out!
>
> Make sure foo.c doesn't exist, so you start from an empty buffer. As soon
> as you type the second character of "in", there's an assertion violation:
>
> treesit.c:1383: Emacs fatal error: assertion failed: end_byte <= BUF_ZV_BYTE (buffer)
>
> Thread 1 hit Breakpoint 1, terminate_due_to_signal (sig=22,
> backtrace_limit=2147483647) at emacs.c:427
> 427 signal (sig, SIG_DFL);
> (gdb) up
> #1 0x01230802 in die (
> msg=0x18e6778 <DEFAULT_REHASH_SIZE+3288> "end_byte <= BUF_ZV_BYTE
> (buffer)", file=0x18e5fcc <DEFAULT_REHASH_SIZE+1324> "treesit.c", line=1383)
> at alloc.c:7697
> 7697 terminate_due_to_signal (SIGABRT, INT_MAX);
> (gdb)
> #2 0x01355636 in treesit_make_ranges (ranges=0x856a778, len=1,
> buffer=0x7fe94b0) at treesit.c:1383
> 1383 eassert (end_byte <= BUF_ZV_BYTE (buffer));
> (gdb) p end_byte
> $1 = 4
> (gdb) p BUF_ZV_BYTE(buffer)
> $2 = 3
>
> Interestingly, this only happens once, when the buffer includes exactly 1
> byte and an additional character is inserted. If you get past this
> assertion, further characters can be inserted without any problems, and
> end_byte always equals BUF_ZV_BYTE.
>
> The backtrace is below, if it is interesting.
>
> I couldn't figure out where did tree-sitter take the range it returns to us.
> Yuan, can you describe how does the parser get the range it needs to
> consider? If I put a breakpoint in treesit-parser-set-included-ranges, the
> breakpoint never breaks, so this doesn't seem to be how the range is set in
> this scenario.
After we parse the buffer (in treesit_ensure_parsed), we compute the ranges that
have changed since the last parse by calling ts_tree_get_changed_ranges, and pass
those ranges to the notifier functions (those added by
treesit-parser-add-notifier).
These ranges are different from the ranges within which a parser operates. Those
are set by treesit-parser-set-included-ranges, and are not involved with the
parsing, treesit_record_change, or visible_beg/end stuff.
Both features happen to use treesit_make_ranges as a helper function, but the
similarity ends there.
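To illustrate the distinction, here is a minimal sketch of the notifier side only. It is a toy model, not the code in treesit.c: the names notifier_parser, add_notifier, notify_changed and count_changed_bytes are all hypothetical, and the changed ranges are handed in by the caller instead of coming from ts_tree_get_changed_ranges.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; none of these names exist in Emacs or tree-sitter.  */
struct byte_range { uint32_t start_byte; uint32_t end_byte; };

typedef void (*notifier_fn) (const struct byte_range *ranges, size_t len,
                             void *data);

#define MAX_NOTIFIERS 8

struct notifier_parser
{
  /* Ranges the parser is allowed to read.  Set explicitly by the user
     and never touched by the notification path below.  */
  struct byte_range included[4];
  size_t n_included;
  /* Callbacks that are told what changed after each reparse.  */
  notifier_fn notifiers[MAX_NOTIFIERS];
  void *notifier_data[MAX_NOTIFIERS];
  size_t n_notifiers;
};

/* Register FN to be called with DATA after each reparse.
   Return 0 on success, -1 if the table is full.  */
static int
add_notifier (struct notifier_parser *p, notifier_fn fn, void *data)
{
  if (p->n_notifiers >= MAX_NOTIFIERS)
    return -1;
  p->notifiers[p->n_notifiers] = fn;
  p->notifier_data[p->n_notifiers] = data;
  p->n_notifiers++;
  return 0;
}

/* After a reparse, hand the ranges that differ from the previous tree
   to every registered notifier.  */
static void
notify_changed (const struct notifier_parser *p,
                const struct byte_range *changed, size_t len)
{
  for (size_t i = 0; i < p->n_notifiers; i++)
    p->notifiers[i] (changed, len, p->notifier_data[i]);
}

/* Example notifier: add the number of changed bytes to *DATA.  */
static void
count_changed_bytes (const struct byte_range *ranges, size_t len, void *data)
{
  size_t *total = data;
  for (size_t i = 0; i < len; i++)
    *total += ranges[i].end_byte - ranges[i].start_byte;
}
```

In this model the included ranges stay empty unless someone sets them, which mirrors why a breakpoint on the included-ranges setter would never fire while notifiers still receive ranges.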
> There's also something strange in treesit_record_change: when it is called
> for the first time in a buffer which was empty and you insert one character,
> we bypass the updating of visible_beg and visible_end fields of the Lisp
> parser object, because XTS_PARSER (lisp_parser)->tree is NULL. But it looks
> to me that we should still update these two fields regardless, no? Only the
> call to treesit_tree_edit_1 needs the tree. (I thought that maybe this lack
> of update explains the assertion, but even if I move the condition to guard
> only treesit_tree_edit_1, the assertion still happens, so I guess my
> hypothesis eats dust.)
We don’t need to update visible_beg/end in treesit_record_change if tree is
NULL, because visible_beg/end represent the range of the buffer that the tree
sees; if there is no tree, visible_beg/end can be considered uninitialized.
However, you are right that visible_beg/end need updating, just in a different
place: treesit_ensure_position_synced (which I renamed to
treesit_sync_visible_region). That’s where we ensure visible_beg/end equal
BUF_BEGV_BYTE and friends.
The problem is that we don’t update visible_beg/end for the very first parse,
when tree is NULL.
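The idea can be sketched with a simplified model. This is not the actual patch: buffer_model, parser_model and the functions below are hypothetical stand-ins, and the real sync function presumably has more to do when a tree already exists. The sketch only shows that syncing before the first parse keeps visible_beg/end in step with the buffer afterwards.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real structures in treesit.c differ.  */
struct buffer_model { ptrdiff_t begv_byte; ptrdiff_t zv_byte; };
struct parser_model
{
  bool has_tree;          /* models "tree != NULL" */
  ptrdiff_t visible_beg;  /* start of the range the tree sees */
  ptrdiff_t visible_end;  /* end of the range the tree sees */
};

/* Make the parser's view equal to the buffer's accessible region.
   Simplified: this model ignores any work needed on an existing tree.  */
static void
sync_visible_region (struct parser_model *p, const struct buffer_model *b)
{
  p->visible_beg = b->begv_byte;
  p->visible_end = b->zv_byte;
}

/* Record an edit that added DELTA_BYTES bytes at the end of the
   visible region.  With no tree there is nothing to keep in sync.  */
static void
record_change (struct parser_model *p, ptrdiff_t delta_bytes)
{
  if (!p->has_tree)
    return;
  p->visible_end += delta_bytes;
}

/* Parse the buffer.  The sync runs unconditionally, so it also covers
   the very first parse, when there is no tree yet.  */
static void
ensure_parsed (struct parser_model *p, const struct buffer_model *b)
{
  sync_visible_region (p, b);
  p->has_tree = true;
}

/* True if the parser's view matches the buffer's accessible region.  */
static bool
view_matches_buffer (const struct parser_model *p,
                     const struct buffer_model *b)
{
  return p->visible_beg == b->begv_byte && p->visible_end == b->zv_byte;
}
```

Starting from an empty buffer (begv_byte and zv_byte both 1), inserting one byte, parsing, and then inserting a second byte leaves the model's view matching the buffer. If the first parse instead left visible_end at its stale value, the second insertion would put it one byte out of step with zv_byte.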
I also added some comments, hopefully they sufficiently explain everything.
Yuan