bug-gnu-emacs
bug#61369: Problem with keeping tree-sitter parse tree up-to-date


From: Yuan Fu
Subject: bug#61369: Problem with keeping tree-sitter parse tree up-to-date
Date: Sat, 18 Feb 2023 02:05:04 -0800


> On Feb 17, 2023, at 5:25 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
> 
> On 18/02/2023 03:14, Yuan Fu wrote:
>>> On Feb 17, 2023, at 4:11 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
>>> 
>>> On 18/02/2023 00:32, Yuan Fu wrote:
>>>> Thank you very much! I thought that clipping the change to the fixed 
>>>> visible range, and relying on treesit_sync_visible_region to add back the 
>>>> clipped “tail” when we extend the visible range, would be equivalent to not 
>>>> clipping, but I guess clipping and re-adding affects how incremental 
>>>> parsing works inside tree-sitter.
>>> 
>>> It seems like the "repairing" sync used a different range, one that didn't 
>>> include the character number 68 inserted from the beginning.
>>> 
>>> It just synced the 1 or 2 characters at the end of the buffer, the 
>>> difference between the computed visible_end and the actual BUF_ZV_BYTE.
>> That should be enough, no? Because the other text didn’t change, it just 
>> moved. And tree-sitter should know that it moved. Or maybe I’m 
>> misunderstanding what you mean.
> 
> But the "unsynced" character is at position 68.
> 
> And we just tell tree-sitter to update positions 134-136. So it stays 
> ignorant of the changed char in the middle of the buffer.
> 
> It's not just about not knowing about the change either (the character in 
> question is a newline, so its absence wouldn't lead to a syntax error), but 
> about wrong offsets in the old parse tree, based on which the new tree is 
> generated. That probably creates a wrong picture of the source text in the 
> parser.

OK, I made a visualization to understand it, and yes, you are right. I’ll 
need to modify the comment a bit.

|visible range|

updated range
-------------

|aaaaaa|
|bbbbbbbbaaaa|aa  start: 0, old_end: 0, new_end: 6
 ------          
|bbbbbbbbaaaaaa|  start: 12, old_end: 12, new_end: 14
             --
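
The offsets in the diagram can be replayed with a small model. This is only a sketch under assumptions: `struct edit`, `clip_new_end` and `shift_offset` are hypothetical names, not Emacs or tree-sitter functions, and the shifting rule is a simplification of what an incremental parser does with offsets in the old tree. The clip limit of 6 is taken from the diagram's `new_end: 6`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical edit record holding the three byte offsets from the
   diagram.  Not the real tree-sitter or Emacs type.  */
struct edit
{
  size_t start;
  size_t old_end;
  size_t new_end;
};

/* Clip NEW_END to LIMIT, modelling the clipping behaviour under
   discussion.  */
static struct edit
clip_new_end (struct edit e, size_t limit)
{
  if (e.new_end > limit)
    e.new_end = limit;
  return e;
}

/* Simplified model of where an offset from the old tree ends up after
   edit E: offsets at or after OLD_END shift by the change in length,
   earlier offsets are returned unchanged.  Offsets strictly inside a
   replaced region are not modelled.  */
static size_t
shift_offset (size_t pos, struct edit e)
{
  assert (e.start <= e.old_end && e.start <= e.new_end);
  if (pos >= e.old_end)
    return pos - e.old_end + e.new_end;
  return pos;
}
```

With the real edit `{0, 0, 8}` (eight b's inserted at the start), the first `a` moves from offset 0 to 8. With the clipped edit `{0, 0, 6}` followed by the tail sync `{12, 12, 14}`, this model leaves it at 6: the total length comes out right, but every old offset is two bytes short, which is the wrong-offset effect described above.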


> 
>>>> I don’t think this change would have any adverse effect, because if you 
>>>> think of it, inserting text in a narrowed region always extends the 
>>>> region, rather than pushing text at the end out of the narrowed region. So 
>>>> the right thing to do here is in fact not clipping new_end_offset.
>>> 
>>> I figured it could be a problem if both old_end_byte and new_end_byte 
>>> extend past the current restriction.
>> That should be fine (i.e., technically correct), since when we widen, the 
>> clipped text is reparsed by tree-sitter as new text.
> 
> I guess the effect I was thinking of is that
> 
>  XTS_PARSER (lisp_parser)->visible_end
> 
> would end up with a higher value than BUF_ZV_BYTE. Not sure if it's a problem.

It shouldn’t be, since BUF_ZV_BYTE should automatically grow when the user 
inserts text. Even if visible_end does end up higher, we always call 
treesit_sync_visible_region to sync visible_beg/end with 
BUF_BEGV_BYTE/BUF_ZV_BYTE before parsing, so it shouldn’t be a problem.
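
To illustrate the kind of sync meant here, below is a minimal sketch that handles only the end of the visible range. It is not the actual treesit_sync_visible_region: `struct parser_state`, `struct edit` and `sync_visible_end` are hypothetical names, and the real function also has to handle visible_beg.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical edit record, in bytes relative to the start of the
   visible range.  */
struct edit
{
  size_t start;
  size_t old_end;
  size_t new_end;
};

/* Hypothetical parser state: the cached visible range in buffer bytes.  */
struct parser_state
{
  size_t visible_beg;
  size_t visible_end;
};

/* Make P->visible_end equal to ZV_BYTE and return the edit the parser
   would need to be told about.  Growing yields an insertion at the old
   end; shrinking yields a deletion of the tail; if nothing changed the
   edit is empty (all three offsets equal).  */
static struct edit
sync_visible_end (struct parser_state *p, size_t zv_byte)
{
  assert (p->visible_beg <= p->visible_end);
  assert (p->visible_beg <= zv_byte);
  size_t old_len = p->visible_end - p->visible_beg;
  size_t new_len = zv_byte - p->visible_beg;
  size_t start = old_len < new_len ? old_len : new_len;
  struct edit e = { start, old_len, new_len };
  p->visible_end = zv_byte;
  return e;
}
```

With visible_end at 12 and the buffer end at 14, this returns start: 12, old_end: 12, new_end: 14, the same tail edit as in the diagram. With visible_end above the buffer end (say 16 against 14), it returns a deletion of the two extra bytes, so a stale higher value is corrected before parsing.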

Yuan



