emacs-devel


From: Dmitry Gutov
Subject: Re: treesitter local parser: huge slowdown and memory usage in a long file
Date: Thu, 23 May 2024 02:42:53 +0300
User-agent: Mozilla Thunderbird

On 22/05/2024 08:51, Yuan Fu wrote:
>> This listener would be specific to a particular consumer. In our case, we'd
>> have a listener which would populate - and then update - the variable used
>> by treesit--pre-redisplay. That variable would store the "up to date" list
>> of updated ranges. The listener, on every call, would "merge" its current
>> value with the new list of ranges (*). treesit--pre-redisplay would use the
>> data in that variable instead of calling treesit-parser-changed-ranges, and
>> set the value to nil to "reset" it for the next update.
>>
>> (*) So real "merging" would only need to be performed when the listener
>> fires 2+ times between two adjacent treesit--pre-redisplay calls. Otherwise
>> the current value is nil, so the new list is simply assigned to the
>> variable. Anyway, the merging logic seems to be the trickiest part in this
>> scheme (managing and interpreting offsets), but it should be very similar
>> in both approaches.
>
> I agree. The usefulness of treesit-parser-changed-ranges isn't really
> justified at this point (well, except that it makes the caller's code much
> easier to follow).
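The listener scheme described above can be sketched roughly as follows. This is an illustration in Python rather than Emacs Lisp, and every name in it (ChangedRangesAccumulator, notify, take, merge_ranges) is hypothetical. It also assumes that all ranges are expressed in the same coordinate system, i.e. it does not attempt the offset adjustment mentioned as the tricky part.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # (start, end), end exclusive


def merge_ranges(old: List[Range], new: List[Range]) -> List[Range]:
    """Union two range lists, coalescing overlapping or touching ranges.

    Assumption: both lists use the same buffer coordinates.
    """
    merged: List[Range] = []
    for start, end in sorted(old + new):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


class ChangedRangesAccumulator:
    """Hypothetical stand-in for the variable filled by the listener."""

    def __init__(self) -> None:
        self.pending: Optional[List[Range]] = None

    def notify(self, ranges: List[Range]) -> None:
        # Called by the parser's listener after each reparse.
        if self.pending is None:
            # Common case: nothing accumulated yet, just assign.
            self.pending = list(ranges)
        else:
            # The listener fired 2+ times before the consumer ran.
            self.pending = merge_ranges(self.pending, list(ranges))

    def take(self) -> List[Range]:
        # Called by the consumer (the pre-redisplay step): read and reset.
        ranges, self.pending = self.pending or [], None
        return ranges
```

The consumer would call take() once per redisplay cycle; merging only happens when notify() runs more than once between two take() calls.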

That it does.

> Let me implement what you described and let's see how it goes.

Thank you, looking forward to it!

> I think we don't even need to merge the ranges (which would be prone to
> bugs if I were to write it 😉); we can just push the new ranges to a list
> and later process them one by one.

I think this might amount to the same thing (merging when generating, or merging when processing). It seems there will also be a small issue of "kinds" of ranges?..

For example, suppose we have two consecutive operations, each of which inserts new characters in the range 200..300. The combined result should be a range that spans 200..400, right?

But if one operation just changes text in that range (keeping its length intact, e.g. capitalizing the whole region), and another does the same (back to lower case), then the combined range would remain 200..300.
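To make the two cases concrete, here is a small Python sketch of how a range list could be carried across an edit when the kind of edit is known. It assumes each edit is available as a hypothetical (pos, old_len, new_len) triple; the function name apply_edit is made up for illustration, and nothing here reflects what tree-sitter itself reports.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (start, end), end exclusive


def apply_edit(ranges: List[Range], pos: int,
               old_len: int, new_len: int) -> List[Range]:
    """Carry previously recorded ranges across one edit and add the
    range touched by that edit.

    The edit replaces old_len characters at pos with new_len characters
    (old_len == 0 is a pure insertion, old_len == new_len an in-place
    change). These parameters are assumptions of this sketch.
    """
    delta = new_len - old_len
    old_end = pos + old_len
    carried: List[Range] = []
    for start, end in ranges:
        if end <= pos:
            carried.append((start, end))                  # before the edit
        elif start >= old_end:
            carried.append((start + delta, end + delta))  # after: shift
        else:
            # Overlaps the edited text: widen to cover it.
            carried.append((min(start, pos), max(end, old_end) + delta))
    carried.append((pos, pos + new_len))

    # Coalesce overlapping or touching ranges.
    merged: List[Range] = []
    for start, end in sorted(carried):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Two insertions of 100 characters at position 200:
inserted = apply_edit(apply_edit([], 200, 0, 100), 200, 0, 100)
# Two same-length replacements of 200..300:
replaced = apply_edit(apply_edit([], 200, 100, 100), 200, 100, 100)
print(inserted, replaced)  # [(200, 400)] [(200, 300)]
```

The difference between the two results comes entirely from delta, which is exactly the information that is missing if only the changed ranges are known.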

Computing that might be difficult without having access to the kinds of changes being made (does tree-sitter report those?). OTOH, most of the time the most important part is the position where the changes begin (e.g. for syntax-ppss), and we could treat the rest of the buffer as invalidated...
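That fallback could look something like the sketch below (Python, all names hypothetical). It keeps only the smallest start position seen so far and reports everything from there to the end of the buffer as invalidated. It assumes that a reported range never starts later than the edit that caused it; otherwise a later deletion could move an earlier start to a smaller position than the one recorded.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # (start, end), end exclusive


class EarliestChangeTracker:
    """Hypothetical tracker that remembers only where changes begin."""

    def __init__(self) -> None:
        self.start: Optional[int] = None

    def notify(self, ranges: List[Range]) -> None:
        # Ignore the range ends; keep the minimum start seen so far.
        for start, _end in ranges:
            if self.start is None or start < self.start:
                self.start = start

    def take(self, buffer_end: int) -> Optional[Range]:
        # Report start..buffer_end as invalidated, then reset.
        if self.start is None:
            return None
        invalidated = (self.start, buffer_end)
        self.start = None
        return invalidated
```

This trades precision for simplicity: no offsets need to be adjusted, at the cost of reprocessing more text than strictly necessary.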


