emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Reliable after-change-functions (via: Using incremental parsing in E


From: Eli Zaretskii
Subject: Re: Reliable after-change-functions (via: Using incremental parsing in Emacs)
Date: Fri, 03 Apr 2020 10:43:53 +0300

> From: Stephen Leake <address@hidden>
> Date: Thu, 02 Apr 2020 18:27:59 -0800
> 
> > Such copying is not really scalable, and IMO should be avoided.
> > During active editing, redisplay runs very frequently, and having to
> > copy portions of the buffer, let alone all of it, each time, which
> > necessarily requires memory allocation, consing of Lisp objects, etc.,
> > will produce significant memory pressure, expensive heap
> > allocations/deallocations, and a lot of GC.  Recall that on many
> > modern platforms Emacs doesn't really return memory to the system,
> > which means we risk increasing the memory footprint, and create
> > system-wide memory pressure.  It isn't a catastrophe, but we should
> > try to avoid it if possible.
> 
> Ok. I know very little about the internal storage of text in Emacs.
> There is at least two strings with a gap at the current edit point; if
> we pass a simple pointer to tree-sitter, it will have to handle the gap.

Tree-sitter allows the application to define a "reader" function that
it will then call to access buffer text.  That function should cope
with the gap.

> You mention "consing of Lisp objects" above, which says to me that the
> text is stored in a more complex structure.

I meant the consing that is necessary to make a buffer-substring that
will be passed to the parser.

> How can we provide direct access of that to tree-sitter?

See above: by writing our function to access buffer text.

> Avoid _all_ copying is impossible; the parser must store the contents of
> each token in some way. Typically that is done by storing
> pointers/indices into the text buffer that contains the entire text.

I don't think tree-sitter does that, because the text it gets is
ephemeral.  If we pass it a buffer-substring, it's a temporary string
which will be GCed after it's used; if we pass it pointers to buffer
text, those pointers can be invalid after GC, because GC can relocate
buffer text to a different memory region.

They definitely do copy portions of the text they get for internal
processing purposes, but I doubt that they duplicate all of it,
because that would not be scalable to huge buffers.  And in any case,
any copying we do would be _in_addition_ to what tree-sitter does
internally.

> >> In sum, the short answer is "yes, you must parse the whole file, unless
> >> your language is particularly simple".
> >
> > Funny, my conclusion from reading your detailed description was
> > entirely different.
> 
> I need more than that to respond in a helpful way.

Well, you said:

> To some extent, that depends on the language.

and then went on to describing how each language might _not_ need a
full parse in many cases.  Thus the conclusion sounded a bit radical
to me.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]