emacs-devel
Re: cc-mode fontification feels random


From: Ergus
Subject: Re: cc-mode fontification feels random
Date: Sat, 12 Jun 2021 13:01:03 +0200

On Sat, Jun 12, 2021 at 09:58:58AM +0300, Eli Zaretskii wrote:
>> Date: Sat, 12 Jun 2021 03:08:44 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: emacs-devel@gnu.org
>>
>> BTW: Eli was concerned about the extra copy of the buffer text to send
>> it to tree-sitter. In this case the time to memcopy an array with all
>> xdisp text is ~0.00085 seconds.
>
> If the intent is to use buffer-(sub)string, then you forget the price
> of consing.  That would trigger frequent GC cycles, which will all but
> kill the otherwise fast performance.
>
>> Any way if we don't want the copy we can use
>> ts_parser_set_included_ranges to exclude the gap and pass the text
>> pointer directly without any copy.
>
> I hope someone will try that and report the results.
>
> The other design issue with TS integration is that I'd like it to plug
> into the JIT font-lock interface of the display engine, so that we
> don't unnecessarily fontify parts of the buffer that won't be
> displayed, and always do fontify the parts that will be.

If I understand our cc-mode functionality correctly (and there is a lot
of it we don't want to lose, like indentation and code navigation), then
probably the "right" way to use tree-sitter (maybe Alan wants to give a
more precise technical description) is not only to fontify, but to use
the tree information to add contextual information to the text
(something that I think cc-mode does) and then let font-lock do the
magic.

The tree-sitter tree is basically contextual information. For example,
if we have processed the whole buffer and already have the tree, then
scrolling won't need to parse anything. Adding or removing text is a
localized modification, so with the previous tree we can re-parse only
the modified region. The choice then is whether we propertize the text
of the whole buffer, just the visible region, or "on demand".

This will save us from the hard parsing in cc-mode to fontify "on the
fly".
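To make the "re-parse only the modified region" idea concrete, here is a
rough sketch of how a buffer change could be turned into the edit record
that an incremental parser wants. It assumes the change is reported the
way after-change-functions reports it (BEG, END, and the length of the
replaced text) and that positions are already byte offsets. ByteEdit and
edit_from_change are hypothetical names; ByteEdit only mirrors the three
byte fields of tree-sitter's TSInputEdit and omits the row/column points.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical edit record, modelled on the byte fields of
   tree-sitter's TSInputEdit (the point fields are left out).  */
typedef struct {
  uint32_t start_byte;    /* where the change begins */
  uint32_t old_end_byte;  /* end of the replaced text, before the change */
  uint32_t new_end_byte;  /* end of the inserted text, after the change */
} ByteEdit;

/* Build an edit from an after-change style report: the new text spans
   [beg, end) and it replaced old_len bytes that started at beg.
   Assumption: all three values are byte offsets, not characters.  */
ByteEdit
edit_from_change (uint32_t beg, uint32_t end, uint32_t old_len)
{
  assert (beg <= end);
  ByteEdit e;
  e.start_byte = beg;
  e.old_end_byte = beg + old_len;
  e.new_end_byte = end;
  return e;
}
```

With the real library, a record like this would be passed to
ts_tree_edit on the old tree, and the old tree would then be given to
ts_parser_parse as the hint for the re-parse.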


> I don't really care if TS actually processes a much larger chunk of
> text, if it does that quickly enough, but processing the resulting
> faces will take time on the Emacs side, and that is better avoided.

But then we won't get all the contextual information we need for
indentation, code navigation or code folding, right?

So we would still be under-using tree-sitter, whose features could
provide functionality we already have in cc-mode and might also like to
have in other, more "limited" modes.

> More importantly, integration into JIT font-lock machinery means we
> don't need to use other hooks, which is a step back, since using such
> hooks for fontification was already shown to have serious problems in
> pre-21 Emacs: they don't always catch all the changes which require
> re-fontification.

I see two approaches here:

1) add the tree-sitter properties/faces to the buffer text (fully or
partially on the visible regions)

2) use the tree-sitter information directly from the tree and add the
visible properties from there.

The second one will require a more complete API of tree-sitter
functions exposed to elisp, but in my opinion it is worth it for
accuracy, speed and simplicity (a single API to rule them all), and to
support many languages we don't currently have, like Rust or the fancy
C++ > 11.
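As an illustration of approach 2, here is a minimal sketch of walking a
syntax tree and visiting only the leaves that overlap the visible
region, skipping whole subtrees that lie outside it. Node,
visit_visible_leaves and count_leaf are hypothetical; with the real
library one would presumably walk TSNode values with ts_node_child,
ts_node_start_byte and ts_node_end_byte instead of a hand-made struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical syntax node covering the half-open range
   [start_byte, end_byte).  Leaves have child_count == 0.  */
typedef struct Node {
  uint32_t start_byte;
  uint32_t end_byte;
  const char *type;
  const struct Node *children;
  size_t child_count;
} Node;

typedef void (*VisitFn) (const Node *leaf, void *data);

/* Call VISIT on every leaf under N that overlaps [beg, end).
   Subtrees entirely outside the range are pruned without descending.
   Returns the number of leaves visited.  */
size_t
visit_visible_leaves (const Node *n, uint32_t beg, uint32_t end,
                      VisitFn visit, void *data)
{
  if (n->end_byte <= beg || n->start_byte >= end)
    return 0;                   /* no overlap with the visible region */
  if (n->child_count == 0)
    {
      visit (n, data);          /* e.g. map n->type to a face here */
      return 1;
    }
  size_t visited = 0;
  for (size_t i = 0; i < n->child_count; i++)
    visited += visit_visible_leaves (&n->children[i], beg, end,
                                     visit, data);
  return visited;
}

/* Sample callback: count the leaves that were visited.  */
void
count_leaf (const Node *leaf, void *data)
{
  (void) leaf;
  ++*(size_t *) data;
}
```

The callback is where a face (or any other property) would be chosen
from the node type, so the same walk could serve fontification,
indentation or navigation.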

Remember that TS has the partial parsing option (specifying the regions
to parse), the re-parsing option (using a previous tree for the same
buffer as a hint, which reduces parse time drastically), and even a tree
comparison function (ts_tree_get_changed_ranges) that reports the
ranges where the new tree differs from the "hint" tree, so we know what
needs to be updated.
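For the partial parsing option applied to the gap (the
ts_parser_set_included_ranges idea from earlier in the thread), the
ranges to hand over could be computed roughly as below. This is only a
sketch under assumptions: GapLayout is a hypothetical description of a
gap buffer as one contiguous allocation, and ByteRange stands in for
tree-sitter's TSRange without its row/column points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the byte part of a TSRange.  */
typedef struct {
  uint32_t start_byte;
  uint32_t end_byte;
} ByteRange;

/* Hypothetical gap-buffer layout: text lives in [0, gap_start) and
   [gap_end, total_size) of a single allocation.  */
typedef struct {
  uint32_t gap_start;
  uint32_t gap_end;
  uint32_t total_size;
} GapLayout;

/* Fill OUT (room for two entries) with the non-empty text ranges on
   either side of the gap.  Returns how many were written: 0, 1 or 2.  */
size_t
ranges_excluding_gap (const GapLayout *g, ByteRange out[2])
{
  size_t n = 0;
  assert (g->gap_start <= g->gap_end && g->gap_end <= g->total_size);
  if (g->gap_start > 0)
    out[n++] = (ByteRange) { 0, g->gap_start };
  if (g->gap_end < g->total_size)
    out[n++] = (ByteRange) { g->gap_end, g->total_size };
  return n;
}
```

Note that offsets coming back from the parser would then presumably be
relative to the raw allocation, so anything after the gap would need
the gap size subtracted to get a buffer position again.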

Plus all the navigation functions, like finding parent or child nodes,
handling parse errors, iterating over nodes and so on.


