emacs-devel
Re: emacs-tree-sitter and Emacs


From: Stephen Leake
Subject: Re: emacs-tree-sitter and Emacs
Date: Wed, 01 Apr 2020 11:51:40 -0800
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (windows-nt)

Eli Zaretskii <address@hidden> writes:

>> From: Stephen Leake <address@hidden>
>> Date: Tue, 31 Mar 2020 16:27:35 -0800
>> 
>> > OTOH, using an after-change hook has its downsides, even if we disregard
>> > the slow-down (which I wouldn't).
>> 
>> In wisi (used by ada-mode), the after-change hooks just record what
>> regions have been changed; font-lock then triggers a parse if the region
>> being fontified contains or is after a change region. Navigation and
>> indent also trigger parses.
>
> Can you tell in more detail why you need to rely on these hooks?  They
> shouldn't be necessary, AFAIU.  

It is an optimization choice.

In an unmodified buffer that is smaller than 100,000 characters
(default setting of wisi-partial-parse-threshold), the entire buffer is
parsed once; that applies faces to all the Ada identifiers that need
faces (standard font-lock regexp handles the reserved words). Then when
font-lock fontifies a region, no parsing is needed.

Indent is similar; the parse sets text properties holding the indent for
each line; indent-region then applies them.
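The indent caching described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not wisi's actual code: the property name `my-indent` and both function names are hypothetical, and the sketch does not try to keep the cache valid after its own whitespace edits (a real implementation would re-parse or adjust).

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical names throughout; wisi's real properties and functions differ.

(defun my-cache-indent (line-start column)
  "Record COLUMN as the computed indent for the line starting at LINE-START.
The value is stored as a text property on the first character of the line."
  (with-silent-modifications
    (put-text-property line-start
                       (min (1+ line-start) (point-max))
                       'my-indent column)))

(defun my-apply-cached-indent (begin end)
  "Indent each line between BEGIN and END to its cached column, if any.
No parsing happens here; lines without a cached value are left alone."
  (save-excursion
    (goto-char begin)
    (forward-line 0)
    (let ((end-marker (copy-marker end)))
      (while (< (point) end-marker)
        (let ((column (get-text-property (point) 'my-indent)))
          (when column
            (indent-line-to column)))
        (forward-line 1))
      (set-marker end-marker nil))))
```

The point of the split is that the expensive step (the parse that calls something like `my-cache-indent`) runs once, while the cheap step can run on every indent request.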

When the user starts editing, and font-lock is requested, we need
to know the changes before the font-lock region, because that can affect
the interpretation of the code in the region. Worst case is adding an
opening ". Adding/deleting "begin" or "end" can change indent
(equivalent to adding/deleting { or } in C).
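A minimal sketch of that bookkeeping, assuming a single "earliest changed position" is enough (wisi itself records change regions, so this is a simplification). All names here are hypothetical; only `after-change-functions` and `add-hook` are standard Emacs.

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical names; a simplified model of "record changes now, parse later".

(defvar-local my-first-change-pos nil
  "Smallest buffer position changed since the last parse, or nil if none.")

(defun my-record-change (begin _end _old-length)
  "Remember the earliest changed position.  Does no parsing."
  (setq my-first-change-pos
        (if my-first-change-pos
            (min my-first-change-pos begin)
          begin)))

(defun my-parse-needed-p (_region-begin region-end)
  "Return non-nil if a recorded change lies inside or before the region.
A change after REGION-END cannot affect how the region is interpreted."
  (and my-first-change-pos
       (< my-first-change-pos region-end)))

(defun my-mark-parsed ()
  "Forget recorded changes; call this after a successful parse."
  (setq my-first-change-pos nil))

(defun my-track-changes-setup ()
  "Install the change recorder in the current buffer only."
  (add-hook 'after-change-functions #'my-record-change nil t))
```

A fontify or indent function would call `my-parse-needed-p` with its region and only invoke the parser when it returns non-nil, so the hook itself stays cheap.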

If the default setting of jit-lock-defer-time (i.e. nil) is used, then
font-lock runs immediately after each change, and the after-change hooks
are not needed. But as I have mentioned, I always run with
jit-lock-defer-time set to 1.0 (because parsing is not fast enough in
some cases), so the change hooks are needed.

In addition, indent-region is run when the user types return or tab (or
otherwise invokes indent); there can easily be changes outside the line
or region being indented, again requiring change hooks.

The alternative to using after-change hooks is to always do a full
parse for every call of fontify-region or indent-region. That is far too
slow.

Note that Tree-Sitter requires one full parse of the buffer to generate
the parse tree that is later updated incrementally; in an unmodified
buffer, only that one parse is needed.

wisi can handle parsing only a small part of the file, but it produces
incorrect results more often in that case, since it relies on
error-correction to arrive at correct syntax. That's why partial parse
is only used on very large files, where the parser is _always_ too slow;
in most files, it is only too slow when there is a bad syntax error and
error recovery is slow.

> And they cannot pick up every relevant change; for example, what
> happens if some face used for font-lock is modified?

Yes, that is a flaw. Not likely to occur in everyday use, and wisi
provides wisi-parse-buffer to force a full parse for such situations.

As I mentioned elsewhere, wisi provides wisi-inhibit-parse for use when
an elisp author might be tempted to use inhibit-modification-hooks. 

>> By default font-lock runs after every character typed
>
> No, it only runs when redisplay kicks in.  If you type very quickly,
> it won't run for every character.  At least AFAIR.

What triggers redisplay?

In practice, I and other ada-mode users notice font-lock running after
each character, with the default setting of jit-lock-defer-time. There
is a comment in jit-lock.el indicating that the default value may have
been 0.25 at one point (I did not check the git history); perhaps you
are remembering that behavior?

For example, in Ada the comment-start is "--". No matter how fast I type
the two chars, ada-mode reports a syntax error after the first one.
Syntax errors are detected by a parse and reported via fringe marks, as
in flymake; the mark blinks as the two characters are typed, appearing
after the first character is displayed and disappearing after the second
is displayed. (I have just retested this in Emacs from master.)

I don't think there's anything in ada-mode that forces a redisplay
(except explicitly calling wisi-parse-buffer; that calls
font-lock-ensure). But I'd be happy to investigate further if you are
sure it should not work this way.

The elisp manual section "Forcing redisplay" says "Emacs normally tries
to redisplay the screen whenever it waits for input." After I type the
first character, it is no longer waiting for input, it is processing
that character. I assume that "processing that character" includes running
after-change-functions, which is (small) elisp code. But I guess that after
processing that character, before calling redisplay, it checks whether there
is more input, which should be true if I type fast enough. Perhaps
processing that character is faster than the combination of my fingers and
the keyboard char send rate?

Hmm. M-: (execute-kbd-macro "--") does not show a syntax-error fringe
blink. I'm not sure if that is relevant here.

>> which is often too slow in an ada-mode buffer; I always set
>> jit-lock-defer-time to 1.0 seconds.
>
> That's too long to be pleasant on display, IMO.  A second is a very
> long time in this context.

Other people have made the same complaint. I'm probably biased in
accepting the slow parser behavior (I know how hard it would be to
improve it :). Migrating from an external process to a module might
help. Changing from partial parse to incremental parse might help.

Setting jit-lock-defer-time to 0.25 eliminates the fringe blink when
typing "--". If I watch very closely, I can just barely see the delay
between displaying the last char and the change of color (from black to
red).

I'll run with 0.25 for a while; the parser has gotten better since the
last time I changed that, so maybe that's good enough now.
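For reference, the setting discussed here is an ordinary user option, so trying 0.25 looks like the line below. Placing it in the init file is an assumption on the safe side: jit-lock appears to set up its defer timer when jit-lock-mode is enabled, so a value changed later may not take effect in buffers that are already fontifying.

```elisp
;; In the init file, before font-lock is turned on in any buffer.
;; The value is in seconds; nil (the default) means no deferral.
(setq jit-lock-defer-time 0.25)
```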

I mentioned above that the parser is only too slow when there is a bad
syntax error and error recovery is slow. However, that is the typical
case while editing code.

-- 
-- Stephe


