emacs-devel

Re: Reliable after-change-functions (via: Using incremental parsing in E


From: Eli Zaretskii
Subject: Re: Reliable after-change-functions (via: Using incremental parsing in Emacs)
Date: Fri, 03 Apr 2020 21:31:27 +0300

> From: Stephen Leake <address@hidden>
> Date: Fri, 03 Apr 2020 09:45:44 -0800
> 
> > Tree-sitter allows the application to define a "reader" function that
> > it will then call to access buffer text.  That function should cope
> > with the gap.
> 
> and also with the encoding, which you did not address.

I mentioned that in another message: I don't think encoding is
necessary in this case.
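
As a rough illustration of the "reader" idea quoted above, here is a
minimal sketch of a callback that copes with the gap by returning the
contiguous run of text on one side of it at a time.  The struct
gap_buffer type and the gap_read function are hypothetical names for
this sketch, not Emacs internals.  The callback is modeled loosely on
the read member of tree-sitter's TSInput, which also receives a TSPoint
argument that is left out here so the sketch compiles on its own.  It
hands out raw bytes with no conversion, on the assumption that the
parser can consume the buffer's internal representation directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified gap buffer: the text occupies
   [0, gap_start) and [gap_end, size) of `data`.  */
struct gap_buffer
{
  const char *data;
  size_t gap_start;  /* offset of the first byte of the gap */
  size_t gap_end;    /* offset of the first byte after the gap */
  size_t size;       /* total bytes in `data`, gap included */
};

/* Return a pointer to the contiguous text starting at logical byte
   offset BYTE_INDEX (the gap does not count) and store the length of
   that run in *BYTES_READ.  A length of zero means end of text.  */
static const char *
gap_read (void *payload, uint32_t byte_index, uint32_t *bytes_read)
{
  const struct gap_buffer *b = payload;
  size_t gap_len = b->gap_end - b->gap_start;
  size_t text_len = b->size - gap_len;

  if (byte_index >= text_len)
    {
      *bytes_read = 0;
      return "";
    }
  if (byte_index < b->gap_start)
    {
      /* Before the gap: hand out bytes up to where the gap begins.  */
      *bytes_read = (uint32_t) (b->gap_start - byte_index);
      return b->data + byte_index;
    }
  /* At or after the gap: skip over it and hand out the rest.  */
  *bytes_read = (uint32_t) (text_len - byte_index);
  return b->data + byte_index + gap_len;
}

/* Usage: "hello" and " world" separated by a 3-byte gap.  */
static void
gap_read_demo (void)
{
  static const char raw[] = "hello___ world";
  struct gap_buffer b = { raw, 5, 8, sizeof raw - 1 };
  uint32_t n;
  const char *p = gap_read (&b, 0, &n);
  assert (n == 5 && memcmp (p, "hello", 5) == 0);
  p = gap_read (&b, 5, &n);
  assert (n == 6 && memcmp (p, " world", 6) == 0);
  gap_read (&b, 11, &n);
  assert (n == 0);
}
```

The caller asks again with the next offset after each run, so no copy
of the text is made; the returned pointers are only meant to be used
until the callback is next invoked.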

> I don't see how that is different from the C level
> buffer-substring. Certainly there should be a module function
> buffer-substring that is as efficient as possible.

If modules are allowed direct access to buffer text, then it's indeed
not different.  But the alternative that was discussed was different.
May I suggest that you look at the code of the module which triggered
this?

> >> You mention "consing of Lisp objects" above, which says to me that the
> >> text is stored in a more complex structure.
> >
> > I meant the consing that is necessary to make a buffer-substring that
> > will be passed to the parser.
> 
> Since we are calling the parser from C (if it is linked into Emacs, or
> in a module), I still don't understand. Does C code have to cons to
> create a string?

Of course.  How else do you get a UTF-8 encoded string to pass to the
parser as a copy of buffer text?

> > I don't think tree-sitter does that, because the text it gets is
> > ephemeral.  If we pass it a buffer-substring, it's a temporary string
> > which will be GCed after it's used; if we pass it pointers to buffer
> > text, those pointers can be invalid after GC, because GC can relocate
> > buffer text to a different memory region.
> 
> Hmm.
> https://tree-sitter.github.io/tree-sitter/using-parsers#providing-the-code
> says:
> 
>     Syntax nodes store their position in the source code both in terms
>     of raw bytes and row/column coordinates

Positions are okay; 'char *' pointers to buffer or string text are
not.
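
To make the distinction concrete, here is a small sketch in which a
node records byte offsets rather than a 'char *', so it stays usable
after the text moves.  Everything in it is hypothetical: struct node
stands in for a syntax node, and relocate simulates the text being
moved to a different memory region by copying it to a fresh
allocation; it is not how Emacs's GC is implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical syntax node: it stores byte offsets into the text,
   never a pointer to the text itself.  */
struct node
{
  size_t start_byte;
  size_t end_byte;
};

/* Simulate relocation: copy LEN bytes plus the terminating NUL to a
   fresh block and free the old one.  Any 'char *' into OLD is invalid
   after this returns.  Returns NULL if allocation fails.  */
static char *
relocate (char *old, size_t len)
{
  char *fresh = malloc (len + 1);
  if (fresh == NULL)
    return NULL;
  memcpy (fresh, old, len + 1);
  free (old);
  return fresh;
}

/* Compare the text covered by node N against EXPECTED, resolving the
   offsets against the current base address TEXT.  */
static int
node_text_equals (const char *text, struct node n, const char *expected)
{
  size_t len = n.end_byte - n.start_byte;
  return strlen (expected) == len
         && memcmp (text + n.start_byte, expected, len) == 0;
}

/* Usage: the node for "x" survives a move of the text.  */
static void
relocate_demo (void)
{
  const char *src = "int x = 42;";
  size_t len = strlen (src);
  char *text = malloc (len + 1);
  assert (text != NULL);
  memcpy (text, src, len + 1);

  struct node n = { 4, 5 };
  assert (node_text_equals (text, n, "x"));

  text = relocate (text, len);
  assert (text != NULL);
  /* The offsets are resolved against the new base address.  */
  assert (node_text_equals (text, n, "x"));
  free (text);
}
```

A pointer saved before the call to relocate would dangle afterwards,
while the offsets only need the current base address to be looked up
again.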


