emacs-devel

Re: Tree-sitter maturity


From: Yuan Fu
Subject: Re: Tree-sitter maturity
Date: Mon, 23 Dec 2024 17:20:28 -0800


> On Dec 22, 2024, at 4:43 PM, Björn Bidar <bjorn.bidar@thaodan.de> wrote:
> 
> Yuan Fu <casouri@gmail.com> writes:
> 
>>> On Dec 20, 2024, at 1:13 AM, Björn Bidar <bjorn.bidar@thaodan.de> wrote:
>>> 
>>> Yuan Fu <casouri@gmail.com> writes:
>>> 
>>>>> On Dec 18, 2024, at 5:34 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>>>>> 
>>>>>> From: Yuan Fu <casouri@gmail.com>
>>>>>> Date: Tue, 17 Dec 2024 14:11:51 -0800
>>>>>> Cc: Peter Oliver <p.d.oliver@mavit.org.uk>,
>>>>>> Stefan Kangas <stefankangas@gmail.com>,
>>>>>> Emacs Devel <emacs-devel@gnu.org>,
>>>>>> Eli Zaretskii <eliz@gnu.org>
>>>>>> 
>>>>>>>> It’s also worth noting that Tree-sitter itself is somewhat
>>>>>>> immature; the developers say that until it reaches version 1.0, we
>>>>>>> should be wary of potentially unannounced incompatible changes
>>>>>>> (although they are trying harder to avoid this, over time).
>>>>>>> 
>>>>>>> 
>>>>>>> [1] https://build.opensuse.org/package/show/editors/tree-sitter
>>>>>> 
>>>>>> I wonder if we can formalize a way for tree-sitter major modes to
>>>>>> state the compatible version of the language grammar they use.
>>>>>> Maybe a package.el cookie, or a variable that is set, or even just
>>>>>> comments at the beginning of the file.
>>>>>> 
>>>>>> Many major modes already add entries to
>>>>>> treesit-language-source-alist; that could be a good option too.
>>>>>> 
>>>>>> I especially want built-in major modes to give a version, so that
>>>>>> packagers can package Emacs with the right version of each
>>>>>> tree-sitter grammar. I know Eli had problems with pinning a grammar
>>>>>> version for builtin modes before, but I wonder what his stance is
>>>>>> now?
>>>>> 
>>>>> What's changed?
>>>> 
>>>> People are starting to package tree-sitter and tree-sitter
>>>> grammars. If Emacs can be packaged with the right grammars, then
>>>> tree-sitter modes will work out-of-the-box.
>>> 
>>> Please don't. That would require nodejs to build Emacs bundled with
>>> these grammars. These grammar packages are also not just used with
>>> Emacs.
>>> 
>>> Grammars are very easy to package once the infrastructure to reuse the
>>> packaging automation in the package manager is there. Don't try to
>>> reinvent that IMHO. If you must generate and build the parser,
>>> implement a bindings.gyp parser so you can automate the compilation
>>> process independently of the grammar.
>> 
>> There might be some misunderstanding. We don’t want to build the
>> grammars as part of building Emacs. Ideally, building the grammars is
>> the package manager’s job. We just want to list the versions of
>> grammars that are known to work with the major modes, so packagers
>> have an easier time packaging Emacs with the right version of each
>> grammar.
> 
> Ah ok now I understand. I don't think that would work.
> 
>>> 
>>> For reference here's my implementation of it in python:
>>> https://build.opensuse.org/projects/editors:tree-sitter/packages/tree-sitter/files/tree-sitter-target.py?expand=1
>>> 
>>>>> 
>>>>> Many language grammars don't make official releases and thus don't
>>>>> have versions.  Moreover, AFAIK there's no API to determine the
>>>>> version of the grammar library we load.  So how can we manage such
>>>>> version-pinning in a way that (a) is up-to-date, and (b) doesn't
>>>>> preclude people from using a grammar library due to false negatives?
>>>> 
>>>> I’m talking about a softer pin. We’re basically providing a “known to
>>>> work” version. This way packagers can package Emacs with a
>>>> known-to-work version of grammar, so the builtin modes work
>>>> out-of-the-box. This doesn’t prevent people from using a newer version
>>>> and sending us a bug report, and we still try our best to make the
>>>> major modes work with the newest grammar.
>>>> 
>>>> If the grammar doesn’t have an explicit version, then we can just use a 
>>>> commit hash. I believe all the packaging systems support that?
>>> 
>>> That doesn't make sense, as the version numbers are arbitrary: a
>>> version bump does not always reflect changes to the grammar, but may
>>> instead reflect changes to the in-tree dependencies in the repository
>>> that packages the language-grammar bindings, which have nothing to do
>>> with the parser.
>> 
>> Sure, let’s call it snapshot then. I just want to make sure when
>> packagers package Emacs with tree-sitter grammars, the grammar works
>> with Emacs’s major mode.
> 
> The point was that, no matter what you call it, the development of
> grammars is more or less fluid. Maybe there are some more mature
> grammars, but those should be the minority.
> But let's just assume for a second that it would be possible to freeze
> or recommend the supported grammar versions. The development of grammars
> is too fast for that, especially for builtin modes.
> 
>>> 
>>> What matters much more is the tree-sitter version which is more related
>>> to Emacs itself rather than the particular version of the grammar.
>> 
>> The tree-sitter library version is up to the packagers, right? As long
>> as it satisfies Emacs’ requirements and is compatible with the bundled
>> grammars.
> 
> Do you mean bundled or recommended grammars? Bundled grammars would
> again be grammars included within the Emacs sources, which is a
> different thing from what you were saying further above.

Recommended. So packagers control the version of both the tree-sitter
library and the grammars. Emacs will recommend a version or commit hash
for each grammar, and packagers will provide Emacs with grammars that
work with the builtin major modes.
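
To make the "recommended, not required" idea concrete, here is a rough
sketch in Python. It is only an illustration under stated assumptions, not
anything Emacs ships: the table RECOMMENDED, the function check_grammar,
the URLs and the revisions are all made-up placeholders. The point it
tries to show is that a mismatch is reported but never blocks loading.

```python
# Hypothetical sketch of a "soft pin": Emacs only recommends a grammar
# revision; no outcome below prevents a grammar from being used.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Recommendation:
    url: str       # grammar repository (placeholder URLs below)
    revision: str  # tag or commit hash believed to work with the major mode


# Placeholder data for illustration, not real recommendations.
RECOMMENDED = {
    "python": Recommendation("https://example.org/tree-sitter-python",
                             "v0.0.0-example"),
    "c": Recommendation("https://example.org/tree-sitter-c", "0123abc"),
}


def check_grammar(lang: str, installed_revision: Optional[str]) -> str:
    """Classify an installed grammar relative to the recommendation."""
    rec = RECOMMENDED.get(lang)
    if rec is None:
        return "no-recommendation"
    if installed_revision is None:
        # Assumption taken from this thread: a loaded grammar may not
        # expose its version, so the revision can simply be unknown.
        return "unknown"
    if installed_revision == rec.revision:
        return "known-to-work"
    # A different revision is allowed; it is merely untested.
    return "untested"
```

A packager's script could call check_grammar("c", "0123abc") before
building and treat anything other than "known-to-work" as a hint to
test, not as an error.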

> 
> Yes, the tree-sitter version is up to the packager or, respectively,
> the distribution.
> The only issue that existed in that regard was that tree-sitter once
> broke the ABI without bumping the sover, but that's fixed now, or was
> fixed once Emacs was correctly rebuilt after that dependency changed.

Yeah, hopefully they’ll be more careful in the future.

Yuan

