emacs-devel

Re: Tree-sitter api


From: Yuan Fu
Subject: Re: Tree-sitter api
Date: Sun, 12 Sep 2021 21:15:31 -0700


> On Sep 11, 2021, at 10:39 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Sat, 11 Sep 2021 13:29:09 -0700
>> Cc: Tuấn-Anh Nguyễn <ubolonton@gmail.com>,
>> Theodor Thornhill <theo@thornhill.no>,
>> Clément Pit-Claudel <cpitclaudel@gmail.com>,
>> Emacs developers <emacs-devel@gnu.org>,
>> Stefan Monnier <monnier@iro.umontreal.ca>,
>> stephen_leake@stephe-leake.org
>> 
>>> But the <lang> part is still needed to be concocted somehow.  E.g.,
>>> the conversion from "C#" to "c-sharp" isn't trivial.
>> 
>> The project name of tree-sitter’s C# definition is “tree-sitter-c-sharp”[1]. 
>> So if someone wants to use the C# language, they probably know what symbol 
>> represents it (we will explain the translation rule in doc-string and the 
>> manual). I also want to point out that we don’t come up with the symbols 
>> representing each language, the _user_ passes 'tree-sitter-parser-create' a 
>> symbol representing a language, and we translate that symbol to dynamic 
>> library name and C symbol name.
> 
> Surely, you don't mean "user" as in "the person who edits a source
> file"?  I presume you mean the Lisp program, not the human user.  That
> Lisp program is the major mode which wants to use TS services, and the
> only thing that it has in hand is its own symbol, like 'c-mode' or
> 'python-mode' or 'f90-mode'.  It needs a way to pass the corresponding
> TS module name to TS, and my question is: how would the major mode
> compute the correct module name?  We need either a mode-specific
> variable with that name, or some global function that could be used by
> any major mode to obtain the language module name.

Not the end user, no. But not really the "Lisp program" either. I mean the
human being who writes the major mode and adapts it to use tree-sitter
features. The major mode writer should be able to figure out the correct
symbol to use by checking the project name of the language definition, the
package name of the language definition in her package manager, or by some
other means. For example, one should be able to figure out that tree-sitter-c
is the symbol for the C language definition, and tree-sitter-c-sharp the one
for C#. Emacs then automatically translates tree-sitter-c to
libtree-sitter-c.so and tree-sitter-c-sharp to libtree-sitter-c-sharp.so,
basically by adding "lib" and ".so" (or ".dylib", etc.). If that doesn't give
the correct library name for a quirky language, the major mode writer can add
an entry to tree-sitter-library-name-override-list, such as

        (tree-sitter-quirky-lang "libtree-sitter-qlang" "tree_sitter_qlang")

and Emacs will use that. (Or she can just use tree-sitter-qlang as the symbol,
and Emacs' automatic translation will work just fine.)
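
To make the translation rule concrete, here is a rough sketch of it, written in Python only for brevity (the real code would live in Emacs C/Lisp). The function name, the dictionary form of the override list, and the dash-to-underscore rule for the C symbol are my assumptions for illustration; only the "lib" prefix, the suffix, and the override entry come from the description above.

```python
# Hypothetical sketch of the naming rule described above; not Emacs code.

# Override table: language symbol -> (library base name, C symbol name).
# Stands in for tree-sitter-library-name-override-list.
OVERRIDES = {
    "tree-sitter-quirky-lang": ("libtree-sitter-qlang", "tree_sitter_qlang"),
}


def library_and_c_symbol(lang_symbol, suffix=".so", overrides=OVERRIDES):
    """Return (library file name, C symbol name) for a language symbol."""
    if lang_symbol in overrides:
        # A quirky language: use the names supplied by the mode writer.
        base, c_symbol = overrides[lang_symbol]
    else:
        # Default rule: prepend "lib" for the library name. Replacing
        # dashes with underscores for the C symbol is an assumption.
        base = "lib" + lang_symbol
        c_symbol = lang_symbol.replace("-", "_")
    return base + suffix, c_symbol


print(library_and_c_symbol("tree-sitter-c"))
print(library_and_c_symbol("tree-sitter-c-sharp"))
print(library_and_c_symbol("tree-sitter-quirky-lang"))
```

The suffix is a parameter here because the right value depends on the system, which is the topic of the rest of this message.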

> 
>>>>> BTW, since dynamic libraries have different extensions on different 
>>>>> systems, what I want to do is to try loading the library with .so, then 
>>>>> try .dylib, then try .dll. Is that a good idea?
>>>> 
>>>> We can do better, see load-suffixes.
>>> 
>>> And in C, you can use MODULES_SUFFIX directly.  Though we will
>>> probably need some minor changes there, to have the suffix defined
>>> even in a build --without-modules.
>> 
>> I'm using tree-sitter-load-suffixes with default value '(".so" ".dylib" 
>> ".dll"). Should I populate this variable with MODULES_SUFFIX and 
>> MODULES_SECONDARY_SUFFIX, or should I just use the two suffixes in C? I.e., do 
>> you see a need for users to customize suffixes?
> 
> I'd prefer a general variable shared-library-suffix(es), either a
> single value specific to the target system or an alist with keys being
> system names (from system-type).  Then we could use that in
> load-suffixes (instead of MODULES_SUFFIX) and everywhere else.

To summarize, we have

        load-suffixes: (".elc" ".el"), plus MODULES_SUFFIX and
          MODULES_SECONDARY_SUFFIX if modules are enabled,
        module-file-suffix: MODULES_SUFFIX if modules are enabled,
        load-file-rep-suffixes: ("" ".gz").

All of these contribute to the possible file names Emacs tries when loading a
file (be it an Elisp file or an Emacs module). I will add a
shared-library-suffix variable specifically for loading dynamic libraries; its
value will be MODULES_SUFFIX regardless of whether modules are enabled.
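
As a rough illustration of how these variables combine, here is a small sketch (again in Python for brevity, not Emacs code). The per-system suffix table and both function names are assumptions made for the example; the actual value of shared-library-suffix would come from MODULES_SUFFIX at build time.

```python
from itertools import product

# Values as listed in the summary above (module suffixes left out).
LOAD_SUFFIXES = [".elc", ".el"]
LOAD_FILE_REP_SUFFIXES = ["", ".gz"]

# Assumed per-system values, keyed by system-type names; illustrative only.
SHARED_LIBRARY_SUFFIX = {
    "gnu/linux": ".so",
    "darwin": ".dylib",
    "windows-nt": ".dll",
}


def lisp_candidates(base):
    """File names tried for a Lisp file: each load suffix combined with
    each representation suffix, in order."""
    return [base + s + r
            for s, r in product(LOAD_SUFFIXES, LOAD_FILE_REP_SUFFIXES)]


def shared_library_name(base, system_type):
    """File name for a dynamic library, using one suffix per system."""
    return base + SHARED_LIBRARY_SUFFIX[system_type]


print(lisp_candidates("foo"))
print(shared_library_name("libtree-sitter-c", "darwin"))
```

The point of a separate variable is visible in the second function: a dynamic library needs exactly one system-specific suffix, and it should not depend on whether module support was compiled in.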

Yuan




