emacs-devel

Re: How does c-ts-mode, tree-sitter indentation, and preprocessor direct


From: Yuan Fu
Subject: Re: How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
Date: Sun, 1 Dec 2024 01:32:20 -0800


> On Dec 1, 2024, at 12:36 AM, Filippo Argiolas <filippo.argiolas@gmail.com> wrote:
> 
> Yuan Fu <casouri@gmail.com> writes:
> 
>>> On Nov 28, 2024, at 10:30 AM, Filippo Argiolas <filippo.argiolas@gmail.com> wrote:
>>> 
>>> Eli Zaretskii <eliz@gnu.org> writes:
>>> 
>>>>> From: Björn Lindqvist <bjourne@gmail.com>
>>>>> Date: Thu, 28 Nov 2024 00:27:17 +0100
>>>>> 
>>>>> I've been trying to get c-ts-mode to indent like I want, but I'm
>>>>> running into problems related to preprocessor directives.
>>>> 
>>>> Preprocessor directives are difficult because the tree-sitter C/C++
>>>> grammars include only partial support for them.
>>>> 
>>>>> For
>>>>> example, consider a type definition nested in two #ifdefs:
>>>>> 
>>>>>   #ifdef X
>>>>>   #ifdef Y
>>>>>   typedef int foo;
>>>>>   #endif
>>>>>   #endif
>>>>> 
>>>>> Since both the parent and the grandparent of the type_definition are
>>>>> preproc_ifdef nodes, no rule matches.
>>>> 
>>>> But if you go back (up) the parent-child hierarchy, you will
>>>> eventually find a node which is not a preproc_SOMETHING, and can go
>>>> from there, no?
>>>> 
>>> 
>>> I believe we might have a bug here: as far as I can tell, it does not
>>> match
>>> 
>>> ((n-p-gp nil "preproc" "translation_unit") column-0 0)
>>> 
>>> because both the parent and the grandparent are preproc nodes. So it
>>> matches one of the `c-ts-mode--standalone-parent-skip-preproc' rules
>>> right after.
>>> 
>>> After skipping preproc nodes, the parent is translation_unit, and the
>>> rule indents by an offset from there. I guess this step could be made
>>> smarter by checking for translation_unit, so that the rule above could
>>> be removed?
>>> 
>>>>> Another issue is that I want my
>>>>> preprocessor directives kept at column 0, which unfortunately screws
>>>>> up all rules that refer to the parent. E.g.:
>>>>> 
>>>>>   ((parent-is "if_statement") standalone-parent 4)
>>>>> 
>>>>> Doesn't work for
>>>>> 
>>>>>   int main() {
>>>>>       if (true)
>>>>>   #ifdef A
>>>>>           prutt();
>>>>>   #else
>>>>>           fis();
>>>>>   #endif
>>>>>   }
>>>>> 
>>>>> The rule I'd like to express is "take the indent of the closest
>>>>> *indenting* parent and add one indent". That rule would match whether
>>>>> that parent is a "while_statement", "if_statement", "for_statement",
>>>>> etc. You can't express such rules with tree-sitter, can you?
>>>> 
>>>> Not sure, but Yuan will know.
>>> 
>>> This can be worked around as Yuan showed, but isn't it a grammar bug?
>>> The problem is that with the #ifdef, the function and the if statement
>>> become siblings; without the preproc directive they have a
>>> child-parent relation.
>>> 
>>> In my experience c-ts-mode is a bit fragile with preprocessor
>>> statements, probably because the grammar itself is fragile (see
>>> e.g. [1]) and the problem is a hard one.
>> 
>> Right.
>> 
>>> Yuan, do you think c-ts-mode could in some way benefit from LSP
>>> knowledge about inactive preprocessor branches? The idea is that we
>>> would at least have a good syntax tree in the active branches while
>>> allowing some errors in the inactive ones.
>> 
>> Maybe. Technically you can create a parser and set its range to only
>> include the active branches. But for it to work end-to-end would require
>> some major effort. I’m not sure if it’s worth it (in terms of code
>> complexity and maintenance cost).
> 
> Interesting, maybe I'll experiment a bit with it and see where it
> goes. Agree that it already sounds overkill for little gain.
> 
> My major annoyance, more than indentation, is when the preprocessor
> statements break function detection and imenu/breadcrumb. I have one
> offending file of this kind at work which unfortunately I cannot share.
> I will try to extract a test case that reproduces the issue and open a
> bug. Maybe it can be worked around in some way from c-ts-mode.

I share the frustration. Tree-sitter for C could’ve been so much better if
it weren’t for the preprocessor and macros.

IME, whether it can be worked around depends on the specific code. Some code 
just generates a parse tree that’s hard to recover.
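
For reference, the parser-range idea quoted above could be sketched roughly
as follows. This is an untested sketch, not something c-ts-mode does today.
The helper name and its ACTIVE-RANGES argument are hypothetical; how the
ranges are obtained (for example from an LSP server's inactive-region
information) is not shown, and that is where most of the effort would go.

```elisp
;; Hypothetical helper: the name and the ACTIVE-RANGES argument are
;; assumptions made for illustration only.
;; ACTIVE-RANGES is a list of (BEG . END) buffer positions covering the
;; active preprocessor branches, in ascending order and not overlapping.
(defun my/c-ts-active-branch-parser (active-ranges)
  "Return a new C parser for the current buffer limited to ACTIVE-RANGES."
  ;; A non-nil NO-REUSE argument asks for a fresh parser rather than an
  ;; existing one, so the buffer's main parser keeps seeing the whole file.
  (let ((parser (treesit-parser-create 'c nil t)))
    ;; Restrict parsing to the given ranges; text outside them is ignored.
    (treesit-parser-set-included-ranges parser active-ranges)
    parser))
```

Keeping the ranges in sync as the buffer changes, and teaching indentation,
font-lock and imenu to consult this parser, is the end-to-end work that
makes me doubt it is worth it.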

Yuan

