Re: How does c-ts-mode, tree-sitter indentation, and preprocessor direct
From: Filippo Argiolas
Subject: Re: How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
Date: Sun, 01 Dec 2024 09:36:34 +0100
Yuan Fu <casouri@gmail.com> writes:
>> On Nov 28, 2024, at 10:30 AM, Filippo Argiolas <filippo.argiolas@gmail.com>
>> wrote:
>>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>>>> From: Björn Lindqvist <bjourne@gmail.com>
>>>> Date: Thu, 28 Nov 2024 00:27:17 +0100
>>>>
>>>> I've been trying to get c-ts-mode to indent like I want, but I'm
>>>> running into problems related to preprocessor directives.
>>>
>>> Preprocessor directives are difficult because the tree-sitter C/C++
>>> grammars include only partial support for them.
>>>
>>>> For
>>>> example, consider a type definition nested in two #ifdefs:
>>>>
>>>> #ifdef X
>>>> #ifdef Y
>>>> typedef int foo;
>>>> #endif
>>>> #endif
>>>>
>>>> Since both the parent and grandparent of the type_definition are
>>>> preproc_ifdef nodes, no rule matches.
>>>
>>> But if you go back (up) the parent-child hierarchy, you will
>>> eventually find a node which is not a preproc_SOMETHING, and can go
>>> from there, no?
>>>
>>
>> I believe we might have a bug here, as far as I can tell it does not
>> match
>>
>> ((n-p-gp nil "preproc" "translation_unit") column-0 0)
>>
>> Because both parent and grand parent are preproc. So it matches one of
>> the `c-ts-mode--standalone-parent-skip-preproc' rules right after.
>>
>> After skipping preproc nodes parent is translation_unit and indents an offset
>> from there. Guess this step could be made smarter to check for
>> translation_unit and the rule above could be removed?
>>
>>>> Another issue is that I want my
>>>> preprocessor directives kept at column 0, which unfortunately screws
>>>> up all rules that refer to the parent. E.g.:
>>>>
>>>> ((parent-is "if_statement") standalone-parent 4)
>>>>
>>>> Doesn't work for
>>>>
>>>> int main() {
>>>> if (true)
>>>> #ifdef A
>>>> prutt();
>>>> #else
>>>> fis();
>>>> #endif
>>>> }
>>>>
>>>> The rule I'd like to express is "take the indent of the closest
>>>> *indenting* parent and add one indent". That rule would match whether
>>>> that parent is a "while_statement", "if_statement", "for_statement",
>>>> etc. You can't express such rules with tree-sitter, can you?
>>>
>>> Not sure, but Yuan will know.
>>
>> This can be worked around as Yuan showed, but isn't it a grammar bug?
>> The problem is that with the #ifdef the function call and the if statement
>> become siblings; without preproc they have a child-parent relation.
>>
>> In my experience c-ts-mode is a bit fragile with preprocessor
>> statements, probably because the grammar itself is fragile (see
>> e.g. [1]) and the problem is a hard one.
>
> Right.
>
>> Yuan, do you think c-ts-mode could in some way benefit from LSP knowledge
>> about inactive preprocessor branches? The idea is that we would at least
>> have a good syntax tree in the active branches while allowing some
>> errors in the inactive ones.
>
> Maybe. Technically you can create a parser and set its range to only
> include the active branches. But for it to work end-to-end would require
> some major effort. I’m not sure if it’s worth it (in terms of code complexity
> and maintenance cost).
Interesting, maybe I'll experiment a bit with it and see where it
goes. I agree that it already sounds like overkill for little gain.
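For reference, the range idea might look roughly like the sketch below.
It is untested against a real project and rests on assumptions: the helper
names `my/active-ranges' and `my/c-parser-for-active-code' are made up,
and INACTIVE stands for a list of inactive regions that would have to come
from somewhere else (for example an LSP server), which is the part that
needs the real effort. Only `treesit-parser-create' and
`treesit-parser-set-included-ranges' are existing Emacs functions
(Emacs 29 or later, built with tree-sitter and with the C grammar installed).

```elisp
;;; -*- lexical-binding: t -*-

(defun my/active-ranges (beg end inactive)
  "Return the parts of BEG..END that are not covered by INACTIVE.
INACTIVE is a list of (START . END) cons cells, sorted by START and
non-overlapping.  This helper is hypothetical, not part of c-ts-mode."
  (let ((pos beg)
        (result nil))
    (dolist (r inactive)
      ;; Keep the gap between the previous inactive region and this one.
      (when (< pos (car r))
        (push (cons pos (car r)) result))
      (setq pos (max pos (cdr r))))
    ;; Keep whatever follows the last inactive region.
    (when (< pos end)
      (push (cons pos end) result))
    (nreverse result)))

(defun my/c-parser-for-active-code (inactive)
  "Create a C parser that only sees the active parts of the current buffer.
INACTIVE is assumed to describe the inactive preprocessor branches;
how to obtain it is left open here."
  ;; NO-REUSE is t so the ranges of the major mode's own parser are untouched.
  (let ((parser (treesit-parser-create 'c nil t)))
    ;; Note: a nil range list means \"parse the whole buffer\".
    (treesit-parser-set-included-ranges
     parser
     (my/active-ranges (point-min) (point-max) inactive))
    parser))
```

Using a separate parser keeps the experiment isolated from c-ts-mode;
wiring the resulting tree into indentation, font-lock and imenu is not
covered by this sketch.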
My main annoyance, more than indentation, is when preprocessor directives
break function detection and imenu/breadcrumb. I have one offending file
of this kind at work which unfortunately I cannot share. I will try to
extract a test case that reproduces the issue and open a bug. Maybe it
can be worked around in some way from c-ts-mode.
Filippo