emacs-devel

From: Stephen Leake
Subject: Re: [SPAM UNSURE] Re: [SPAM UNSURE] Re: Tree Sitter (was Re: cc-mode fontification feels random)
Date: Sun, 25 Jul 2021 21:24:26 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (windows-nt)

Daniel Colascione <dancol@dancol.org> writes:

> On 7/24/21 1:05 PM, Stephen Leake wrote:
>
>> Daniel Colascione <dancol@dancol.org> writes:
>>
>>> On 7/21/21 12:15 PM, Perry E. Metzger wrote:
>>>> On 7/21/21 12:21, Daniel Colascione wrote:
>>>>> On 7/21/21 7:43 AM, Perry E. Metzger wrote:
>>>>>> Thought I would note that there's a substantial literature now on
>>>>>> incremental parsing, especially the sort that is needed for editor
>>>>>> tools. One doesn't need to reinvent the algorithms, they're out
>>>>>> there waiting to be used. The Tree Sitter project is based on
>>>>>> previous published work.
>>>>> There is indeed a big literature! I wish there were a bigger
>>>>> literature on *composable* incremental parsers though. IMHO, what
>>>>> we need is an incremental GLR system (yes, GLR is bad worst-case,
>>>>> but it's not a practical concern) that spits out a parse *forest*
>>>>> which we then pare down to a parse tree with ad-hoc syntactic
>>>>> consistency rules. Something like this naturally supports
>>>>> multi-language modes and incorporation of out-of-band semantic
>>>>> information.
>>>>>
>>>> Tree sitter handles GLR.
>>>>
>>> Cool. How does it prune the parse forest?
>> wisi also uses GLR. It prunes trees during parse when the parse stacks
>> contained in the trees are identical; it uses error recovery cost and
>> length to decide which tree to delete, or picks one at random. It's an
>> error if more than one tree is alive at the end of parse. That's because
>> programming languages must be unambiguous. It would be possible to adapt
>> the wisi parser to use some other pruning strategy.
>
>
> Programs *as a whole*, properly understood by a compiler or execution
> environment, must be unambiguous. That's true. But when we're editing,
> we're dealing with program fragments, sometimes damaged by user
> modifications, and have to do our best given fragmentary information.

Right. That's why wisi has robust error recovery.
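
As a rough illustration of the pruning rule quoted above, here is a minimal
Python sketch. It is not wisi's code (wisi is written in Ada); the names
Parser, prune and finish are hypothetical, and it assumes that a lower
recovery cost wins, then a shorter recovery length, with a random pick among
any remaining ties.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Parser:
    """One of the parallel GLR parsers (hypothetical, simplified)."""
    stack: tuple          # parse stack, e.g. a tuple of state numbers
    recover_cost: int     # accumulated error recovery cost
    recover_length: int   # assumed meaning: tokens touched by recovery


def prune(parsers, rng=random):
    """Keep exactly one parser for each distinct parse stack."""
    groups = {}
    for p in parsers:
        groups.setdefault(p.stack, []).append(p)

    survivors = []
    for group in groups.values():
        # Assumption: lowest cost first, then shortest length.
        best_key = min((p.recover_cost, p.recover_length) for p in group)
        best = [p for p in group
                if (p.recover_cost, p.recover_length) == best_key]
        # Anything still tied is resolved by a random pick.
        survivors.append(rng.choice(best))
    return survivors


def finish(parsers):
    """At end of parse, more than one live parser is an error."""
    if len(parsers) != 1:
        raise ValueError(f"ambiguous parse: {len(parsers)} parsers alive")
    return parsers[0]
```

Swapping in a different pruning strategy would mean changing only how
best_key is computed.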

> All I'm suggesting is that it'd be useful to use language-specific
> semantic rules to disambiguate parse trees: 

So far, wisi is only used for Ada; I did not need any disambiguation
rules that seemed language-specific. That may change when/if other
languages use wisi.

> for example, if in location L1, symbol T can be a type or a name, and
> in location L2, symbol T is definitely a type, then we should regard
> symbol T as a type in location L1 too. 

That might be possible, but it adds a layer of semantic analysis that
could be slow.
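
To make the idea concrete, here is a minimal Python sketch of that kind of
propagation. It is only an illustration: resolve_kinds and the
(symbol, location, candidate kinds) representation are hypothetical, and it
ignores scoping entirely, which is where most of the real cost would
probably be.

```python
def resolve_kinds(occurrences):
    """Narrow ambiguous symbol occurrences using unambiguous ones.

    occurrences: list of (symbol, location, set of candidate kinds).
    Returns a new list of the same shape.
    """
    # Pass 1: collect the kinds each symbol definitely has somewhere.
    definite = {}
    for symbol, _loc, kinds in occurrences:
        if len(kinds) == 1:
            definite.setdefault(symbol, set()).update(kinds)

    # Pass 2: narrow an ambiguous occurrence only when exactly one of
    # its candidates is confirmed elsewhere; otherwise leave it alone.
    resolved = []
    for symbol, loc, kinds in occurrences:
        narrowed = set(kinds) & definite.get(symbol, set())
        if len(kinds) > 1 and len(narrowed) == 1:
            resolved.append((symbol, loc, narrowed))
        else:
            resolved.append((symbol, loc, set(kinds)))
    return resolved


occurrences = [
    ("T", "L1", {"type", "name"}),  # ambiguous
    ("T", "L2", {"type"}),          # definitely a type
    ("U", "L3", {"type", "name"}),  # no other evidence
]
for entry in resolve_kinds(occurrences):
    print(entry)
```

Here T at L1 is narrowed to a type because of L2, while U stays ambiguous.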

-- 
-- Stephe


