bug-bison
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Reductions during Bison error handling


From: Paul Hilfinger
Subject: Re: Reductions during Bison error handling
Date: Mon, 13 May 2002 14:09:30 -0700

 > I realized this when you say it. This "default reduction" is tied to
 > LALR(1), if one changes to LR(1) or something else, it might not even
 > exist, right?

Umm.  It would make sense for LR(1) as well, I should think.  One can
certainly make good use of a default reduction (in this message, I use
this to mean "internal" default reduction---i.e., the most popular
reduction) in any "consistent" state---that is any state consisting of
a single item with the dot at the end or of NO items with dots at the
end.  In the first case, the single reduction becomes the default, and
in the second, error becomes the default.

The whole idea of default reductions is based on two observations: 

a. A row of a table most of whose members are the same can be overlaid
   on other rows by the "yycheck trick", and

b. It is always safe to substitute any reduction that is legal in a state
   for some lookahead for all error entries for that state.  

Point (b) is true for both LR and LALR parsers (because both exhibit
the viable prefix property---they never shift past a legal prefix).
Whether there is some empirical difference in the tables that makes
the trick less useful for LR parsers I confess is one of those (many)
points of parsing trivia I never pursued.


PNH > >  I only see one
PNH > >slight adjustment---states that shift the error token don't use a
PNH > >default reduction (in my sense).  I believe this is to allow parse
PNH > >errors to be caught a little earlier than they might otherwise be, but
PNH > >I am not sure how effective it is at that.

HA> I thought you opted for:

PNH> ...

HA > And that this might have been too difficult to achieve with the current
HA > "error" implementation that relies on a simple error recovery routine in
HA > the parser implementation.

I don't think so.  In fact, I have implemented the documented error
recovery for GLR, and didn't trip over anything.  In fact, I think
that bison.simple's current method is slightly more complex than the
documented method---it allows additional reductions after an error is
detected, but before 'error' is shifted.  In essense, my comment was
that this makes things even more confusing for the user and, with the
use of default reductions, nearly unpredictable.

HA> -- Perhaps a better error recovery can be achieved by implicitly sprinkling
HA> the input grammar with "error_k" tokens, somehow (I do not see the details
HA> of this idea, though).

Actually, there are couple of experiments that might be interesting
(although I have a sneaking suspicion that papers have already been
written about them):

1. What would be the effect on table size and error-recovery behavior
   if Bison were always to use 'error' as the default reduction (so
   that yydefact could be eliminated entirely and 
   yypact[yystate] == YYFLAG only for accepting state).

2. What would be the effect on table size and error-recovery behavior
   if Bison were to use default reductions only for consistent LALR
   states (in current Bison terms, only use the default when
   yypact[yystate] == YYFLAG) and make 'error' the default everywhere
   else.

The intent (besides a slight simplification of the parser) is to cause
Bison to detect parsing errors as soon as the offending token was
consulted (without performing any reductions first). 

The side-effect of each of these would presumably be to increase the
size of yytable, et al.  Furthermore, option (1) would have an
additional side-effect on behavior, namely that a lookahead token
would have to be present (i.e., to have been read by yylex) in ALL
states.  Currently, if you write

        prog : expr ';'  { ACTION($1); }
             | prog expr ';' { ACTION($2); }
             ;

        expr : blah blah blah ... 
             ;

(and "blah blah blah" never ends in a semicolon) then ACTION will be
called before yylex is called to read beyond the semicolon.  This
makes a difference with interactive input from a console, of course.  

The intent of the experiment is to determine whether the cost of these
side-effects is significant, given the (I think desirable) earlier
detection of errors).

Paul Hilfinger



reply via email to

[Prev in Thread] Current Thread [Next in Thread]