bison-patches

Re: LAC (lookahead correction) for syntax error handling


From: Joel E. Denny
Subject: Re: LAC (lookahead correction) for syntax error handling
Date: Sun, 6 Mar 2011 17:24:03 -0500 (EST)
User-agent: Alpine 2.00 (DEB 1167 2008-08-23)

On Sun, 20 Feb 2011, Joel E. Denny wrote:

> Subject: [PATCH] doc: add bibliography to manual.
> 
> * doc/bison.texinfo (Mystery Conflicts): Cross-reference
> bibliography instead of citing publications directly.
> (Generalized LR Parsing): Likewise.
> (Bibliography): New section.  Not all entries are cross-referenced
> yet, but that will come in future patches.

I pushed that and the following patches to branch-2.5.  I pushed similar 
patches to master.

From 6f04ee6c78ba01f9d8e02dbe2baace0c3bd8f4fd Mon Sep 17 00:00:00 2001
From: Joel E. Denny <address@hidden>
Date: Mon, 21 Feb 2011 19:09:24 -0500
Subject: [PATCH 1/4] doc: create a new Tuning LR section in the manual.

And clean up all other documentation of the features described
there.
* NEWS (2.5): Tweak wording of lr.type and parse.lac entries a
bit, update the cross-references to the manual, and point out that
LAC has caveats.  Don't be so adamant that IELR+LAC=canonical LR.
That is, as the referenced section in the manual documents, LAC
does not fix infinite parsing loops on syntax errors.
* doc/bison.texinfo: Consistently drop the "(1)" suffix from LALR,
IELR, and LR in @cindex.
(%define Summary): Condense the entries for lr.default-reductions,
lr.keep-unreachable-states, lr.type, and parse.lac into brief
summaries, and cross-reference the appropriate subsections of
Tuning LR.  For parse.lac, mention that it's only implemented for
deterministic parsers in C.
(Error Reporting): When mentioning %error-verbose, mention LAC,
and add cross-reference to the LAC section.
(Tuning LR): New section with an extended version of the
documentation removed from %define Summary.  Change all
cross-references in the manual to point here instead of there.
(Calc++ Parser): When mentioning %error-verbose, mention LAC, and
add cross-reference to the LAC section.
(Table of Symbols): In %error-verbose and YYERROR_VERBOSE entries,
add cross-references to Error Reporting.
(Glossary): Capitalize entry titles consistently.  Add definitions
for "defaulted state" and "unreachable state".  Expand IELR
acronym in IELR's entry.
---
 ChangeLog         |   30 ++
 NEWS              |   44 ++--
 doc/bison.texinfo |  782 +++++++++++++++++++++++++++++++++--------------------
 3 files changed, 541 insertions(+), 315 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 82f6643..dacfddc 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,33 @@
+2011-03-06  Joel E. Denny  <address@hidden>
+
+       doc: create a new Tuning LR section in the manual.
+       And clean up all other documentation of the features described
+       there.
+       * NEWS (2.5): Tweak wording of lr.type and parse.lac entries a
+       bit, update the cross-references to the manual, and point out that
+       LAC has caveats.  Don't be so adamant that IELR+LAC=canonical LR.
+       That is, as the referenced section in the manual documents, LAC
+       does not fix infinite parsing loops on syntax errors.
+       * doc/bison.texinfo: Consistently drop the "(1)" suffix from LALR,
+       IELR, and LR in @cindex.
+       (%define Summary): Condense the entries for lr.default-reductions,
+       lr.keep-unreachable-states, lr.type, and parse.lac into brief
+       summaries, and cross-reference the appropriate subsections of
+       Tuning LR.  For parse.lac, mention that it's only implemented for
+       deterministic parsers in C.
+       (Error Reporting): When mentioning %error-verbose, mention LAC,
+       and add cross-reference to the LAC section.
+       (Tuning LR): New section with an extended version of the
+       documentation removed from %define Summary.  Change all
+       cross-references in the manual to point here instead of there.
+       (Calc++ Parser): When mentioning %error-verbose, mention LAC, and
+       add cross-reference to the LAC section.
+       (Table of Symbols): In %error-verbose and YYERROR_VERBOSE entries,
+       add cross-references to Error Reporting.
+       (Glossary): Capitalize entry titles consistently.  Add definitions
+       for "defaulted state" and "unreachable state".  Expand IELR
+       acronym in IELR's entry.
+
 2011-02-20  Joel E. Denny  <address@hidden>
 
        doc: add bibliography to manual.
diff --git a/NEWS b/NEWS
index 37f0b6f..a657bcc 100644
--- a/NEWS
+++ b/NEWS
@@ -57,27 +57,27 @@ Bison News
     %define lr.type ielr
     %define lr.type canonical-lr
 
-  The default reduction optimization in the parser tables can also be
-  adjusted using `%define lr.default-reductions'.  See the documentation
-  for `%define lr.type' and `%define lr.default-reductions' in the
-  section `Bison Declaration Summary' in the Bison manual for the
-  details.
+  The default-reduction optimization in the parser tables can also be
+  adjusted using `%define lr.default-reductions'.  For details on both
+  of these features, see the new section `Tuning LR' in the Bison
+  manual.
 
   These features are experimental.  More user feedback will help to
   stabilize them.
 
-** LAC (lookahead correction) for syntax error handling:
+** LAC (Lookahead Correction) for syntax error handling:
 
   Canonical LR, IELR, and LALR can suffer from a couple of problems
   upon encountering a syntax error.  First, the parser might perform
   additional parser stack reductions before discovering the syntax
-  error.  Such reductions perform user semantic actions that are
+  error.  Such reductions can perform user semantic actions that are
   unexpected because they are based on an invalid token, and they
   cause error recovery to begin in a different syntactic context than
   the one in which the invalid token was encountered.  Second, when
-  verbose error messages are enabled (with %error-verbose or `#define
-  YYERROR_VERBOSE'), the expected token list in the syntax error
-  message can both contain invalid tokens and omit valid tokens.
+  verbose error messages are enabled (with %error-verbose or the
+  obsolete `#define YYERROR_VERBOSE'), the expected token list in the
+  syntax error message can both contain invalid tokens and omit valid
+  tokens.
 
   The culprits for the above problems are %nonassoc, default
   reductions in inconsistent states, and parser state merging.  Thus,
@@ -85,11 +85,11 @@ Bison News
   %nonassoc is used or if default reductions are enabled for
   inconsistent states.
 
-  LAC is a new mechanism within the parsing algorithm that completely
-  solves these problems for canonical LR, IELR, and LALR without
-  sacrificing %nonassoc, default reductions, or state mering.  When
-  LAC is in use, canonical LR and IELR behave exactly the same for
-  both syntactically acceptable and syntactically unacceptable input.
+  LAC is a new mechanism within the parsing algorithm that solves
+  these problems for canonical LR, IELR, and LALR without sacrificing
+  %nonassoc, default reductions, or state merging.  When LAC is in
+  use, canonical LR and IELR behave almost exactly the same for both
+  syntactically acceptable and syntactically unacceptable input.
   While LALR still does not support the full language-recognition
   power of canonical LR and IELR, LAC at least enables LALR's syntax
   error handling to correctly reflect LALR's language-recognition
@@ -100,8 +100,8 @@ Bison News
 
     %define parse.lac full
 
-  See the documentation for `%define parse.lac' in the section `Bison
-  Declaration Summary' in the Bison manual for additional details.
+  See the new section `LAC' in the Bison manual for additional
+  details including a few caveats.
 
   LAC is an experimental feature.  More user feedback will help to
   stabilize it.
@@ -255,11 +255,11 @@ Bison News
 
 ** Verbose syntax error message fixes:
 
-  When %error-verbose or `#define YYERROR_VERBOSE' is specified,
-  syntax error messages produced by the generated parser include the
-  unexpected token as well as a list of expected tokens.  The effect
-  of %nonassoc on these verbose messages has been corrected in two
-  ways, but a complete fix requires LAC, described above:
+  When %error-verbose or the obsolete `#define YYERROR_VERBOSE' is
+  specified, syntax error messages produced by the generated parser
+  include the unexpected token as well as a list of expected tokens.
+  The effect of %nonassoc on these verbose messages has been corrected
+  in two ways, but a more complete fix requires LAC, described above:
 
 *** When %nonassoc is used, there can exist parser states that accept no
     tokens, and so the parser does not always require a lookahead token
diff --git a/doc/bison.texinfo b/doc/bison.texinfo
index 8898570..b226726 100644
--- a/doc/bison.texinfo
+++ b/doc/bison.texinfo
@@ -265,6 +265,7 @@ The Bison Parser Algorithm
 * Parser States::     The parser is a finite-state-machine with stack.
 * Reduce/Reduce::     When two rules are applicable in the same situation.
 * Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.
+* Tuning LR::         How to tune fundamental aspects of LR-based parsing.
 * Generalized LR Parsing::  Parsing arbitrary context-free grammars.
 * Memory Management:: What happens when memory is exhausted.  How to avoid it.
 
@@ -275,6 +276,13 @@ Operator Precedence
 * Precedence Examples::  How these features are used in the previous example.
 * How Precedence::    How they work.
 
+Tuning LR
+
+* LR Table Construction:: Choose a different construction algorithm.
+* Default Reductions::    Disable default reductions.
+* LAC::                   Correct lookahead sets in the parser states.
+* Unreachable States::    Keep unreachable parser states for debugging.
+
 Handling Context Dependencies
 
 * Semantic Tokens::   Token parsing can depend on the semantic context.
@@ -471,21 +479,19 @@ order to specify the language Algol 60.  Any grammar expressed in
 BNF is a context-free grammar.  The input to Bison is
 essentially machine-readable BNF.
 
-@cindex LALR(1) grammars
-@cindex IELR(1) grammars
-@cindex LR(1) grammars
-There are various important subclasses of context-free grammars.
-Although it can handle almost all context-free grammars, Bison is
-optimized for what are called LR(1) grammars.
-In brief, in these grammars, it must be possible to tell how to parse
-any portion of an input string with just a single token of lookahead.
-For historical reasons, Bison by default is limited by the additional
-restrictions of LALR(1), which is hard to explain simply.
-@xref{Mystery Conflicts, ,Mysterious Reduce/Reduce Conflicts}, for
-more information on this.
-As an experimental feature, you can escape these additional restrictions by
-requesting IELR(1) or canonical LR(1) parser tables.
-@xref{%define Summary,,lr.type}, to learn how.
+@cindex LALR grammars
+@cindex IELR grammars
+@cindex LR grammars
+There are various important subclasses of context-free grammars.  Although
+it can handle almost all context-free grammars, Bison is optimized for what
+are called LR(1) grammars.  In brief, in these grammars, it must be possible
+to tell how to parse any portion of an input string with just a single token
+of lookahead.  For historical reasons, Bison by default is limited by the
+additional restrictions of LALR(1), which is hard to explain simply.
+@xref{Mystery Conflicts, ,Mysterious Reduce/Reduce Conflicts}, for more
+information on this.  As an experimental feature, you can escape these
+additional restrictions by requesting IELR(1) or canonical LR(1) parser
+tables.  @xref{LR Table Construction}, to learn how.
 
 @cindex GLR parsing
 @cindex generalized LR (GLR) parsing
@@ -5150,65 +5156,17 @@ More user feedback will help to stabilize it.)
 @c ================================================== lr.default-reductions
 
 @item lr.default-reductions
-@cindex default reductions
 @findex %define lr.default-reductions
-@cindex delayed syntax errors
-@cindex syntax errors delayed
-@cindex LAC
-@findex %nonassoc
 
 @itemize @bullet
 @item Language(s): all
 
 @item Purpose: Specify the kind of states that are permitted to
-contain default reductions.
-That is, in such a state, Bison selects the reduction with the largest
-lookahead set to be the default parser action and then removes that
-lookahead set.
-(The ability to specify where default reductions should be used is
-experimental.
-More user feedback will help to stabilize it.)
-
-@item Accepted Values:
-@itemize
-@item @code{all}.
-This is the traditional Bison behavior.  The main advantage is a
-significant decrease in the size of the parser tables.  The
-disadvantage is that, when the generated parser encounters a
-syntactically unacceptable token, the parser might then perform
-unnecessary default reductions before it can detect the syntax error.
-Such delayed syntax error detection is usually inherent in LALR and
-IELR parser tables anyway due to LR state merging (@pxref{%define
-Summary,,lr.type}).  Furthermore, the use of @code{%nonassoc} can
-contribute to delayed syntax error detection even in the case of
-canonical LR.  As an experimental feature, delayed syntax error
-detection can be overcome in all cases by enabling LAC (@pxref{%define
-Summary,,parse.lac}, for details, including a discussion of the
-effects of delayed syntax error detection).
-
-@item @code{consistent}.
-@cindex consistent states
-A consistent state is a state that has only one possible action.
-If that action is a reduction, then the parser does not need to request
-a lookahead token from the scanner before performing that action.
-However, the parser recognizes the ability to ignore the lookahead token
-in this way only when such a reduction is encoded as a default
-reduction.
-Thus, if default reductions are permitted only in consistent states,
-then a canonical LR parser that does not employ
-@code{%nonassoc} detects a syntax error as soon as it @emph{needs} the
-syntactically unacceptable token from the scanner.
-
-@item @code{accepting}.
-@cindex accepting state
-In the accepting state, the default reduction is actually the accept
-action.
-In this case, a canonical LR parser that does not employ
-@code{%nonassoc} detects a syntax error as soon as it @emph{reaches} the
-syntactically unacceptable token in the input.
-That is, it does not perform any extra reductions.
-@end itemize
+contain default reductions.  @xref{Default Reductions}.  (The ability to
+specify where default reductions should be used is experimental.  More user
+feedback will help to stabilize it.)
 
+@item Accepted Values: @code{all}, @code{consistent}, @code{accepting}
 @item Default Value:
 @itemize
 @item @code{accepting} if @code{lr.type} is @code{canonical-lr}.
@@ -5223,129 +5181,25 @@ That is, it does not perform any extra reductions.
 
 @itemize @bullet
 @item Language(s): all
-
 @item Purpose: Request that Bison allow unreachable parser states to
-remain in the parser tables.
-Bison considers a state to be unreachable if there exists no sequence of
-transitions from the start state to that state.
-A state can become unreachable during conflict resolution if Bison disables a
-shift action leading to it from a predecessor state.
-Keeping unreachable states is sometimes useful for analysis purposes, but they
-are useless in the generated parser.
-
+remain in the parser tables.  @xref{Unreachable States}.
 @item Accepted Values: Boolean
-
 @item Default Value: @code{false}
-
-@item Caveats:
-
-@itemize @bullet
-
-@item Unreachable states may contain conflicts and may use rules not used in
-any other state.
-Thus, keeping unreachable states may induce warnings that are irrelevant to
-your parser's behavior, and it may eliminate warnings that are relevant.
-Of course, the change in warnings may actually be relevant to a parser table
-analysis that wants to keep unreachable states, so this behavior will likely
-remain in future Bison releases.
-
-@item While Bison is able to remove unreachable states, it is not guaranteed to
-remove other kinds of useless states.
-Specifically, when Bison disables reduce actions during conflict resolution,
-some goto actions may become useless, and thus some additional states may
-become useless.
-If Bison were to compute which goto actions were useless and then disable those
-actions, it could identify such states as unreachable and then remove those
-states.
-However, Bison does not compute which goto actions are useless.
-@end itemize
 @end itemize
 
 @c ================================================== lr.type
 
 @item lr.type
 @findex %define lr.type
-@cindex LALR
-@cindex IELR
-@cindex LR
 
 @itemize @bullet
 @item Language(s): all
 
 @item Purpose: Specify the type of parser tables within the
-LR(1) family.
-(This feature is experimental.
+LR(1) family.  @xref{LR Table Construction}.  (This feature is experimental.
 More user feedback will help to stabilize it.)
 
-@item Accepted Values:
-@itemize
-@item @code{lalr}.
-While Bison generates LALR parser tables by default for
-historical reasons, IELR or canonical LR is almost
-always preferable for deterministic parsers.
-The trouble is that LALR parser tables can suffer from
-mysterious conflicts and thus may not accept the full set of sentences
-that IELR and canonical LR accept.
-@xref{Mystery Conflicts}, for details.
-However, there are at least two scenarios where LALR may be
-worthwhile:
-@itemize
-@cindex GLR with LALR
-@item When employing GLR parsers (@pxref{GLR Parsers}), if you
-do not resolve any conflicts statically (for example, with @code{%left}
-or @code{%prec}), then the parser explores all potential parses of any
-given input.
-In this case, the use of LALR parser tables is guaranteed not
-to alter the language accepted by the parser.
-LALR parser tables are the smallest parser tables Bison can
-currently generate, so they may be preferable.
-Nevertheless, once you begin to resolve conflicts statically,
-GLR begins to behave more like a deterministic parser, and so
-IELR and canonical LR can be helpful to avoid
-LALR's mysterious behavior.
-
-@item Occasionally during development, an especially malformed grammar
-with a major recurring flaw may severely impede the IELR or
-canonical LR parser table generation algorithm.
-LALR can be a quick way to generate parser tables in order to
-investigate such problems while ignoring the more subtle differences
-from IELR and canonical LR.
-@end itemize
-
-@item @code{ielr}.
-IELR is a minimal LR algorithm.
-That is, given any grammar (LR or non-LR),
-IELR and canonical LR always accept exactly the same
-set of sentences.
-However, as for LALR, the number of parser states is often an
-order of magnitude less for IELR than for canonical
-LR.
-More importantly, because canonical LR's extra parser states
-may contain duplicate conflicts in the case of non-LR
-grammars, the number of conflicts for IELR is often an order
-of magnitude less as well.
-This can significantly reduce the complexity of developing of a grammar.
-
-@item @code{canonical-lr}.
-@cindex delayed syntax errors
-@cindex syntax errors delayed
-@cindex LAC
-@findex %nonassoc
-While inefficient, canonical LR parser tables can be an interesting
-means to explore a grammar because they have a property that IELR and
-LALR tables do not.  That is, if @code{%nonassoc} is not used and
-default reductions are left disabled (@pxref{%define
-Summary,,lr.default-reductions}), then, for every left context of
-every canonical LR state, the set of tokens accepted by that state is
-guaranteed to be the exact set of tokens that is syntactically
-acceptable in that left context.  It might then seem that an advantage
-of canonical LR parsers in production is that, under the above
-constraints, they are guaranteed to detect a syntax error as soon as
-possible without performing any unnecessary reductions.  However, IELR
-parsers using LAC (@pxref{%define Summary,,parse.lac}) are also able
-to achieve this behavior without sacrificing @code{%nonassoc} or
-default reductions.
-@end itemize
+@item Accepted Values: @code{lalr}, @code{ielr}, @code{canonical-lr}
 
 @item Default Value: @code{lalr}
 @end itemize
@@ -5405,84 +5259,13 @@ The parser namespace is @code{foo} and @code{yylex} is referenced as
 @c ================================================== parse.lac
 @item parse.lac
 @findex %define parse.lac
-@cindex LAC
-@cindex lookahead correction
 
 @itemize
-@item Languages(s): C
+@item Languages(s): C (deterministic parsers only)
 
 @item Purpose: Enable LAC (lookahead correction) to improve
-syntax error handling.
-
-Canonical LR, IELR, and LALR can suffer
-from a couple of problems upon encountering a syntax error.  First, the
-parser might perform additional parser stack reductions before
-discovering the syntax error.  Such reductions perform user semantic
-actions that are unexpected because they are based on an invalid token,
-and they cause error recovery to begin in a different syntactic context
-than the one in which the invalid token was encountered.  Second, when
-verbose error messages are enabled (with @code{%error-verbose} or
-@code{#define YYERROR_VERBOSE}), the expected token list in the syntax
-error message can both contain invalid tokens and omit valid tokens.
-
-The culprits for the above problems are @code{%nonassoc}, default
-reductions in inconsistent states, and parser state merging.  Thus,
-IELR and LALR suffer the most.  Canonical
-LR can suffer only if @code{%nonassoc} is used or if default
-reductions are enabled for inconsistent states.
-
-LAC is a new mechanism within the parsing algorithm that
-completely solves these problems for canonical LR,
-IELR, and LALR without sacrificing @code{%nonassoc},
-default reductions, or state mering.  Conceptually, the mechanism is
-straight-forward.  Whenever the parser fetches a new token from the
-scanner so that it can determine the next parser action, it immediately
-suspends normal parsing and performs an exploratory parse using a
-temporary copy of the normal parser state stack.  During this
-exploratory parse, the parser does not perform user semantic actions.
-If the exploratory parse reaches a shift action, normal parsing then
-resumes on the normal parser stacks.  If the exploratory parse reaches
-an error instead, the parser reports a syntax error.  If verbose syntax
-error messages are enabled, the parser must then discover the list of
-expected tokens, so it performs a separate exploratory parse for each
-token in the grammar.
-
-There is one subtlety about the use of LAC.  That is, when in a
-consistent parser state with a default reduction, the parser will not
-attempt to fetch a token from the scanner because no lookahead is
-needed to determine the next parser action.  Thus, whether default
-reductions are enabled in consistent states (@pxref{%define
-Summary,,lr.default-reductions}) affects how soon the parser detects a
-syntax error: when it @emph{reaches} an erroneous token or when it
-eventually @emph{needs} that token as a lookahead.  The latter
-behavior is probably more intuitive, so Bison currently provides no
-way to achieve the former behavior while default reductions are fully
-enabled.
-
-Thus, when LAC is in use, for some fixed decision of whether
-to enable default reductions in consistent states, canonical
-LR and IELR behave exactly the same for both
-syntactically acceptable and syntactically unacceptable input.  While
-LALR still does not support the full language-recognition
-power of canonical LR and IELR, LAC at
-least enables LALR's syntax error handling to correctly
-reflect LALR's language-recognition power.
-
-Because LAC requires many parse actions to be performed twice,
-it can have a performance penalty.  However, not all parse actions must
-be performed twice.  Specifically, during a series of default reductions
-in consistent states and shift actions, the parser never has to initiate
-an exploratory parse.  Moreover, the most time-consuming tasks in a
-parse are often the file I/O, the lexical analysis performed by the
-scanner, and the user's semantic actions, but none of these are
-performed during the exploratory parse.  Finally, the base of the
-temporary stack used during an exploratory parse is a pointer into the
-normal parser state stack so that the stack is never physically copied.
-In our experience, the performance penalty of LAC has proven
-insignificant for practical grammars.
-
+syntax error handling.  @xref{LAC}.
 @item Accepted Values: @code{none}, @code{full}
-
 @item Default Value: @code{none}
 @end itemize
 @end itemize
@@ -6075,10 +5858,11 @@ receives one argument.  For a syntax error, the string is normally
 @w{@code{"syntax error"}}.
 
 @findex %error-verbose
-If you invoke the directive @code{%error-verbose} in the Bison
-declarations section (@pxref{Bison Declarations, ,The Bison Declarations
-Section}), then Bison provides a more verbose and specific error message
-string instead of just plain @w{@code{"syntax error"}}.
+If you invoke the directive @code{%error-verbose} in the Bison declarations
+section (@pxref{Bison Declarations, ,The Bison Declarations Section}), then
+Bison provides a more verbose and specific error message string instead of
+just plain @w{@code{"syntax error"}}.  However, that message sometimes
+contains incorrect information if LAC is not enabled (@pxref{LAC}).
 
 The parser can detect one other kind of error: memory exhaustion.  This
 can happen when the input contains constructions that are very deeply
@@ -6479,6 +6263,7 @@ This kind of parser is known in the literature as a bottom-up parser.
 * Parser States::     The parser is a finite-state-machine with stack.
 * Reduce/Reduce::     When two rules are applicable in the same situation.
 * Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.
+* Tuning LR::         How to tune fundamental aspects of LR-based parsing.
 * Generalized LR Parsing::  Parsing arbitrary context-free grammars.
 * Memory Management:: What happens when memory is exhausted.  How to avoid it.
 @end menu
@@ -6996,6 +6781,7 @@ redirects:redirect
 
 @node Mystery Conflicts
 @section Mysterious Reduce/Reduce Conflicts
+@cindex Mysterious Conflicts
 
 Sometimes reduce/reduce conflicts can occur that don't look warranted.
 Here is an example:
@@ -7037,8 +6823,8 @@ of lookahead: when a @code{param_spec} is being read, an @code{ID} is
 a @code{name} if a comma or colon follows, or a @code{type} if another
 @code{ID} follows.  In other words, this grammar is LR(1).
 
-@cindex LR(1)
-@cindex LALR(1)
+@cindex LR
+@cindex LALR
 However, for historical reasons, Bison cannot by default handle all
 LR(1) grammars.
 In this grammar, two contexts, that after an @code{ID} at the beginning
@@ -7053,15 +6839,16 @@ contexts, so it makes a single parser state for them both.  Combining
 the two contexts causes a conflict later.  In parser terminology, this
 occurrence means that the grammar is not LALR(1).
 
-For many practical grammars (specifically those that fall into the
-non-LR(1) class), the limitations of LALR(1) result in difficulties
-beyond just mysterious reduce/reduce conflicts.  The best way to fix
-all these problems is to select a different parser table generation
-algorithm.  Either IELR(1) or canonical LR(1) would suffice, but the
-former is more efficient and easier to debug during development.
-@xref{%define Summary,,lr.type}, for details.  (Bison's IELR(1) and
-canonical LR(1) implementations are experimental.  More user feedback
-will help to stabilize them.)
+@cindex IELR
+@cindex canonical LR
+For many practical grammars (specifically those that fall into the non-LR(1)
+class), the limitations of LALR(1) result in difficulties beyond just
+mysterious reduce/reduce conflicts.  The best way to fix all these problems
+is to select a different parser table construction algorithm.  Either
+IELR(1) or canonical LR(1) would suffice, but the former is more efficient
+and easier to debug during development.  @xref{LR Table Construction}, for
+details.  (Bison's IELR(1) and canonical LR(1) implementations are
+experimental.  More user feedback will help to stabilize them.)
 
 If you instead wish to work around LALR(1)'s limitations, you
 can often fix a mysterious conflict by identifying the two parser states
@@ -7112,6 +6899,409 @@ return_spec:
 For a more detailed exposition of LALR(1) parsers and parser
 generators, @pxref{Bibliography,,DeRemer 1982}.
 
+@node Tuning LR
+@section Tuning LR
+
+The default behavior of Bison's LR-based parsers is chosen mostly for
+historical reasons, but that behavior is often not robust.  For example, in
+the previous section, we discussed the mysterious conflicts that can be
+produced by LALR(1), Bison's default parser table construction algorithm.
+Another example is Bison's @code{%error-verbose} directive, which instructs
+the generated parser to produce verbose syntax error messages, which can
+sometimes contain incorrect information.
+
+In this section, we explore several modern features of Bison that allow you
+to tune fundamental aspects of the generated LR-based parsers.  Some of
+these features easily eliminate shortcomings like those mentioned above.
+Others can be helpful purely for understanding your parser.
+
+Most of the features discussed in this section are still experimental.  More
+user feedback will help to stabilize them.
+
+@menu
+* LR Table Construction:: Choose a different construction algorithm.
+* Default Reductions::    Disable default reductions.
+* LAC::                   Correct lookahead sets in the parser states.
+* Unreachable States::    Keep unreachable parser states for debugging.
+@end menu
+
+@node LR Table Construction
+@subsection LR Table Construction
+@cindex Mysterious Conflict
+@cindex LALR
+@cindex IELR
+@cindex canonical LR
+@findex %define lr.type
+
+For historical reasons, Bison constructs LALR(1) parser tables by default.
+However, LALR does not possess the full language-recognition power of LR.
+As a result, the behavior of parsers employing LALR parser tables is often
+mysterious.  We presented a simple example of this effect in @ref{Mystery
+Conflicts}.
+
+As we also demonstrated in that example, the traditional approach to
+eliminating such mysterious behavior is to restructure the grammar.
+Unfortunately, doing so correctly is often difficult.  Moreover, merely
+discovering that LALR causes mysterious behavior in your parser can be
+difficult as well.
+
+Fortunately, Bison provides an easy way to eliminate the possibility of such
+mysterious behavior altogether.  You simply need to activate a more powerful
+parser table construction algorithm by using the @code{%define lr.type}
+directive.
+
+@deffn {Directive} {%define lr.type @var{TYPE}}
+Specify the type of parser tables within the LR(1) family.  The accepted
+values for @var{TYPE} are:
+
+@itemize
+@item @code{lalr} (default)
+@item @code{ielr}
+@item @code{canonical-lr}
+@end itemize
+
+(This feature is experimental. More user feedback will help to stabilize
+it.)
+@end deffn
+
+For example, to activate IELR, you might add the following directive to your
+grammar file:
+
+@example
+%define lr.type ielr
+@end example
+
+@noindent For the example in @ref{Mystery Conflicts}, the mysterious
+conflict is then eliminated, so there is no need to invest time in
+comprehending the conflict or restructuring the grammar to fix it.  If,
+during future development, the grammar evolves such that all mysterious
+behavior would have disappeared using just LALR, you need not fear that
+continuing to use IELR will result in unnecessarily large parser tables.
+That is, IELR generates LALR tables when LALR (using a deterministic parsing
+algorithm) is sufficient to support the full language-recognition power of
+LR.  Thus, by enabling IELR at the start of grammar development, you can
+safely and completely eliminate the need to consider LALR's shortcomings.
+
+While IELR is almost always preferable, there are circumstances where LALR
+or the canonical LR parser tables described by Knuth
+(@pxref{Bibliography,,Knuth 1965}) can be useful.  Here we summarize the
+relative advantages of each parser table construction algorithm within
+Bison:
+
+@itemize
+@item LALR
+
+There are at least two scenarios where LALR can be worthwhile:
+
+@itemize
+@item GLR without static conflict resolution.
+
+@cindex GLR with LALR
+When employing GLR parsers (@pxref{GLR Parsers}), if you do not resolve any
+conflicts statically (for example, with @code{%left} or @code{%prec}), then
+the parser explores all potential parses of any given input.  In this case,
+the choice of parser table construction algorithm is guaranteed not to alter
+the language accepted by the parser.  LALR parser tables are the smallest
+parser tables Bison can currently construct, so they may then be preferable.
+Nevertheless, once you begin to resolve conflicts statically, GLR behaves
+more like a deterministic parser in the syntactic contexts where those
+conflicts appear, and so either IELR or canonical LR can then be helpful to
+avoid LALR's mysterious behavior.
+
+@item Malformed grammars.
+
+Occasionally during development, an especially malformed grammar with a
+major recurring flaw may severely impede the IELR or canonical LR parser
+table construction algorithm.  LALR can be a quick way to construct parser
+tables in order to investigate such problems while ignoring the more subtle
+differences from IELR and canonical LR.
+@end itemize
+
+@item IELR
+
+IELR (Inadequacy Elimination LR) is a minimal LR algorithm.  That is, given
+any grammar (LR or non-LR), parsers using IELR or canonical LR parser tables
+always accept exactly the same set of sentences.  However, like LALR, IELR
+merges parser states during parser table construction so that the number of
+parser states is often an order of magnitude less than for canonical LR.
+More importantly, because canonical LR's extra parser states may contain
+duplicate conflicts in the case of non-LR grammars, the number of conflicts
+for IELR is often an order of magnitude less as well.  This effect can
+significantly reduce the complexity of developing a grammar.
+
+@item Canonical LR
+
+@cindex delayed syntax error detection
+@cindex LAC
+@findex %nonassoc
+While inefficient, canonical LR parser tables can be an interesting means to
+explore a grammar because they possess a property that IELR and LALR tables
+do not.  That is, if @code{%nonassoc} is not used and default reductions are
+left disabled (@pxref{Default Reductions}), then, for every left context of
+every canonical LR state, the set of tokens accepted by that state is
+guaranteed to be the exact set of tokens that is syntactically acceptable in
+that left context.  It might then seem that an advantage of canonical LR
+parsers in production is that, under the above constraints, they are
+guaranteed to detect a syntax error as soon as possible without performing
+any unnecessary reductions.  However, IELR parsers that use LAC are also
+able to achieve this behavior without sacrificing @code{%nonassoc} or
+default reductions.  For details and a few caveats of LAC, @pxref{LAC}.
+@end itemize
+
+For a more detailed exposition of the mysterious behavior in LALR parsers
+and the benefits of IELR, @pxref{Bibliography,,Denny 2008 March}, and
+@pxref{Bibliography,,Denny 2010 November}.
+
+@node Default Reductions
+@subsection Default Reductions
+@cindex default reductions
+@findex %define lr.default-reductions
+@findex %nonassoc
+
+After parser table construction, Bison identifies the reduction with the
+largest lookahead set in each parser state.  To reduce the size of the
+parser state, traditional Bison behavior is to remove that lookahead set and
+to assign that reduction to be the default parser action.  Such a reduction
+is known as a @dfn{default reduction}.
+
+Default reductions affect more than the size of the parser tables.  They
+also affect the behavior of the parser:
+
+@itemize
+@item Delayed @code{yylex} invocations.
+
+@cindex delayed yylex invocations
+@cindex consistent states
+@cindex defaulted states
+A @dfn{consistent state} is a state that has only one possible parser
+action.  If that action is a reduction and is encoded as a default
+reduction, then that consistent state is called a @dfn{defaulted state}.
+Upon reaching a defaulted state, a Bison-generated parser does not bother to
+invoke @code{yylex} to fetch the next token before performing the reduction.
+In other words, whether default reductions are enabled in consistent states
+determines how soon a Bison-generated parser invokes @code{yylex} for a
+token: immediately when it @emph{reaches} that token in the input or when it
+eventually @emph{needs} that token as a lookahead to determine the next
+parser action.  Traditionally, default reductions are enabled, and so the
+parser exhibits the latter behavior.
+
+The presence of defaulted states is an important consideration when
+designing @code{yylex} and the grammar file.  That is, if the behavior of
+@code{yylex} can influence or be influenced by the semantic actions
+associated with the reductions in defaulted states, then the delay of the
+next @code{yylex} invocation until after those reductions is significant.
+For example, the semantic actions might pop a scope stack that @code{yylex}
+uses to determine what token to return.  Thus, the delay might be necessary
+to ensure that @code{yylex} does not look up the next token in a scope that
+should already be considered closed.
+
+@item Delayed syntax error detection.
+
+@cindex delayed syntax error detection
+When the parser fetches a new token by invoking @code{yylex}, it checks
+whether there is an action for that token in the current parser state.  The
+parser detects a syntax error if and only if either (1) there is no action
+for that token or (2) the action for that token is the error action (due to
+the use of @code{%nonassoc}).  However, if there is a default reduction in
+that state (which might or might not be a defaulted state), then it is
+impossible for condition 1 to exist.  That is, all tokens have an action.
+Thus, the parser sometimes fails to detect the syntax error until it reaches
+a later state.
+
+@cindex LAC
+@c If there's an infinite loop, default reductions can prevent an incorrect
+@c sentence from being rejected.
+While default reductions never cause the parser to accept syntactically
+incorrect sentences, the delay of syntax error detection can have unexpected
+effects on the behavior of the parser.  However, the delay can be caused
+anyway by parser state merging and the use of @code{%nonassoc}, and it can
+be fixed by another Bison feature, LAC.  We discuss the effects of delayed
+syntax error detection and LAC more in the next section (@pxref{LAC}).
+@end itemize
+
+For canonical LR, the only default reduction that Bison enables by default
+is the accept action, which appears only in the accepting state, which has
+no other action and is thus a defaulted state.  However, the default accept
+action does not delay any @code{yylex} invocation or syntax error detection
+because the accept action ends the parse.
+
+For LALR and IELR, Bison enables default reductions in nearly all states by
+default.  There are only two exceptions.  First, states that have a shift
+action on the @code{error} token do not have default reductions because
+delayed syntax error detection could then prevent the @code{error} token
+from ever being shifted in that state.  However, parser state merging can
+cause the same effect anyway, and LAC fixes it in both cases, so future
+versions of Bison might drop this exception when LAC is activated.  Second,
+GLR parsers do not record the default reduction as the action on a lookahead
+token for which there is a conflict.  The correct action in this case is to
+split the parse instead.
+
+To adjust which states have default reductions enabled, use the
+@code{%define lr.default-reductions} directive.
+
+@deffn {Directive} {%define lr.default-reductions @var{WHERE}}
+Specify the kind of states that are permitted to contain default reductions.
+The accepted values of @var{WHERE} are:
+@itemize
+@item @code{all} (default for LALR and IELR)
+@item @code{consistent}
+@item @code{accepting} (default for canonical LR)
+@end itemize
+
+(The ability to specify where default reductions are permitted is
+experimental.  More user feedback will help to stabilize it.)
+@end deffn
+
+FIXME: Because of the exceptions described above, @code{all} is a misnomer.
+Rename to @code{full}.
+
+@node LAC
+@subsection LAC
+@findex %define parse.lac
+@cindex LAC
+@cindex lookahead correction
+
+Canonical LR, IELR, and LALR can suffer from a couple of problems upon
+encountering a syntax error.  First, the parser might perform additional
+parser stack reductions before discovering the syntax error.  Such
+reductions can perform user semantic actions that are unexpected because
+they are based on an invalid token, and they cause error recovery to begin
+in a different syntactic context than the one in which the invalid token was
+encountered.  Second, when verbose error messages are enabled (@pxref{Error
+Reporting}), the expected token list in the syntax error message can both
+contain invalid tokens and omit valid tokens.
+
+The culprits for the above problems are @code{%nonassoc}, default reductions
+in inconsistent states (@pxref{Default Reductions}), and parser state
+merging.  Because IELR and LALR merge parser states, they suffer the most.
+Canonical LR can suffer only if @code{%nonassoc} is used or if default
+reductions are enabled for inconsistent states.
+
+LAC (Lookahead Correction) is a new mechanism within the parsing algorithm
+that solves these problems for canonical LR, IELR, and LALR without
+sacrificing @code{%nonassoc}, default reductions, or state merging.  You can
+enable LAC with the @code{%define parse.lac} directive.
+
+@deffn {Directive} {%define parse.lac @var{VALUE}}
+Enable LAC to improve syntax error handling.
+@itemize
+@item @code{none} (default)
+@item @code{full}
+@end itemize
+(This feature is experimental.  More user feedback will help to stabilize
+it.  Moreover, it is currently only available for deterministic parsers in
+C.)
+@end deffn
+
+Conceptually, the LAC mechanism is straightforward.  Whenever the parser
+fetches a new token from the scanner so that it can determine the next
+parser action, it immediately suspends normal parsing and performs an
+exploratory parse using a temporary copy of the normal parser state stack.
+During this exploratory parse, the parser does not perform user semantic
+actions.  If the exploratory parse reaches a shift action, normal parsing
+then resumes on the normal parser stacks.  If the exploratory parse reaches
+an error instead, the parser reports a syntax error.  If verbose syntax
+error messages are enabled, the parser must then discover the list of
+expected tokens, so it performs a separate exploratory parse for each token
+in the grammar.
+
+There is one subtlety about the use of LAC.  That is, when in a consistent
+parser state with a default reduction, the parser will not attempt to fetch
+a token from the scanner because no lookahead is needed to determine the
+next parser action.  Thus, whether default reductions are enabled in
+consistent states (@pxref{Default Reductions}) affects how soon the parser
+detects a syntax error: immediately when it @emph{reaches} an erroneous
+token or when it eventually @emph{needs} that token as a lookahead to
+determine the next parser action.  The latter behavior is probably more
+intuitive, so Bison currently provides no way to achieve the former behavior
+while default reductions are enabled in consistent states.
+
+Thus, when LAC is in use, for some fixed decision of whether to enable
+default reductions in consistent states, canonical LR and IELR behave almost
+exactly the same for both syntactically acceptable and syntactically
+unacceptable input.  While LALR still does not support the full
+language-recognition power of canonical LR and IELR, LAC at least enables
+LALR's syntax error handling to correctly reflect LALR's
+language-recognition power.
+
+There are a few caveats to consider when using LAC:
+
+@itemize
+@item Infinite parsing loops.
+
+IELR plus LAC does have one shortcoming relative to canonical LR.  Some
+parsers generated by Bison can loop infinitely.  LAC does not fix infinite
+parsing loops that occur between encountering a syntax error and detecting
+it, but enabling canonical LR or disabling default reductions sometimes
+does.
+
+@item Verbose error message limitations.
+
+Because of internationalization considerations, Bison-generated parsers
+limit the size of the expected token list they are willing to report in a
+verbose syntax error message.  If the number of expected tokens exceeds that
+limit, the list is simply dropped from the message.  Enabling LAC can
+increase the size of the list and thus cause the parser to drop it.  Of
+course, dropping the list is better than reporting an incorrect list.
+
+@item Performance.
+
+Because LAC requires many parse actions to be performed twice, it can have a
+performance penalty.  However, not all parse actions must be performed
+twice.  Specifically, during a series of default reductions in consistent
+states and shift actions, the parser never has to initiate an exploratory
+parse.  Moreover, the most time-consuming tasks in a parse are often the
+file I/O, the lexical analysis performed by the scanner, and the user's
+semantic actions, but none of these are performed during the exploratory
+parse.  Finally, the base of the temporary stack used during an exploratory
+parse is a pointer into the normal parser state stack so that the stack is
+never physically copied.  In our experience, the performance penalty of LAC
+has proven insignificant for practical grammars.
+@end itemize
+
+@node Unreachable States
+@subsection Unreachable States
+@findex %define lr.keep-unreachable-states
+@cindex unreachable states
+
+If there exists no sequence of transitions from the parser's start state to
+some state @var{s}, then Bison considers @var{s} to be an @dfn{unreachable
+state}.  A state can become unreachable during conflict resolution if Bison
+disables a shift action leading to it from a predecessor state.
+
+By default, Bison removes unreachable states from the parser after conflict
+resolution because they are useless in the generated parser.  However,
+keeping unreachable states is sometimes useful when trying to understand the
+relationship between the parser and the grammar.
+
+@deffn {Directive} {%define lr.keep-unreachable-states @var{VALUE}}
+Request that Bison allow unreachable states to remain in the parser tables.
+@var{VALUE} must be a Boolean.  The default is @code{false}.
+@end deffn
+
+There are a few caveats to consider:
+
+@itemize @bullet
+@item Missing or extraneous warnings.
+
+Unreachable states may contain conflicts and may use rules not used in any
+other state.  Thus, keeping unreachable states may induce warnings that are
+irrelevant to your parser's behavior, and it may eliminate warnings that are
+relevant.  Of course, the change in warnings may actually be relevant to a
+parser table analysis that wants to keep unreachable states, so this
+behavior will likely remain in future Bison releases.
+
+@item Other useless states.
+
+While Bison is able to remove unreachable states, it is not guaranteed to
+remove other kinds of useless states.  Specifically, when Bison disables
+reduce actions during conflict resolution, some goto actions may become
+useless, and thus some additional states may become useless.  If Bison were
+to compute which goto actions were useless and then disable those actions,
+it could identify such states as unreachable and then remove those states.
+However, Bison does not compute which goto actions are useless.
+@end itemize
+
 @node Generalized LR Parsing
 @section Generalized LR (GLR) Parsing
 @cindex GLR parsing
@@ -8934,8 +9124,9 @@ automatically propagated.
 @end example
 
 @noindent
-Use the two following directives to enable parser tracing and verbose
-error messages.
+Use the two following directives to enable parser tracing and verbose error
+messages.  However, verbose error messages can contain incorrect information
+(@pxref{LAC}).
 
 @comment file: calc++-parser.yy
 @example
@@ -10267,9 +10458,9 @@ Precedence}.
 @end deffn
 @end ifset
 
-@deffn {Directive} %define @var{define-variable}
-@deffnx {Directive} %define @var{define-variable} @var{value}
-@deffnx {Directive} %define @var{define-variable} "@var{value}"
+@deffn {Directive} %define @var{variable}
+@deffnx {Directive} %define @var{variable} @var{value}
+@deffnx {Directive} %define @var{variable} "@var{value}"
 Define a variable to adjust Bison's behavior.  @xref{%define Summary}.
 @end deffn
 
@@ -10312,7 +10503,7 @@ token is reset to the token that originally caused the violation.
 
 @deffn {Directive} %error-verbose
 Bison declaration to request verbose, specific error message strings
-when @code{yyerror} is called.
+when @code{yyerror} is called.  @xref{Error Reporting}.
 @end deffn
 
 @deffn {Directive} %file-prefix "@var{prefix}"
@@ -10515,7 +10706,7 @@ An obsolete macro that you define with @code{#define} in the prologue
 to request verbose, specific error message strings
 when @code{yyerror} is called.  It doesn't matter what definition you
 use for @code{YYERROR_VERBOSE}, just whether you define it.  Using
-@code{%error-verbose} is preferred.
+@code{%error-verbose} is preferred.  @xref{Error Reporting}.
 @end deffn
 
 @deffn {Macro} YYINITDEPTH
@@ -10655,7 +10846,7 @@ Data type of semantic values; @code{int} by default.
 @cindex glossary
 
 @table @asis
-@item Accepting State
+@item Accepting state
 A state whose only action is the accept action.
 The accepting state is thus a consistent state.
 @xref{Understanding,,}.
@@ -10666,9 +10857,8 @@ by John Backus, and slightly improved by Peter Naur in his 1960-01-02
 committee document contributing to what became the Algol 60 report.
 @xref{Language and Grammar, ,Languages and Context-Free Grammars}.
 
-@item Consistent State
-A state containing only one possible action.  @xref{%define
-Summary,,lr.default-reductions}.
+@item Consistent state
+A state containing only one possible action.  @xref{Default Reductions}.
 
 @item Context-free grammars
 Grammars specified as rules that can be applied regardless of context.
@@ -10677,12 +10867,15 @@ expression, integers are allowed @emph{anywhere} an expression is
 permitted.  @xref{Language and Grammar, ,Languages and Context-Free
 Grammars}.
 
-@item Default Reduction
+@item Default reduction
 The reduction that a parser should perform if the current parser state
 contains no other action for the lookahead token.  In permitted parser
-states, Bison declares the reduction with the largest lookahead set to
-be the default reduction and removes that lookahead set.
-@xref{%define Summary,,lr.default-reductions}.
+states, Bison declares the reduction with the largest lookahead set to be
+the default reduction and removes that lookahead set.  @xref{Default
+Reductions}.
+
+@item Defaulted state
+A consistent state with a default reduction.  @xref{Default Reductions}.
 
 @item Dynamic allocation
 Allocation of memory that occurs during execution, rather than at
@@ -10714,17 +10907,16 @@ A language construct that is (in general) grammatically divisible;
 for example, `expression' or `declaration' in C@.
 @xref{Language and Grammar, ,Languages and Context-Free Grammars}.
 
-@item IELR(1)
-A minimal LR(1) parser table generation algorithm.  That is, given any
+@item IELR(1) (Inadequacy Elimination LR(1))
+A minimal LR(1) parser table construction algorithm.  That is, given any
 context-free grammar, IELR(1) generates parser tables with the full
-language recognition power of canonical LR(1) but with nearly the same
-number of parser states as LALR(1).  This reduction in parser states
-is often an order of magnitude.  More importantly, because canonical
-LR(1)'s extra parser states may contain duplicate conflicts in the
-case of non-LR(1) grammars, the number of conflicts for IELR(1) is
-often an order of magnitude less as well.  This can significantly
-reduce the complexity of developing of a grammar.  @xref{%define
-Summary,,lr.type}.
+language-recognition power of canonical LR(1) but with nearly the same
+number of parser states as LALR(1).  This reduction in parser states is
+often an order of magnitude.  More importantly, because canonical LR(1)'s
+extra parser states may contain duplicate conflicts in the case of non-LR(1)
+grammars, the number of conflicts for IELR(1) is often an order of magnitude
+less as well.  This can significantly reduce the complexity of developing a
+grammar.  @xref{LR Table Construction}.
 
 @item Infix operator
 An arithmetic operator that is placed between the operands on which it
@@ -10735,12 +10927,11 @@ A continuous flow of data between devices or programs.
 
 @item LAC (Lookahead Correction)
 A parsing mechanism that fixes the problem of delayed syntax error
-detection, which is caused by LR state merging, default reductions,
-and the use of @code{%nonassoc}.  Delayed syntax error detection
-results in unexpected semantic actions, initiation of error recovery
-in the wrong syntactic context, and an incorrect list of expected
-tokens in a verbose syntax error message.  @xref{%define
-Summary,,parse.lac}.
+detection, which is caused by LR state merging, default reductions, and the
+use of @code{%nonassoc}.  Delayed syntax error detection results in
+unexpected semantic actions, initiation of error recovery in the wrong
+syntactic context, and an incorrect list of expected tokens in a verbose
+syntax error message.  @xref{LAC}.
 
 @item Language construct
 One of the typical usage schemas of the language.  For example, one of
@@ -10856,6 +11047,11 @@ the lexical analyzer.  @xref{Symbols}.
 A grammar symbol that has no rules in the grammar and therefore is
 grammatically indivisible.  The piece of text it represents is a token.
 @xref{Language and Grammar, ,Languages and Context-Free Grammars}.
+
+@item Unreachable state
+A parser state to which there does not exist a sequence of transitions from
+the parser's start state.  A state can become unreachable during conflict
+resolution.  @xref{Unreachable States}.
 @end table
 
 @node Copying This Manual
-- 
1.7.0.4
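
As a reading aid for the Default Reductions text added by the patch above, here is a minimal C sketch of the "consistent state" and "defaulted state" definitions and of their effect on when yylex is invoked. The struct toy_state and the three function names are hypothetical illustrations and are not taken from Bison's sources; the sketch assumes a state can be summarized by its action count and two flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified summary of a parser state (not Bison's struct). */
struct toy_state {
  int num_actions;            /* number of distinct possible parser actions */
  bool sole_action_is_reduce; /* meaningful only when num_actions == 1 */
  bool default_reductions_on; /* default reductions permitted in this state */
};

/* A consistent state has only one possible parser action. */
static bool is_consistent(const struct toy_state *s) {
  assert(s != NULL);
  return s->num_actions == 1;
}

/* A defaulted state is a consistent state whose single action is a
   reduction that is encoded as a default reduction. */
static bool is_defaulted(const struct toy_state *s) {
  return is_consistent(s) && s->sole_action_is_reduce
         && s->default_reductions_on;
}

/* In a defaulted state the parser reduces without first invoking yylex;
   in every other state it needs the next token to pick an action. */
static bool fetches_token_before_acting(const struct toy_state *s) {
  return !is_defaulted(s);
}
```

With this model, disabling default reductions in consistent states flips fetches_token_before_acting to true for those states, which corresponds to the "reaches" behavior described in the text, while leaving them enabled gives the "needs" behavior.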


>From d815ec4a6290e18fac9220438622f6dd27b3227f Mon Sep 17 00:00:00 2001
From: Joel E. Denny <address@hidden>
Date: Sun, 6 Mar 2011 12:46:27 -0500
Subject: [PATCH 2/4] lr.default-reductions: rename "all" value to "full".

States that shift the error token do not have default reductions,
and GLR disables some default reductions, so "all" was a misnomer.
* doc/bison.texinfo (%define Summary): Update.
(Default Reductions): Update.
* src/print.c (print_reductions): Update.
* src/reader.c (prepare_percent_define_front_end_variables):
Update.
* src/tables.c (action_row): Update.
* tests/input.at (%define enum variables): Update.
* tests/reduce.at (%define lr.default-reductions): Update.
---
 ChangeLog         |   14 ++++++++++++++
 doc/bison.texinfo |    9 +++------
 src/print.c       |    2 +-
 src/reader.c      |    4 ++--
 src/tables.c      |    2 +-
 tests/input.at    |    2 +-
 tests/reduce.at   |   13 +++++++------
 7 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index dacfddc..670853a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,19 @@
 2011-03-06  Joel E. Denny  <address@hidden>
 
+       lr.default-reductions: rename "all" value to "full".
+       States that shift the error token do not have default reductions,
+       and GLR disables some default reductions, so "all" was a misnomer.
+       * doc/bison.texinfo (%define Summary): Update.
+       (Default Reductions): Update.
+       * src/print.c (print_reductions): Update.
+       * src/reader.c (prepare_percent_define_front_end_variables):
+       Update.
+       * src/tables.c (action_row): Update.
+       * tests/input.at (%define enum variables): Update.
+       * tests/reduce.at (%define lr.default-reductions): Update.
+
+2011-03-06  Joel E. Denny  <address@hidden>
+
        doc: create a new Tuning LR section in the manual.
        And clean up all other documentation of the features described
        there.
diff --git a/doc/bison.texinfo b/doc/bison.texinfo
index b226726..8d1ba68 100644
--- a/doc/bison.texinfo
+++ b/doc/bison.texinfo
@@ -5166,11 +5166,11 @@ contain default reductions.  @xref{Default Reductions}.  (The ability to
 specify where default reductions should be used is experimental.  More user
 feedback will help to stabilize it.)
 
-@item Accepted Values: @code{all}, @code{consistent}, @code{accepting}
+@item Accepted Values: @code{full}, @code{consistent}, @code{accepting}
 @item Default Value:
 @itemize
 @item @code{accepting} if @code{lr.type} is @code{canonical-lr}.
-@item @code{all} otherwise.
+@item @code{full} otherwise.
 @end itemize
 @end itemize
 
@@ -7143,7 +7143,7 @@ To adjust which states have default reductions enabled, use the
 Specify the kind of states that are permitted to contain default reductions.
 The accepted values of @var{WHERE} are:
 @itemize
-@item @code{all} (default for LALR and IELR)
+@item @code{full} (default for LALR and IELR)
 @item @code{consistent}
 @item @code{accepting} (default for canonical LR)
 @end itemize
@@ -7152,9 +7152,6 @@ The accepted values of @var{WHERE} are:
 experimental.  More user feedback will help to stabilize it.)
 @end deffn
 
-FIXME: Because of the exceptions described above, @code{all} is a misnomer.
-Rename to @code{full}.
-
 @node LAC
 @subsection LAC
 @findex %define parse.lac
diff --git a/src/print.c b/src/print.c
index 0012a4f..b117e75 100644
--- a/src/print.c
+++ b/src/print.c
@@ -337,7 +337,7 @@ print_reductions (FILE *out, state *s)
       char *default_reductions =
         muscle_percent_define_get ("lr.default-reductions");
       print_reduction (out, width, _("$default"), default_reduction, true);
-      aver (0 == strcmp (default_reductions, "all")
+      aver (0 == strcmp (default_reductions, "full")
             || (0 == strcmp (default_reductions, "consistent")
                 && default_reduction_only)
             || (reds->num == 1 && reds->rules[0]->number == 0));
diff --git a/src/reader.c b/src/reader.c
index 852d3e1..9153f21 100644
--- a/src/reader.c
+++ b/src/reader.c
@@ -630,7 +630,7 @@ prepare_percent_define_front_end_variables (void)
     muscle_percent_define_default ("lr.type", "lalr");
     lr_type = muscle_percent_define_get ("lr.type");
     if (0 != strcmp (lr_type, "canonical-lr"))
-      muscle_percent_define_default ("lr.default-reductions", "all");
+      muscle_percent_define_default ("lr.default-reductions", "full");
     else
       muscle_percent_define_default ("lr.default-reductions", "accepting");
     free (lr_type);
@@ -640,7 +640,7 @@ prepare_percent_define_front_end_variables (void)
   {
     static char const * const values[] = {
       "lr.type", "lalr", "ielr", "canonical-lr", NULL,
-      "lr.default-reductions", "all", "consistent", "accepting", NULL,
+      "lr.default-reductions", "full", "consistent", "accepting", NULL,
       NULL
     };
     muscle_percent_define_check_values (values);
diff --git a/src/tables.c b/src/tables.c
index ef37fba..930a6a5 100644
--- a/src/tables.c
+++ b/src/tables.c
@@ -310,7 +310,7 @@ action_row (state *s)
   {
     char *default_reductions =
       muscle_percent_define_get ("lr.default-reductions");
-    if (0 != strcmp (default_reductions, "all") && !s->consistent)
+    if (0 != strcmp (default_reductions, "full") && !s->consistent)
       nodefault = true;
     free (default_reductions);
   }
diff --git a/tests/input.at b/tests/input.at
index 90b6b0b..9c5db8d 100644
--- a/tests/input.at
+++ b/tests/input.at
@@ -1034,7 +1034,7 @@ start: ;
 ]])
 AT_BISON_CHECK([[input.y]], [[1]], [[]],
 [[input.y:1.9-29: invalid value for %define variable `lr.default-reductions': `bogus'
-input.y:1.9-29: accepted value: `all'
+input.y:1.9-29: accepted value: `full'
 input.y:1.9-29: accepted value: `consistent'
 input.y:1.9-29: accepted value: `accepting'
 ]])
diff --git a/tests/reduce.at b/tests/reduce.at
index 65ccf16..ad4d329 100644
--- a/tests/reduce.at
+++ b/tests/reduce.at
@@ -1451,12 +1451,12 @@ dnl PARSER-EXIT-VALUE, PARSER-STDOUT, PARSER-STDERR
 m4_define([AT_TEST_LR_DEFAULT_REDUCTIONS],
 [
 AT_TEST_TABLES_AND_PARSE([[no %define lr.default-reductions]],
-                         [[all]], [[]],
+                         [[full]], [[]],
                          [[]],
                          [$1], [$2], [[]], [$3])
-AT_TEST_TABLES_AND_PARSE([[%define lr.default-reductions all]],
-                         [[all]], [[]],
-                         [[%define lr.default-reductions all]],
+AT_TEST_TABLES_AND_PARSE([[%define lr.default-reductions full]],
+                         [[full]], [[]],
+                         [[%define lr.default-reductions full]],
                          [$1], [$2], [[]], [$3])
 AT_TEST_TABLES_AND_PARSE([[%define lr.default-reductions consistent]],
                          [[consistent]], [[]],
@@ -1529,7 +1529,7 @@ state 3
     2      | a . b 'a'
     3      | a . c 'b'
     5 b: .  [$end, 'a']
-    6 c: .  ['b']]AT_COND_CASE([[all]], [[
+    6 c: .  ['b']]AT_COND_CASE([[full]], [[
 
     'b'       reduce using rule 6 (c)
     $default  reduce using rule 5 (b)]], [[
@@ -1556,7 +1556,8 @@ state 5
 
     'a'  shift, and go to state 7
 
-    ]AT_COND_CASE([[all]], [[$default]], [[$end]])[  reduce using rule 1 (start)
+    ]AT_COND_CASE([[full]], [[$default]],
+                  [[$end]])[  reduce using rule 1 (start)
 
 
 state 6
-- 
1.7.0.4
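
To summarize what the rename in the patch above means for the two code paths it touches, here is a small standalone C sketch. It only mirrors the two strcmp branches visible in the src/reader.c and src/tables.c hunks; the function names default_reductions_default and suppress_default_reduction are hypothetical, and the real functions do more than this (for example, they go through the muscle_percent_define_* helpers and handle other conditions not shown in the hunks).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Default value of lr.default-reductions for a given lr.type, following the
   branch shown in the src/reader.c hunk after the rename to "full". */
static const char *default_reductions_default(const char *lr_type) {
  assert(lr_type != NULL);
  return strcmp(lr_type, "canonical-lr") != 0 ? "full" : "accepting";
}

/* Follows the condition shown in the src/tables.c hunk: a state gets no
   default reduction when the setting is not "full" and the state is not
   consistent.  Other conditions in action_row are not modeled here. */
static bool suppress_default_reduction(const char *setting,
                                       bool state_is_consistent) {
  assert(setting != NULL);
  return strcmp(setting, "full") != 0 && !state_is_consistent;
}
```

Under these assumptions, a grammar that sets no lr.type (so "lalr") ends up with "full", and only a setting of "consistent" or "accepting" suppresses default reductions in inconsistent states.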


>From 5da0355aff4de57e96aba7b788c376fc779d83b1 Mon Sep 17 00:00:00 2001
From: Joel E. Denny <address@hidden>
Date: Sun, 6 Mar 2011 12:54:35 -0500
Subject: [PATCH 3/4] doc: clean up terminology for mysterious conflicts.

* doc/bison.texinfo (Mystery Conflicts): Rename node to...
(Mysterious Conflicts): ... this, which is already the section
title and the name used in the index.  Update all cross-references
to this node.  Also, don't imply that R/R conflicts are the only
kind of mysterious conflict.
---
 ChangeLog         |    9 +++++++++
 doc/bison.texinfo |   24 ++++++++++++------------
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 670853a..361d225 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,14 @@
 2011-03-06  Joel E. Denny  <address@hidden>
 
+       doc: clean up terminology for mysterious conflicts.
+       * doc/bison.texinfo (Mystery Conflicts): Rename node to...
+       (Mysterious Conflicts): ... this, which is already the section
+       title and the name used in the index.  Update all cross-references
+       to this node.  Also, don't imply that R/R conflicts are the only
+       kind of mysterious conflict.
+
+2011-03-06  Joel E. Denny  <address@hidden>
+
        lr.default-reductions: rename "all" value to "full".
        States that shift the error token do not have default reductions,
        and GLR disables some default reductions, so "all" was a misnomer.
diff --git a/doc/bison.texinfo b/doc/bison.texinfo
index 8d1ba68..a1889ec 100644
--- a/doc/bison.texinfo
+++ b/doc/bison.texinfo
@@ -264,7 +264,7 @@ The Bison Parser Algorithm
 * Contextual Precedence::  When an operator's precedence depends on context.
 * Parser States::     The parser is a finite-state-machine with stack.
 * Reduce/Reduce::     When two rules are applicable in the same situation.
-* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.
+* Mysterious Conflicts:: Conflicts that look unjustified.
 * Tuning LR::         How to tune fundamental aspects of LR-based parsing.
 * Generalized LR Parsing::  Parsing arbitrary context-free grammars.
 * Memory Management:: What happens when memory is exhausted.  How to avoid it.
@@ -488,10 +488,10 @@ are called LR(1) grammars.  In brief, in these grammars, it must be possible
 to tell how to parse any portion of an input string with just a single token
 of lookahead.  For historical reasons, Bison by default is limited by the
 additional restrictions of LALR(1), which is hard to explain simply.
-@xref{Mystery Conflicts, ,Mysterious Reduce/Reduce Conflicts}, for more
-information on this.  As an experimental feature, you can escape these
-additional restrictions by requesting IELR(1) or canonical LR(1) parser
-tables.  @xref{LR Table Construction}, to learn how.
+@xref{Mysterious Conflicts}, for more information on this.  As an
+experimental feature, you can escape these additional restrictions by
+requesting IELR(1) or canonical LR(1) parser tables.  @xref{LR Table
+Construction}, to learn how.
 
 @cindex GLR parsing
 @cindex generalized LR (GLR) parsing
@@ -6262,7 +6262,7 @@ This kind of parser is known in the literature as a bottom-up parser.
 * Contextual Precedence::  When an operator's precedence depends on context.
 * Parser States::     The parser is a finite-state-machine with stack.
 * Reduce/Reduce::     When two rules are applicable in the same situation.
-* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.
+* Mysterious Conflicts:: Conflicts that look unjustified.
 * Tuning LR::         How to tune fundamental aspects of LR-based parsing.
 * Generalized LR Parsing::  Parsing arbitrary context-free grammars.
 * Memory Management:: What happens when memory is exhausted.  How to avoid it.
@@ -6779,8 +6779,8 @@ redirects:redirect
         ;
 @end example
 
-@node Mystery Conflicts
-@section Mysterious Reduce/Reduce Conflicts
+@node Mysterious Conflicts
+@section Mysterious Conflicts
 @cindex Mysterious Conflicts
 
 Sometimes reduce/reduce conflicts can occur that don't look warranted.
@@ -6936,7 +6936,7 @@ user feedback will help to stabilize them.
 For historical reasons, Bison constructs LALR(1) parser tables by default.
 However, LALR does not possess the full language-recognition power of LR.
 As a result, the behavior of parsers employing LALR parser tables is often
-mysterious.  We presented a simple example of this effect in @ref{Mystery
+mysterious.  We presented a simple example of this effect in @ref{Mysterious
 Conflicts}.
 
 As we also demonstrated in that example, the traditional approach to
@@ -6971,7 +6971,7 @@ grammar file:
 %define lr.type ielr
 @end example
 
-@noindent For the example in @ref{Mystery Conflicts}, the mysterious
+@noindent For the example in @ref{Mysterious Conflicts}, the mysterious
 conflict is then eliminated, so there is no need to invest time in
 comprehending the conflict or restructuring the grammar to fix it.  If,
 during future development, the grammar evolves such that all mysterious
@@ -7316,7 +7316,7 @@ sequence of reductions cannot have deterministic parsers in this sense.
 The same is true of languages that require more than one symbol of
 lookahead, since the parser lacks the information necessary to make a
 decision at the point it must be made in a shift-reduce parser.
-Finally, as previously mentioned (@pxref{Mystery Conflicts}),
+Finally, as previously mentioned (@pxref{Mysterious Conflicts}),
 there are languages where Bison's default choice of how to
 summarize the input seen so far loses necessary information.
 
@@ -10967,7 +10967,7 @@ Tokens}.
 @item LALR(1)
 The class of context-free grammars that Bison (like most other parser
 generators) can handle by default; a subset of LR(1).
-@xref{Mystery Conflicts, ,Mysterious Reduce/Reduce Conflicts}.
+@xref{Mysterious Conflicts}.
 
 @item LR(1)
 The class of context-free grammars in which at most one token of
-- 
1.7.0.4


>From 121c498280f96b31a1f90e2012751509e6358a64 Mon Sep 17 00:00:00 2001
From: Joel E. Denny <address@hidden>
Date: Sun, 6 Mar 2011 17:12:16 -0500
Subject: [PATCH 4/4] doc: cite publication for LAC.

* doc/bison.texinfo (LAC): Here.
---
 ChangeLog         |    5 +++++
 doc/bison.texinfo |    5 +++++
 2 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 361d225..69efad5 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2011-03-06  Joel E. Denny  <address@hidden>
 
+       doc: cite publication for LAC.
+       * doc/bison.texinfo (LAC): Here.
+
+2011-03-06  Joel E. Denny  <address@hidden>
+
        doc: clean up terminology for mysterious conflicts.
        * doc/bison.texinfo (Mystery Conflicts): Rename node to...
        (Mysterious Conflicts): ... this, which is already the section
diff --git a/doc/bison.texinfo b/doc/bison.texinfo
index a1889ec..ab6486a 100644
--- a/doc/bison.texinfo
+++ b/doc/bison.texinfo
@@ -7256,6 +7256,11 @@ never physically copied.  In our experience, the performance penalty of LAC
 has proven insignificant for practical grammars.
 @end itemize
 
+While the basic premise behind LAC has been recognized in the parser
+community for years, for the first publication that uses the term LAC and
+that discusses Bison's LAC implementation, @pxref{Bibliography,,Denny 2010
+May}.
+
 @node Unreachable States
 @subsection Unreachable States
 @findex %define lr.keep-unreachable-states
-- 
1.7.0.4
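
For anyone who wants to see the effect that the Tuning LR text in patch 3/4 describes, here is a minimal sketch; it is not part of the patch series.  The grammar is the example from the manual's Mysterious Conflicts section, with "%define lr.type ielr" added.  The file name mystery.y is hypothetical, and no scanner, yyerror, or semantic actions are included, so the file is only meant to be run through bison to inspect its conflict report, not compiled into a working parser.  It assumes Bison 2.5 or later.

```
/* mystery.y (hypothetical file name).
   Grammar taken from the manual's Mysterious Conflicts section.
   It is LR(1) but not LALR(1).  */

/* Request IELR(1) tables instead of the default LALR(1).  */
%define lr.type ielr

%token ID

%%

def:    param_spec return_spec ','
        ;
param_spec:
             type
        |    name_list ':' type
        ;
return_spec:
             type
        |    name ':' type
        ;
type:        ID
        ;
name:        ID
        ;
name_list:
             name
        |    name ',' name_list
        ;
```

With the %define line removed, "bison mystery.y" should report a reduce/reduce conflict under the default LALR(1) tables.  With the line in place, the conflict should no longer be reported.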



