bison-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] doc: cex documentation


From: Akim Demaille
Subject: Re: [PATCH] doc: cex documentation
Date: Tue, 30 Jun 2020 06:47:52 +0200

Hi Vincent,

> Le 28 juin 2020 à 21:30, Vincent Imbimbo <vmi6@cornell.edu> a écrit :
> 
> Hey Akim,
> I think I found a good place for the cex documentation in bison.texi, but it 
> might still be a bit awkward. Tell me if you have any ideas for changes.

Thanks a lot for this!  I made a few changes (e.g., the name is 
-Wcounterexamples, plural, moving -Wcex as foremost feature of 3.7, wrap at 76 
columns, remove the duplicate example in NEWS which is B&W, sort references, 
etc.)

Here's how I installed your commit.  I'll proceed from there.  Thanks!

commit ecd731a84e8b75937bb99d5a6d4f20cb9fecf4b3
Author: Vincent Imbimbo <vmi6@cornell.edu>
Date:   Sun Jun 28 15:30:17 2020 -0400

    doc: cex documentation
    
    * NEWS, doc/bison.texi: Add documentation for conflict counterexample
    generation.

diff --git a/NEWS b/NEWS
index 9e1c5076..b3a5d1ea 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,30 @@ GNU Bison NEWS
 
 ** New features
 
+*** Counterexample Generation
+
+  When given `--report=counterexamples` or `-Wcounterexamples`, bison will
+  now output counterexamples for conflicts in the grammar.  These are
+  strings in the grammar which can be parsed in two ways due to the
+  conflict.  For example:
+
+    Example              exp '+' exp • '/' exp
+    First derivation     exp ::=[ exp ::=[ exp '+' exp • ] '/' exp ]
+    Second derivation    exp ::=[ exp '+' exp ::=[ exp • '/' exp ] ]
+
+  This is a shift/reduce conflict caused by none of the operators having
+  precedence, so the example can be parsed in the two ways shown.   When
+  bison cannot find an example that can be derived in two ways, it instead
+  generates two examples that are the same up until the dot:
+
+    First example        expr • ID $end
+    First derivation     $accept ::=[ s ::=[ a ::=[ expr • ] ID ] $end ]
+    Second example       expr • ID ',' ID $end
+    Second derivation    $accept ::=[ s ::=[ a ::=[ expr ::=[ expr • ID ',' ] 
] ID ] $end ]
+
+  In these cases, the parser usually doesn't have enough lookahead to
+  differentiate the two given examples.
+
 *** File prefix mapping
 
   Bison learned a new argument, `--file-prefix-map OLD=NEW`.  Any file path
diff --git a/doc/bison.texi b/doc/bison.texi
index 320a645b..e7225306 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -9797,12 +9797,69 @@ calc.y:19.1-7: @dwarning{warning}: nonterminal useless 
in grammar: useless [@dwa
    19 | @dwarning{useless: STR;}
       | @dwarning{^~~~~~~}
 calc.y: @dwarning{warning}: 7 shift/reduce conflicts 
[@dwarning{-Wconflicts-sr}]
+calc.y: @dwarning{warning}: rerun with option '-Wcounterexamples' to generate 
conflict counterexamples [@dwarning{-Wother}]
 @end example
 
-When given @option{--report=state}, in addition to @file{calc.tab.c}, it
-creates a file @file{calc.output} with contents detailed below.  The
-order of the output and the exact presentation might vary, but the
-interpretation is the same.
+When given @option{-Wcounterexamples}, @command{bison} will run a search for
+strings in your grammar that better demonstrate you
+conflicts. Counterexample generation was initially developed by Chinawat
+Isradisaikul and Andrew Myers (@pxref{Bibliography}).  For @file{calc.y},
+the first printed example is:
+
+@example
+Shift/reduce conflict on token '/':
+  Example              exp '+' exp • '/' exp
+  First derivation     exp ::=[ exp ::=[ exp '+' exp • ] '/' exp ]
+  Example              exp '+' exp • '/' exp
+  Second derivation    exp ::=[ exp '+' exp ::=[ exp • '/' exp ] ]
+@end example
+
+This shows two separate derivations in the grammar for the same @code{exp}:
+@samp{e1 + e2 / e3}.  The derivations show how your rules would parse the
+given example. Here, the first derivation completes a reduction when seeing
+@samp{/}, causing @samp{e1 + e2} to be grouped as an @code{exp}. The second
+derivation shifts on @samp{/}, resulting in @samp{e2 / e3} being grouped as
+an @code{exp}. Therefore, it is easy to see that adding @code{%precedence}
+directives would fix this conflict.
+
+Sometimes, the search will not find an example that can be derived in two
+ways.  In these cases, counterexample generation will provide two examples
+that are the same up until the dot.  Most notably, this will happen when
+your grammar requires a stronger parser (more lookahead, LR instead of
+LALR).  The following example isn't LR(1):
+
+@example
+%token ID
+%%
+s: a ID
+a: expr
+expr: %empty | expr ID ','
+@end example
+
+@command{bison} reports:
+
+@example
+Shift/reduce conflict on token ID:
+  First example        expr • ID $end
+  First derivation     $accept ::=[ s ::=[ a ::=[ expr • ] ID ] $end ]
+  Second example       expr • ID ',' ID $end
+  Second derivation    $accept ::=[ s ::=[ a ::=[ expr ::=[ expr • ID ',' ] ] 
ID ] $end ]
+@end example
+
+This conflict is caused by the parser not having enough information to know
+the difference between these two examples.  The parser would need an
+additional lookahead token to know whether or not a comma follows the
+@code{ID} after @code{expr}.  These types of conflicts tend to be more
+difficult to fix, and usually need a rework of the grammar.  In this case,
+it can be fixed by changing around the recursion: @code{expr: ID | ',' expr
+ID}.
+
+Counterexamples can also be written to a file with @option{--report=cex}.
+
+Going back to the calc example, when given @option{--report=state},
+in addition to @file{calc.tab.c}, it creates a file @file{calc.output}
+with contents detailed below.  The order of the output and the exact
+presentation might vary, but the interpretation is the same.
 
 @noindent
 @cindex token, useless
@@ -15211,6 +15268,7 @@ resolution.  @xref{Unreachable States}.
 @unnumbered Bibliography
 
 @c Please follow the following canvas to add more references.
+@c And keep sorted alphabetically.
 
 @table @asis
 @item [Corbett 1984]
@@ -15252,6 +15310,14 @@ Look-Ahead Sets, in @cite{ACM Transactions on 
Programming Languages and
 Systems}, Vol.@: 4, No.@: 4 (October 1982), pp.@:
 615--649. @uref{http://dx.doi.org/10.1145/69622.357187}
 
+@item [Isradisaikul, Myers 2015]
+Chinawat Isradisaikul, Andrew Myers,
+Finding Counterexamples from Parsing Conflicts,
+in @cite{Proceedings of the 36th ACM SIGPLAN Conference on
+Programming Language Design and Implementation} (PLDI '15),
+ACM, pp.@: 555--564.
+@uref{https://www.cs.cornell.edu/andru/papers/cupex/cupex.pdf}
+
 @item [Johnson 1978]
 Steven C. Johnson,
 A portable compiler: theory and practice,
@@ -15345,17 +15411,18 @@ London, Department of Computer Science, TR-00-12 
(December 2000).
 @c LocalWords: typename emplace Wconversion Wshorten yacchack reentrancy ou
 @c LocalWords: Relocatability exprs fixit Wyacc parseable fixits ffixit svg
 @c LocalWords: DNDEBUG cstring Wzero workalike POPL workalikes byacc UCB
-@c LocalWords: Penello's Penello Byson Byson's Corbett's CSD TOPLAS PDP
+@c LocalWords: Penello's Penello Byson Byson's Corbett's CSD TOPLAS PDP cex
 @c LocalWords: Beazley's goyacc ocamlyacc SIGACT SIGPLAN colorWarning exVal
 @c LocalWords: setcolor rgbError colorError rgbNotice colorNotice derror
 @c LocalWords: colorOff maincolor inlineraw darkviolet darkcyan dwarning
 @c LocalWords: dnotice copyable stdint ptrdiff bufsize yyreport invariants
-@c LocalWords: xrefautomaticsectiontitle yysyntax yysymbol ARGMAX cond
+@c LocalWords: xrefautomaticsectiontitle yysyntax yysymbol ARGMAX cond RTTI
 @c LocalWords: Wdangling yytoken erreur syntaxe inattendu attendait nombre
 @c LocalWords: YYUNDEF SymbolKind yypcontext YYENOMEM TOKENMAX getBundle
-@c LocalWords: ResourceBundle myResources getString getName getToken
+@c LocalWords: ResourceBundle myResources getString getName getToken ylwrap
 @c LocalWords: getLocation getExpectedTokens reportSyntaxError bistromathic
-@c LocalWords: TokenKind
+@c LocalWords: TokenKind Automake's rtti Wcounterexamples Chinawat PLDI
+@c LocalWords: Isradisaikul
 
 @c Local Variables:
 @c ispell-dictionary: "american"




reply via email to

[Prev in Thread] Current Thread [Next in Thread]