Re: Bison 3.5.4 released
From: Partha Acharya
Subject: Re: Bison 3.5.4 released
Date: Sun, 5 Apr 2020 13:09:30 +0530
Well done Akim, you made it possible even during this tough time.
Regards,
Partha.
On Sun, 5 Apr 2020, 13:06 Akim Demaille, <address@hidden> wrote:
> ** WARNING: Future backward-incompatibilities! **
>
> TL;DR: replace "#define YYERROR_VERBOSE 1" by "%define parse.error
> verbose".
>
> Bison 3.6 will no longer support the YYERROR_VERBOSE macro; the parsers
> that still depend on it will produce Yacc-like error messages (just
> "syntax error"). It was superseded by the "%error-verbose" directive in
> Bison 1.875 (2003-01-01). Bison 2.6 (2012-07-19) clearly announced that
> support for YYERROR_VERBOSE would be removed. Note that since Bison 3.0
> (2013-07-25), "%error-verbose" is deprecated in favor of "%define
> parse.error verbose".
>
>
> Bison 3.5.4 fixes a few minor issues from Bison 3.5.
>
> In Bison 3.5 Paul Eggert revised the use of integral types in both the
> generator and the generated parsers. As a consequence small parsers
> have a smaller footprint, and very large automata are now possible
> with the default back-end (yacc.c). If you are interested in making
> your parser smaller, have a look at api.token.raw.
>
> Adrian Vogelsgesang contributed lookahead correction for C++.
>
> The purpose of string literals has been clarified. Indeed, they are used
> for two different purposes: freeing the user from having to implement
> keyword matching in the scanner, and improving error messages. Most of the
> time both can be achieved at once, but on occasion it does not work so
> well. We promote their use for error messages. We still support the former
> case (at least for historical skeletons), but it is _not_ a recommended
> practice. The documentation now warns against this use. A new warning,
> -Wdangling-alias, should help users who want to enforce the use of aliases
> only for error messages.
>
> An experimental back-end for the D programming language was added thanks to
> Oliver Mangold and H. S. Teoh. It is looking for active support from the D
> community.
>
> Happy parsing!
>
> ==================================================================
>
> Bison is a general-purpose parser generator that converts an annotated
> context-free grammar into a deterministic LR or generalized LR (GLR) parser
> employing LALR(1) parser tables. Bison can also generate IELR(1) or
> canonical LR(1) parser tables. Once you are proficient with Bison, you can
> use it to develop a wide range of language parsers, from those used in
> simple desk calculators to complex programming languages.
>
> Bison is upward compatible with Yacc: all properly-written Yacc grammars
> work with Bison with no change. Anyone familiar with Yacc should be able to
> use Bison with little trouble. You need to be fluent in C, C++ or Java
> programming in order to use Bison.
>
> Here is the GNU Bison home page:
> https://gnu.org/software/bison/
>
> ==================================================================
>
> Here are the compressed sources:
> https://ftp.gnu.org/gnu/bison/bison-3.5.4.tar.gz (5.1MB)
> https://ftp.gnu.org/gnu/bison/bison-3.5.4.tar.xz (3.1MB)
>
> Here are the GPG detached signatures[*]:
> https://ftp.gnu.org/gnu/bison/bison-3.5.4.tar.gz.sig
> https://ftp.gnu.org/gnu/bison/bison-3.5.4.tar.xz.sig
>
> Use a mirror for higher download bandwidth:
> https://www.gnu.org/order/ftp.html
>
> [*] Use a .sig file to verify that the corresponding file (without the
> .sig suffix) is intact. First, be sure to download both the .sig file
> and the corresponding tarball. Then, run a command like this:
>
> gpg --verify bison-3.5.4.tar.gz.sig
>
> If that command fails because you don't have the required public key,
> then run this command to import it:
>
> gpg --keyserver keys.gnupg.net --recv-keys 0DDCAA3278D5264E
>
> and rerun the 'gpg --verify' command.
>
> This release was bootstrapped with the following tools:
> Autoconf 2.69
> Automake 1.16.2
> Flex 2.6.4
> Gettext 0.19.8.1
> Gnulib v0.1-3322-gd279bc6d9
>
> ==================================================================
>
> NEWS
>
> * Noteworthy changes in release 3.5.4 (2020-04-05) [stable]
>
> ** WARNING: Future backward-incompatibilities!
>
> TL;DR: replace "#define YYERROR_VERBOSE 1" by "%define parse.error
> verbose".
>
> Bison 3.6 will no longer support the YYERROR_VERBOSE macro; the parsers
> that still depend on it will produce Yacc-like error messages (just
> "syntax error"). It was superseded by the "%error-verbose" directive in
> Bison 1.875 (2003-01-01). Bison 2.6 (2012-07-19) clearly announced that
> support for YYERROR_VERBOSE would be removed. Note that since Bison 3.0
> (2013-07-25), "%error-verbose" is deprecated in favor of "%define
> parse.error verbose".
>
> ** Bug fixes
>
> Fix portability issues of the package itself on old compilers.
>
> Fix api.token.raw support in Java.
>
> * Noteworthy changes in release 3.5.3 (2020-03-08) [stable]
>
> ** Bug fixes
>
> Error messages could quote lines containing zero-width characters (such as
> \005) with incorrect styling. Similar issues with unexpectedly short lines
> (e.g., the file was changed between parsing and diagnosing) are also fixed.
>
> Several unlikely crashes found by fuzzing have been fixed.
>
> * Noteworthy changes in release 3.5.2 (2020-02-13) [stable]
>
> ** Bug fixes
>
> Portability issues and minor cosmetic issues.
>
> The lalr1.cc skeleton properly rejects unsupported values for parse.lac
> (as yacc.c does).
>
> * Noteworthy changes in release 3.5.1 (2020-01-19) [stable]
>
> ** Bug fixes
>
> Portability fixes.
>
> Fix compiler warnings.
>
> * Noteworthy changes in release 3.5 (2019-12-11) [stable]
>
> ** Backward incompatible changes
>
> Lone carriage-return characters (aka \r or ^M) in the grammar files are no
> longer treated as end-of-lines. This changes the diagnostics, and in
> particular their locations.
>
> In C++, line numbers and columns are now represented as 'int' not
> 'unsigned', so that integer overflow on positions is easily checkable via
> 'gcc -fsanitize=undefined' and the like. This affects the API for
> positions. The default position and location classes now expose
> 'counter_type' (int), used to define line and column numbers.
>
> ** Deprecated features
>
> The YYPRINT macro, which works only with yacc.c and only for tokens, was
> obsoleted long ago by %printer, introduced in Bison 1.50 (November 2002).
> It is deprecated and its support will be removed eventually.
>
> ** New features
>
> *** Lookahead correction in C++
>
> Contributed by Adrian Vogelsgesang.
>
> The C++ deterministic skeleton (lalr1.cc) now supports LAC, via the
> %define variable parse.lac.
>
> *** Variable api.token.raw: Optimized token numbers (all skeletons)
>
> In the generated parsers, tokens have two numbers: the "external" token
> number as returned by yylex (which starts at 257), and the "internal"
> symbol number (which starts at 3). Each time yylex is called, a table
> lookup maps the external token number to the internal symbol number.
>
> When the %define variable api.token.raw is set, tokens are assigned their
> internal number, which saves one table lookup per token, and also saves
> the generation of the mapping table.
>
> The gain is typically moderate, but in extreme cases (very simple user
> actions), a 10% improvement can be observed.
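
To make the lookup concrete, here is a minimal C sketch of the two numbering
schemes. It is not Bison's generated code: the three-token table, the value
used for unknown tokens, and the names translate_external, translate_raw and
SYM_UNDEF are assumptions made for illustration (a real table also covers
character tokens).

```c
#include <stddef.h>

/* Numbers taken from the release notes above. */
enum { EXTERNAL_FIRST = 257 };  /* first "external" token number */
enum { INTERNAL_FIRST = 3 };    /* first "internal" symbol number */

/* Assumption: symbol number reported for tokens the table does not know. */
enum { SYM_UNDEF = 2 };

/* Hypothetical mapping table for a grammar with three user tokens:
   external 257, 258, 259 map to internal 3, 4, 5. */
static const signed char translate_table[] = { 3, 4, 5 };

/* Default scheme: one table lookup for every token returned by yylex. */
int translate_external(int token)
{
  int index = token - EXTERNAL_FIRST;
  int size = (int) (sizeof translate_table / sizeof translate_table[0]);
  if (index < 0 || index >= size)
    return SYM_UNDEF;
  return translate_table[index];
}

/* Raw scheme: the scanner already returns the internal number, so the
   translation is the identity and no table has to be generated. */
int translate_raw(int token)
{
  return token;
}
```

With the table, translate_external(258) yields 4; with raw numbering the
scanner would return 4 directly and the table disappears.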
>
> *** Generated parsers use better types for states
>
> Stacks now use the best integral type for state numbers, instead of always
> using 15 bits. As a result "small" parsers now have a smaller memory
> footprint (they use 8 bits), and there is support for large automata (16
> bits), and extra large (using int, i.e., typically 31 bits).
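
As a rough illustration of the idea, the following C sketch picks a storage
width from the number of states. The function name state_bits and the exact
thresholds are assumptions; the notes only say that small, large and extra
large automata get 8 bits, 16 bits and int respectively.

```c
#include <limits.h>

/* Returns the storage width, in bits, that a generator might choose for
   state numbers 0 .. nstates-1.  Hypothetical helper: the thresholds are
   an assumption, not Bison's documented rule.  The width includes the sign
   bit, which is why an int is described above as "typically 31 bits". */
int state_bits(long nstates)
{
  long max_state = nstates - 1;
  if (max_state <= UCHAR_MAX)
    return CHAR_BIT;                        /* "small" parsers */
  if (max_state <= USHRT_MAX)
    return (int) (sizeof(short) * CHAR_BIT); /* large automata */
  return (int) (sizeof(int) * CHAR_BIT);     /* extra large automata */
}
```

A generator would evaluate something like this once, when it emits the
parser, so the generated code pays no runtime cost for the choice.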
>
> *** Generated parsers prefer signed integer types
>
> Bison skeletons now prefer signed to unsigned integer types when either
> will do, as the signed types are less error-prone and allow for better
> checking with 'gcc -fsanitize=undefined'. Also, the types chosen are now
> portable to unusual machines where char, short and int are all the same
> width. On non-GNU platforms this may entail including <limits.h> and (if
> available) <stdint.h> to define integer types and constants.
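
One reason signed types are easier to check: unsigned arithmetic wraps
around silently, while signed overflow is undefined behavior that tools such
as 'gcc -fsanitize=undefined' can report. The C sketch below shows the kind
of explicit guard this makes natural. The helper name advance_column is
hypothetical and does not come from the Bison skeletons.

```c
#include <limits.h>
#include <stdbool.h>

/* Adds `count` to the signed counter `*column`.  Returns false and leaves
   the counter untouched when the addition would overflow, instead of
   relying on wraparound. */
bool advance_column(int *column, int count)
{
  if (count > 0 && *column > INT_MAX - count)
    return false;   /* would exceed INT_MAX */
  if (count < 0 && *column < INT_MIN - count)
    return false;   /* would go below INT_MIN */
  *column += count;
  return true;
}
```

Without the guard, the same addition on an int is something a sanitizer can
trap; on an unsigned counter it would simply wrap and go unnoticed.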
>
> *** A skeleton for the D programming language
>
> For the last few releases, Bison has shipped a stealth experimental
> skeleton: lalr1.d. It was first contributed by Oliver Mangold, based on
> Paolo Bonzini's lalr1.java, and was cleaned and improved thanks to
> H. S. Teoh.
>
> However, because nobody has committed to improving, testing, and
> documenting this skeleton, it is not clear that it will be supported in
> the future.
>
> The lalr1.d skeleton *is functional*, and works well, as demonstrated in
> examples/d/calc.d. Please try it, enjoy it, and... commit to support it.
>
> *** Debug traces in Java
>
> The Java backend no longer emits code and data for parser tracing if the
> %define variable parse.trace is not defined.
>
> ** Diagnostics
>
> *** New diagnostic: -Wdangling-alias
>
> String literals, which allow for better error messages, are (too)
> liberally accepted by Bison, which might result in silent errors. For
> instance
>
>     %type <exVal> cond "condition"
>
> does not define "condition" as a string alias to 'cond' (nonterminal
> symbols do not have string aliases). It is rather equivalent to
>
>     %nterm <exVal> cond
>     %token <exVal> "condition"
>
> i.e., it gives the type 'exVal' to the "condition" token, which was
> clearly not the intention.
>
> Also, because string aliases need not be defined, typos such as "baz"
> instead of "bar" will not be reported.
>
> The option -Wdangling-alias catches these situations. On
>
>     %token BAR "bar"
>     %type <ival> foo "foo"
>     %%
>     foo: "baz" {}
>
> bison -Wdangling-alias reports
>
>     warning: string literal not attached to a symbol
>           | %type <ival> foo "foo"
>           |                  ^~~~~
>     warning: string literal not attached to a symbol
>           | foo: "baz" {}
>           |      ^~~~~
>
> The -Wall option does not (yet?) include -Wdangling-alias.
>
> *** Better POSIX Yacc compatibility diagnostics
>
> POSIX Yacc restricts %type to nonterminals. This is now diagnosed by
> -Wyacc.
>
>     %token TOKEN1
>     %type  <ival> TOKEN1 TOKEN2 't'
>     %token TOKEN2
>     %%
>     expr:
>
> gives with -Wyacc
>
>     input.y:2.15-20: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
>         2 | %type  <ival> TOKEN1 TOKEN2 't'
>           |               ^~~~~~
>     input.y:2.29-31: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
>         2 | %type  <ival> TOKEN1 TOKEN2 't'
>           |                             ^~~
>     input.y:2.22-27: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
>         2 | %type  <ival> TOKEN1 TOKEN2 't'
>           |                      ^~~~~~
>
> *** Diagnostics with insertion
>
> The diagnostics now display the suggestion below the underlined source.
> Replacements for undeclared symbols are now also suggested.
>
>     $ cat /tmp/foo.y
>     %%
>     list: lis '.' |
>
>     $ bison -Wall foo.y
>     foo.y:2.7-9: error: symbol 'lis' is used, but is not defined as a token and has no rules; did you mean 'list'?
>         2 | list: lis '.' |
>           |       ^~~
>           |       list
>     foo.y:2.16: warning: empty rule without %empty [-Wempty-rule]
>         2 | list: lis '.' |
>           |                ^
>           |                %empty
>     foo.y: warning: fix-its can be applied. Rerun with option '--update'. [-Wother]
>
> *** Diagnostics about long lines
>
> Quoted sources may now be truncated to fit the screen. For instance, on a
> 30-column wide terminal:
>
>     $ cat foo.y
>     %token FOO                       FOO                         FOO
>     %%
>     exp: FOO
>     $ bison foo.y
>     foo.y:1.34-36: warning: symbol FOO redeclared [-Wother]
>         1 | … FOO …
>           |   ^~~
>     foo.y:1.8-10: previous declaration
>         1 | %token FOO …
>           |        ^~~
>     foo.y:1.62-64: warning: symbol FOO redeclared [-Wother]
>         1 | … FOO
>           |   ^~~
>     foo.y:1.8-10: previous declaration
>         1 | %token FOO …
>           |        ^~~
>
> ** Changes
>
> *** Debugging glr.c and glr.cc
>
> The glr.c skeleton always had asserts to check its own behavior (not the
> user's). These assertions are now under the control of the parse.assert
> %define variable (disabled by default).
>
> *** Clean up
>
> Several new compiler warnings in the generated output have been avoided.
> Some unused features are no longer emitted. Cleaner generated code in
> general.
>
> ** Bug Fixes
>
> Portability issues in the test suite.
>
> In theory, parsers using %nonassoc could crash when reporting verbose
> error messages. This unlikely bug has been fixed.
>
> In Java, %define api.prefix was ignored. It now behaves as expected.
>
>
>