[PATCH 04/10] doc: promote YYEOF
From: Akim Demaille
Subject: [PATCH 04/10] doc: promote YYEOF
Date: Mon, 13 Apr 2020 17:43:35 +0200

* NEWS (Deep overhaul of the symbol and token kinds): New.
* doc/bison.texi: Promote YYEOF over "0" in scanners.
(Token Decl): No longer show YYEOF here, it now works by default.
(Token I18n): More details about YYEOF here.
(Calc++): Just use YYEOF.
---
NEWS | 35 ++++++++++++++++++++++++++++--
doc/bison.texi | 59 ++++++++++++++++++++------------------------------
2 files changed, 56 insertions(+), 38 deletions(-)
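
For illustration only (this is not part of the patch): a minimal sketch of a hand-written scanner that returns YYEOF rather than a bare 0 at end of input. The enum is a stand-in for the token kinds Bison would generate in the parser header. NUM, its value 258, and the helper scan_from are hypothetical names introduced for this example, and the scanner reads from a string only to keep the sketch self-contained.

```c
#include <stdio.h>

/* Stand-in for the Bison-generated token kinds (assumption: a real
   project gets this enum from the generated parser header).  */
enum yytoken_kind_t
{
  YYEOF = 0,  /* End of input.  */
  NUM = 258   /* Hypothetical user-defined token.  */
};
typedef enum yytoken_kind_t yytoken_kind_t;

/* Hypothetical input source: scan a string instead of stdin.  */
static const char *cursor = "";

static void
scan_from (const char *text)
{
  cursor = text;
}

/* Next character, or EOF once the string is exhausted.  */
static int
next_char (void)
{
  return *cursor ? (unsigned char) *cursor++ : EOF;
}

static int
yylex (void)
{
  int c;

  /* Skip white space.  */
  while ((c = next_char ()) == ' ' || c == '\t')
    continue;

  /* Return end-of-input by name rather than as a bare 0.  */
  if (c == EOF)
    return YYEOF;

  /* Return a single char.  */
  return c;
}
```

With a real grammar, yylex would read the actual input stream and the enum would come from the generated header; the only point here is that the end-of-input branch names YYEOF.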
diff --git a/NEWS b/NEWS
index 3fc4eaae..ceaca65b 100644
--- a/NEWS
+++ b/NEWS
@@ -74,7 +74,6 @@ GNU Bison NEWS
%token
PLUS "+"
MINUS "-"
- EOF 0 _("end of file")
<double>
NUM _("double precision number")
<symrec*>
@@ -83,7 +82,7 @@ GNU Bison NEWS
In that case the user must define _() and N_(), and yysymbol_name returns
the translated symbol (i.e., it returns '_("variable")' rather that
- '"variable"').
+ '"variable"'). In Java, the user must provide an i18n() function.
*** List of expected tokens (yacc.c)
@@ -95,6 +94,38 @@ GNU Bison NEWS
It makes little sense to use this feature without enabling LAC (lookahead
correction).
+*** Deep overhaul of the symbol and token kinds
+
+ To avoid the confusion with typing in programming languages, we now refer
+ to token and symbol "kinds" instead of token and symbol "types".
+
+**** Token kind
+
+ The "token kind" is what is returned by the scanner, e.g., PLUS, NUMBER,
+ LPAREN, etc. Users are invited to replace their uses of "enum
+ yytokentype" by "yytoken_kind_t".
+
+ This type now also includes tokens that were previously hidden: YYEOF (end
+ of input), YYUNDEF (undefined token), and YYERRCODE (error token). They
+ now have string aliases, internationalized if internationalization is
+ enabled. Therefore, by default, error messages now refer to "end of file"
+ (internationalized) rather than the cryptic "$end".
+
+ In most cases, it is now useless to define the end-of-file token as
+ follows:
+
+ %token EOF 0 _("end of file")
+
+ Rather, simply use "YYEOF" in your scanner.
+
+**** Symbol kinds
+
+ The "symbol kind" is what the parser actually uses. (Unless the
+ api.token.raw %define variable was used, the internal symbol kind of a
+ terminal differs from the corresponding token kind.)
+
+ Symbol kinds are now exposed as an enum, "yysymbol_kind_t".
+
*** Modernize display of explanatory statements in diagnostics
Since Bison 2.7, output was indented four spaces for explanatory
diff --git a/doc/bison.texi b/doc/bison.texi
index 8d448e4b..2d6cc327 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -1903,7 +1903,7 @@ yylex (void)
@group
/* Return end-of-input. */
else if (c == EOF)
- return 0;
+ return YYEOF;
/* Return a single char. */
else
return c;
@@ -2352,7 +2352,7 @@ yylex (void)
/* Return end-of-input. */
if (c == EOF)
- return 0;
+ return YYEOF;
@group
/* Return a single char, and update location. */
@@ -2722,7 +2722,7 @@ yylex (void)
c = getchar ();
if (c == EOF)
- return 0;
+ return YYEOF;
@end group
@group
@@ -4926,14 +4926,6 @@ would produce in French @samp{erreur de syntaxe, || inattendu, attendait
nombre ou (} rather than @samp{erreur de syntaxe, || inattendu, attendait
number ou (}.
-The token numbered as 0 corresponds to the end of file; the following line
-allows for nicer error messages referring to ``end of file''
-(internationalized) instead of ``$end'':
-
-@example
-%token END 0 _("end of file")
-@end example
-
@node Precedence Decl
@subsection Operator Precedence
@cindex precedence declarations
@@ -7812,7 +7804,6 @@ or @code{detailed}, token aliases can be internationalized:
@example
%token
'\n' _("end of line")
- EOF 0 _("end of file")
<double>
NUM _("double precision number")
<symrec*>
@@ -7828,17 +7819,26 @@ If at least one token alias is internationalized, then the generated parser
will use both @code{N_} and @code{_}, that must be defined
(@pxref{Programmers, , The Programmer’s View, gettext, GNU @code{gettext}
utilities}). They are used only on string aliases marked for translation.
-In other words, even if your catalog features a translation for ``end of
-line'', then with
+In other words, even if your catalog features a translation for
+``function'', then with
@example
%token
- '\n' "end of line"
- EOF 0 _("end of file")
+ <symrec*>
+ FUN "function"
+ VAR _("variable")
@end example
@noindent
-``end of line'' will appear untranslated in debug traces and error messages.
+``function'' will appear untranslated in debug traces and error messages.
+
+Unless defined by the user, the end-of-file token, @code{YYEOF}, is given the
+alias ``end of file''. This alias is also internationalized if the user
+internationalized tokens. To map it to another string, use:
+
+@example
+%token END 0 _("end of input")
+@end example
@node Algorithm
@@ -11401,17 +11401,7 @@ Symbols}). This directive:
@noindent
requests that Bison generates the functions @code{make_TEXT} and
-@code{make_NUMBER}. As a matter of fact, it is convenient to have also a
-symbol to mark the end of input, say @code{END_OF_FILE}:
-
-@comment file: c++/simple.yy: 1
-@example
-%token END_OF_FILE 0
-@end example
-
-@noindent
-The @code{0} tells Bison this token is special: when it is reached, parsing
-finishes.
+@code{make_NUMBER}, but also @code{make_YYEOF}, for the end of input.
Everything is in place for our scanner:
@@ -11441,7 +11431,7 @@ Everything is in place for our scanner:
@end group
@group
default:
- return parser::make_END_OF_FILE ();
+ return parser::make_YYEOF ();
@end group
@}
@}
@@ -12439,17 +12429,14 @@ file; it needs detailed knowledge about the driver.
@noindent
-The token code 0 corresponds to end of file; the following line
-allows for nicer error messages referring to ``end of file'' instead of
-``$end''. Similarly user friendly names are provided for each symbol. To
-avoid name clashes in the generated files (@pxref{Calc++ Scanner}), prefix
-tokens with @code{TOK_} (@pxref{%define Summary}).
+User-friendly names are provided for each symbol. To avoid name clashes in
+the generated files (@pxref{Calc++ Scanner}), prefix tokens with @code{TOK_}
+(@pxref{%define Summary}).
@comment file: calc++/parser.yy
@example
%define api.token.prefix @{TOK_@}
%token
- END 0 "end of file"
ASSIGN ":="
MINUS "-"
PLUS "+"
@@ -12695,7 +12682,7 @@ The rules are simple. The driver is used to report errors.
(loc, "invalid character: " + std::string(yytext));
@}
@end group
-<<EOF>> return yy::parser::make_END (loc);
+<<EOF>> return yy::parser::make_YYEOF (loc);
%%
@end example
--
2.26.0