[PATCH 02/21] yacc.c: introduce an enum that defines the symbol's number
From: Akim Demaille
Subject: [PATCH 02/21] yacc.c: introduce an enum that defines the symbol's number
Date: Wed, 1 Apr 2020 08:37:28 +0200
There are a number of advantages in exposing the (internal) symbol
numbers:
- custom error messages can use them to decide how to represent a
given symbol, or a set of symbols.
- we need something similar in uses of yyexpected_tokens. For
instance, currently, bistromathic's completion() reads:

    int ntokens = expected_tokens (line, tokens, YYNTOKENS);
    [...]
    for (int i = 0; i < ntokens; ++i)
      if (tokens[i] == YYTRANSLATE (TOK_VAR))
        [...]
      else if (tokens[i] == YYTRANSLATE (TOK_FUN))
        [...]
      else
        [...]

- now that it's a compile-time expression, we can easily build static
tables, switch, etc.
- some users depended on the ability to get the token number from a
symbol to write test cases for their scanners. But Bison 3.5
removed the table this feature depended upon (a reverse
yytranslate). Now they can check against the actual symbol number,
without having to pay (in space and time) for a conversion.
See https://lists.gnu.org/r/bug-bison/2020-01/msg00001.html, and
https://lists.gnu.org/archive/html/bug-bison/2020-03/msg00015.html.
- it helps us clearly separate the internal symbol numbers from the
external token numbers, whose difference is sometimes blurred in the
code when values coincide (e.g. "yychar = yytoken = YYEOF").
- it allows us to get rid of ugly macros with inconsistent names such
as YYUNDEFTOK and YYTERROR, and to group related definitions
together.
- similarly, it provides clean access to the $accept symbol (which
proves convenient in a current experiment of mine with several
%start symbols).
Let's declare this type as a private type (in the *.c file, not
the *.h one), so that it does not need to be influenced by the api prefix.
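
To illustrate the "compile-time expression" point, here is a minimal
sketch of how client code might use such an enum in a switch and in a
static table. Everything in it is an assumption made for illustration:
the enumerator names and values only mimic what the generated enum could
look like for a bistromathic-like grammar, and completion_kind_of,
symbol_display_name and count_completable are hypothetical helpers, not
part of Bison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the generated enum; the names and values
   are assumptions, not actual Bison output.  */
enum yysymbol_type_t
{
  YYSYMBOL_YYEOF = 0,
  YYSYMBOL_error = 1,
  YYSYMBOL_YYUNDEF = 2,
  YYSYMBOL_FUN = 3,
  YYSYMBOL_VAR = 4,
  YYSYMBOL_NUM = 5
};
typedef enum yysymbol_type_t yysymbol_type_t;

/* Hypothetical classification used by a completion routine.  */
enum completion_kind { COMPLETE_NONE, COMPLETE_VARIABLE, COMPLETE_FUNCTION };

/* The enumerators are constant expressions, so they can label cases,
   which YYTRANSLATE (TOK_VAR) cannot do when it reads a table.  */
enum completion_kind
completion_kind_of (int symbol)
{
  switch (symbol)
    {
    case YYSYMBOL_VAR: return COMPLETE_VARIABLE;
    case YYSYMBOL_FUN: return COMPLETE_FUNCTION;
    default:           return COMPLETE_NONE;
    }
}

/* They can also index a static table (C99 designated initializers).
   Entries that are not listed are null pointers.  */
static const char *const display_name[] =
{
  [YYSYMBOL_YYEOF] = "end of input",
  [YYSYMBOL_FUN] = "function",
  [YYSYMBOL_VAR] = "variable",
  [YYSYMBOL_NUM] = "number"
};

/* A human-readable name for SYMBOL, for custom error messages.  */
const char *
symbol_display_name (int symbol)
{
  size_t n = sizeof display_name / sizeof display_name[0];
  if (symbol < 0 || n <= (size_t) symbol || !display_name[symbol])
    return "unknown symbol";
  return display_name[symbol];
}

/* Count the completable names among the NTOKENS symbols in TOKENS.  */
int
count_completable (const int *tokens, int ntokens)
{
  assert (0 <= ntokens);
  int res = 0;
  for (int i = 0; i < ntokens; ++i)
    if (completion_kind_of (tokens[i]) != COMPLETE_NONE)
      ++res;
  return res;
}
```

With such helpers, the if/else chain over YYTRANSLATE in completion()
could become a switch over the symbols returned by the expected-tokens
query.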
* data/skeletons/bison.m4 (b4_symbol_sid): New.
(b4_symbol): Use it.
* data/skeletons/c.m4 (b4_symbol_enum, b4_declare_symbol_enum): New.
* data/skeletons/yacc.c: Use b4_declare_symbol_enum.
(YYUNDEFTOK, YYTERROR): Remove.
Use the corresponding symbol enum instead.
---
TODO | 5 +++++
data/skeletons/bison.m4 | 14 ++++++++++++++
data/skeletons/c.m4 | 30 ++++++++++++++++++++++++++++++
data/skeletons/yacc.c | 21 ++++++++++-----------
4 files changed, 59 insertions(+), 11 deletions(-)
diff --git a/TODO b/TODO
index 196af194..f314682a 100644
--- a/TODO
+++ b/TODO
@@ -18,6 +18,8 @@ There is some confusion over these terms, which is even a problem for
translators. We need something clear, especially if we provide access to
the symbol numbers (which would be useful for custom error messages).
+We could use "number" and "code".
+
*** The documentation
You can explicitly specify the numeric code for a token type...
@@ -42,6 +44,9 @@ uses "user token number" in most places.
complain (&loc, complaint, _("user token number of %s too large"),
sym->tag);
+*** M4
+Make it consistent with the rest (it uses "user_number" and "number").
+
** Symbol numbers
Giving names to symbol numbers would be useful in custom error messages. It
would actually also make the following point gracefully handled (status of
diff --git a/data/skeletons/bison.m4 b/data/skeletons/bison.m4
index ec5cb5c8..ab4fb720 100644
--- a/data/skeletons/bison.m4
+++ b/data/skeletons/bison.m4
@@ -405,6 +405,19 @@ m4_define([_b4_symbol],
[__b4_symbol([$1], [$2])])])
+# b4_symbol_sid(NUM)
+# ------------------
+# Build the symbol ID for this symbol. Return empty
+# if that would produce an invalid symbol.
+m4_define([b4_symbol_sid],
+[m4_case([$1],
+ [0], [[YYSYMBOL_YYEOF]],
+ [m4_bmatch(m4_quote(b4_symbol([$1], [tag])),
+ [^\$accept$], [[YYSYMBOL_YYACCEPT]],
+ [^\$undefined$], [[YYSYMBOL_YYUNDEF]],
+ [m4_quote(b4_symbol_if([$1], [has_id],
+ [[YYSYMBOL_]]m4_quote(_b4_symbol([$1], [id]))))])])])
+
# b4_symbol(NUM, FIELD)
# ---------------------
@@ -415,6 +428,7 @@ m4_define([b4_symbol],
[m4_case([$2],
[id], [m4_do([b4_percent_define_get([api.token.prefix])],
[_b4_symbol([$1], [id])])],
+ [sid], [b4_symbol_sid([$1])],
[_b4_symbol($@)])])
diff --git a/data/skeletons/c.m4 b/data/skeletons/c.m4
index c61080b6..721c66ea 100644
--- a/data/skeletons/c.m4
+++ b/data/skeletons/c.m4
@@ -486,6 +486,36 @@ m4_define([b4_symbol_translate],
+## --------------------------- ##
+## (Internal) symbol numbers. ##
+## --------------------------- ##
+
+
+# b4_symbol_enum(SYMBOL-NUM)
+# --------------------------
+# Output the definition of this symbol as an enum.
+m4_define([b4_symbol_enum],
+[m4_ifval(b4_symbol([$1], [sid]),
+ [m4_format([[%s = %s]],
+ b4_symbol([$1], [sid]),
+ b4_symbol([$1], [number]))])])
+
+
+# b4_declare_symbol_enum
+# ----------------------
+# The definition of the symbol internal numbers as an enum.
+m4_define([b4_declare_symbol_enum],
+[[/* Symbol type. */
+enum yysymbol_type_t
+{
+ ]m4_join([,
+ ],
+ b4_symbol_map([b4_symbol_enum]))[
+};
+typedef enum yysymbol_type_t yysymbol_type_t;
+]])])
+
+
## ----------------- ##
## Semantic Values. ##
## ----------------- ##
diff --git a/data/skeletons/yacc.c b/data/skeletons/yacc.c
index 0bf65601..ab4ea0cc 100644
--- a/data/skeletons/yacc.c
+++ b/data/skeletons/yacc.c
@@ -398,6 +398,7 @@ m4_if(b4_api_prefix, [yy], [],
[/* Use api.header.include to #include this header
instead of duplicating it here. */
])b4_shared_declarations])[
+]b4_declare_symbol_enum[
]b4_user_post_prologue[
]b4_percent_code_get[]dnl
@@ -624,7 +625,6 @@ union yyalloc
/* YYNSTATES -- Number of states. */
#define YYNSTATES ]b4_states_number[
-#define YYUNDEFTOK ]b4_undef_token_number[
#define YYMAXUTOK ]b4_user_token_number_max[
@@ -633,7 +633,7 @@ union yyalloc
]b4_api_token_raw_if(dnl
[[#define YYTRANSLATE(YYX) (YYX)]],
[[#define YYTRANSLATE(YYX) \
- (0 <= (YYX) && (YYX) <= YYMAXUTOK ? yytranslate[YYX] : YYUNDEFTOK)
+ (0 <= (YYX) && (YYX) <= YYMAXUTOK ? yytranslate[YYX] : YYSYMBOL_YYUNDEF)
/* YYTRANSLATE[TOKEN-NUM] -- Symbol number corresponding to TOKEN-NUM
as returned by yylex. */
@@ -738,8 +738,6 @@ enum { YYNOMEM = -2 };
} \
while (0)
-/* Error symbol internal number. */
-#define YYTERROR 1
/* Error token external number. */
#define YYERRCODE ]b4_symbol(1, user_number)[
@@ -1021,7 +1019,7 @@ yy_lac (yy_state_t *yyesa, yy_state_t **yyes,
yy_state_t *yyesp = yyes_prev;
/* Reduce until we encounter a shift and thereby accept the token. */
YYDPRINTF ((stderr, "LAC: checking lookahead %s:", yysymbol_name (yytoken)));
- if (yytoken == YYUNDEFTOK)
+ if (yytoken == YYSYMBOL_YYUNDEF)
{
YYDPRINTF ((stderr, " Always Err\n"));
return 1;
@@ -1149,7 +1147,7 @@ yyexpected_tokens (const yyparse_context_t *yyctx,
]b4_lac_if([[
int yyx;
for (yyx = 0; yyx < YYNTOKENS; ++yyx)
- if (yyx != YYTERROR && yyx != YYUNDEFTOK)
+ if (yyx != YYSYMBOL_error && yyx != YYSYMBOL_YYUNDEF)
switch (yy_lac (]b4_push_if([[yyps->yyesa, &yyps->yyes,
&yyps->yyes_capacity, yyps->yyssp, yyx]],
[[yyctx->yyesa, yyctx->yyes,
yyctx->yyes_capacity, yyctx->yyssp, yyx]])[))
{
@@ -1177,7 +1175,7 @@ yyexpected_tokens (const yyparse_context_t *yyctx,
int yyxend = yychecklim < YYNTOKENS ? yychecklim : YYNTOKENS;
int yyx;
for (yyx = yyxbegin; yyx < yyxend; ++yyx)
- if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR
+ if (yycheck[yyx + yyn] == yyx && yyx != YYSYMBOL_error
&& !yytable_value_is_error (yytable[yyx + yyn]))
{
if (!yyarg)
@@ -1729,7 +1727,7 @@ yybackup:
/* Not known => get a lookahead token if don't already have one. */
- /* YYCHAR is either YYEMPTY or YYEOF or a valid lookahead symbol. */
+ /* YYCHAR is either empty, or end-of-input, or a valid lookahead. */
if (yychar == YYEMPTY)
{]b4_push_if([[
if (!yyps->yynew)
@@ -1757,7 +1755,8 @@ yyread_pushed_token:]])[
if (yychar <= YYEOF)
{
- yychar = yytoken = YYEOF;
+ yychar = YYEOF;
+ yytoken = YYSYMBOL_YYEOF;
YYDPRINTF ((stderr, "Now at end of input.\n"));
}
else
@@ -1999,8 +1998,8 @@ yyerrlab1:
yyn = yypact[yystate];
if (!yypact_value_is_default (yyn))
{
- yyn += YYTERROR;
- if (0 <= yyn && yyn <= YYLAST && yycheck[yyn] == YYTERROR)
+ yyn += YYSYMBOL_error;
+ if (0 <= yyn && yyn <= YYLAST && yycheck[yyn] == YYSYMBOL_error)
{
yyn = yytable[yyn];
if (0 < yyn)
--
2.26.0
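
For readers who want to see the token number / symbol number separation
in isolation, the following is a small standalone sketch modeled on the
"yychar = YYEOF; yytoken = YYSYMBOL_YYEOF;" hunk above. It rests on
several assumptions: the token codes and symbol numbers are invented
values, translate() is a switch standing in for the YYTRANSLATE table
lookup, and struct lookahead, classify() and expect_symbol() are
hypothetical names that do not exist in the skeleton.

```c
#include <assert.h>

/* External token codes, as a scanner returns them.  Hypothetical values.  */
enum { YYEOF = 0, NUM = 258, PLUS = 259 };

/* Internal symbol numbers; a hypothetical stand-in for the generated enum.  */
enum yysymbol_type_t
{
  YYSYMBOL_YYEOF = 0,
  YYSYMBOL_error = 1,
  YYSYMBOL_YYUNDEF = 2,
  YYSYMBOL_NUM = 3,
  YYSYMBOL_PLUS = 4
};
typedef enum yysymbol_type_t yysymbol_type_t;

/* Stand-in for YYTRANSLATE.  The skeleton indexes the yytranslate
   table; a switch keeps this sketch short.  Codes that denote no token
   map to YYSYMBOL_YYUNDEF.  */
yysymbol_type_t
translate (int code)
{
  switch (code)
    {
    case YYEOF: return YYSYMBOL_YYEOF;
    case NUM:   return YYSYMBOL_NUM;
    case PLUS:  return YYSYMBOL_PLUS;
    default:    return YYSYMBOL_YYUNDEF;
    }
}

/* The two views of a lookahead, kept in separate, differently typed
   fields so that they cannot be confused even when values coincide.  */
struct lookahead
{
  int yychar;              /* External token code.  */
  yysymbol_type_t yytoken; /* Internal symbol number.  */
};

/* Mirror the end-of-input handling of yybackup: any code at or below
   YYEOF is normalized to end of input, anything else is translated.  */
struct lookahead
classify (int code)
{
  struct lookahead res;
  if (code <= YYEOF)
    {
      res.yychar = YYEOF;
      res.yytoken = YYSYMBOL_YYEOF;
    }
  else
    {
      res.yychar = code;
      res.yytoken = translate (code);
    }
  return res;
}

/* A scanner test helper: check that CODE, as returned by a scanner,
   denotes the symbol EXPECTED.  */
void
expect_symbol (int code, yysymbol_type_t expected)
{
  assert (classify (code).yytoken == expected);
}
```

A scanner test could then call, say, expect_symbol (PLUS, YYSYMBOL_PLUS)
and compare against the symbol number directly, with no reverse
yytranslate table involved.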