RIP: c++: merge symbol_type and token
From: Akim Demaille
Subject: RIP: c++: merge symbol_type and token
Date: Sun, 23 Dec 2018 11:17:34 +0100
As I mentioned earlier, I would have been very happy to merge
parser::token and parser::symbol_type. The merge works well, and I was
very pleased with it.
But it's not so simple. The problem is that then parser::token looks
like this:
  /// "External" symbols: returned by the scanner.
  struct token : basic_symbol<by_type>
  {
    /// Superclass.
    typedef basic_symbol<by_type> super_type;
    /// Empty symbol.
    token () {};
    /// Constructor for valueless symbols, and symbols from each type.
    token (int tok);
    token (int tok, int v);
    enum token_type
    {
      foo = 258,
      bar = 259
    };
  };
and then... name clashes become possible: not only if the user defines
a token named "token", but also if a token has the same name as any
member inherited from basic_symbol. That's annoying.
So I don't think I can follow this track.
FTR, the implementation patch below is followed by a patch to the
testsuite, which gives an idea of what the usage would have looked like.
commit 1325d23f493c9c62ec7cb89444186a72ea76295c
Author: Akim Demaille <address@hidden>
Date: Sat Dec 22 17:48:36 2018 +0100
c++: merge symbol_type and token
So far, we expose two related, but different, types: parser::token and
parser::symbol_type.
The enum that defines the token types is parser::token::yytokentype;
parser::token serves only to keep the enumerators from "leaking" into
parser (when this was introduced, there were no enum classes in C++).
The external symbols (returned by yylex) are instances of
parser::symbol_type.
Now that symbol_type has clean and documented constructors, it feels
weird that parser::symbol_type and parser::token are two different
classes. This commit fuses them.
There is one technical problem: symbol_type would then define
yytokentype, yet its superclass (basic_symbol<by_type>) needs
yytokentype (in by_type). So actually define yytokentype in by_type
and expose it in token/symbol_type.
* data/c++.m4 (b4_token_enums): Define `token_type`, since that's the
exposed name.
To ensure backward compatibility, we will expose `yytokentype` as an
alias.
(b4_public_types_declare): No longer define `yytokentype`.
(by_type): Define `token_type`.
(symbol_type): Rename as...
(token): this.
(symbol_type): New alias, for backward compatibility.
* data/lalr1.cc: Use `token` instead of `symbol_type`.
* data/variant.hh: Likewise.
* tests/c++.at: Make sure yy::parser::token::yytokentype is still
visible.
* data/glr.cc: Since b4_public_types_declare no longer defines token,
do it by hand.
diff --git a/data/c++.m4 b/data/c++.m4
index 6a0604c0..aae28c88 100644
--- a/data/c++.m4
+++ b/data/c++.m4
@@ -162,7 +162,7 @@
m4_bpatsubst(m4_dquote(m4_bpatsubst(m4_dquote(b4_namespace_ref[ ]),
# --------------
# Output the definition of the tokens as enums.
m4_define([b4_token_enums],
-[[enum yytokentype
+[[enum token_type
{
]m4_join([,
],
@@ -218,15 +218,6 @@ m4_define([b4_public_types_declare],
location_type location;])[
};
- /// Tokens.
- struct token
- {
- ]b4_token_enums[
- };
-
- /// (External) token type, as returned by yylex.
- typedef token::yytokentype token_type;
-
/// Symbol type: an internal symbol number.
typedef int symbol_number_type;
@@ -300,6 +291,8 @@ m4_define([b4_symbol_type_declare],
/// Type access provider for token (enum) based symbols.
struct by_type
{
+ ]b4_token_enums[
+
/// Default constructor.
by_type ();
@@ -322,7 +315,7 @@ m4_define([b4_symbol_type_declare],
/// \a empty when empty.
symbol_number_type type_get () const YY_NOEXCEPT;
- /// The token.
+ /// The external token number.
token_type token () const YY_NOEXCEPT;
/// The symbol type.
@@ -332,17 +325,26 @@ m4_define([b4_symbol_type_declare],
};
/// "External" symbols: returned by the scanner.
- struct symbol_type : basic_symbol<by_type>
+ struct token : basic_symbol<by_type>
{]b4_variant_if([[
/// Superclass.
typedef basic_symbol<by_type> super_type;
+ /// Backward compatibility.
+ typedef token_type yytokentype;
+
/// Empty symbol.
- symbol_type () {};
+ token () {};
/// Constructor for valueless symbols, and symbols from each type.
]b4_type_foreach([_b4_token_constructor_declare])dnl
])[};
+
+ /// (External) token type, as returned by yylex.
+ typedef token::token_type token_type;
+
+ /// Backward compatible alias.
+ typedef token symbol_type;
]])
diff --git a/data/glr.cc b/data/glr.cc
index 0401b849..d183f003 100644
--- a/data/glr.cc
+++ b/data/glr.cc
@@ -267,6 +267,15 @@ b4_percent_code_get([[requires]])[
class ]b4_parser_class_name[
{
public:
+ /// Tokens.
+ struct token
+ {
+ ]b4_token_enums[
+
+ /// Backward compatibility.
+ typedef token_type yytokentype;
+ };
+
]b4_public_types_declare[
/// Build a parser object.
diff --git a/data/lalr1.cc b/data/lalr1.cc
index 21ec144f..36f01887 100644
--- a/data/lalr1.cc
+++ b/data/lalr1.cc
@@ -127,7 +127,7 @@ b4_dollar_popdef[]dnl
m4_define([b4_lex],
[b4_token_ctor_if(
[b4_function_call([yylex],
- [symbol_type], m4_ifdef([b4_lex_param], b4_lex_param))],
+ [token], m4_ifdef([b4_lex_param], b4_lex_param))],
[b4_function_call([yylex], [int],
[b4_api_PREFIX[STYPE*], [&yyla.value]][]dnl
b4_locations_if([, [[location*], [&yyla.location]]])dnl
@@ -230,7 +230,7 @@ m4_define([b4_shared_declarations],
/// \param yystate the state where the error occurred.
/// \param yyla the lookahead token.
virtual std::string yysyntax_error_ (state_type yystate,
- const symbol_type& yyla) const;
+ const token& yyla) const;
/// Compute post-reduction state.
/// \param yystate the current state
@@ -331,7 +331,7 @@ m4_define([b4_shared_declarations],
/// Move or copy construction.
stack_symbol_type (YY_RVREF (stack_symbol_type) that);
/// Steal the contents from \a sym to build this.
- stack_symbol_type (state_type s, YY_MOVE_REF (symbol_type) sym);
+ stack_symbol_type (state_type s, YY_MOVE_REF (token) sym);
#if YY_CPLUSPLUS < 201103L
/// Assignment, needed by push_back by some old implementations.
/// Moves the contents of that.
@@ -358,7 +358,7 @@ m4_define([b4_shared_declarations],
/// \param s the state
/// \param sym the symbol (for its value and location).
/// \warning the contents of \a sym.value is stolen.
- void yypush_ (const char* m, state_type s, YY_MOVE_REF (symbol_type) sym);
+ void yypush_ (const char* m, state_type s, YY_MOVE_REF (token) sym);
/// Pop \a n symbols from the stack.
void yypop_ (int n = 1);
@@ -614,7 +614,7 @@ m4_if(b4_prefix, [yy], [],
#endif
}
- ]b4_parser_class_name[::stack_symbol_type::stack_symbol_type (state_type s, YY_MOVE_REF (symbol_type) that)
+ ]b4_parser_class_name[::stack_symbol_type::stack_symbol_type (state_type s, YY_MOVE_REF (token) that)
: super_type (s]b4_variant_if([], [, YY_MOVE (that.value)])[]b4_locations_if([, YY_MOVE (that.location)])[)
{]b4_variant_if([
b4_symbol_variant([that.type_get ()],
@@ -679,12 +679,12 @@ m4_if(b4_prefix, [yy], [],
}
void
- ]b4_parser_class_name[::yypush_ (const char* m, state_type s, YY_MOVE_REF (symbol_type) sym)
+ ]b4_parser_class_name[::yypush_ (const char* m, state_type s, YY_MOVE_REF (token) tok)
{
#if 201103L <= YY_CPLUSPLUS
- yypush_ (m, stack_symbol_type (s, std::move (sym)));
+ yypush_ (m, stack_symbol_type (s, std::move (tok)));
#else
- stack_symbol_type ss (s, sym);
+ stack_symbol_type ss (s, tok);
yypush_ (m, ss);
#endif
}
@@ -763,7 +763,7 @@ m4_if(b4_prefix, [yy], [],
int yyerrstatus_ = 0;
/// The lookahead symbol.
- symbol_type yyla;]b4_locations_if([[
+ token yyla;]b4_locations_if([[
/// The locations where the error started and ended.
stack_symbol_type yyerror_range[3];]])[
@@ -819,7 +819,7 @@ b4_dollar_popdef])[]dnl
try
#endif // YY_EXCEPTIONS
{]b4_token_ctor_if([[
- symbol_type yylookahead (]b4_lex[);
+ token yylookahead (]b4_lex[);
yyla.move (yylookahead);]], [[
yyla.type = yytranslate_ (]b4_lex[);]])[
}
@@ -1083,8 +1083,8 @@ b4_dollar_popdef])[]dnl
// Generate an error message.
std::string
]b4_parser_class_name[::yysyntax_error_ (]dnl
-b4_error_verbose_if([state_type yystate, const symbol_type& yyla],
- [state_type, const symbol_type&])[) const
+b4_error_verbose_if([state_type yystate, const token& yyla],
+ [state_type, const token&])[) const
{]b4_error_verbose_if([[
// Number of reported tokens (one for the "unexpected", one per
// "expected").
diff --git a/data/variant.hh b/data/variant.hh
index 545060e3..07f59315 100644
--- a/data/variant.hh
+++ b/data/variant.hh
@@ -352,14 +352,14 @@ m4_define([_b4_token_maker_declare],
[b4_token_visible_if([$1],
[#if 201103L <= YY_CPLUSPLUS
static
- symbol_type
+ token
make_[]_b4_symbol([$1], [id]) (b4_join(
b4_symbol_if([$1], [has_type],
[b4_symbol([$1], [type]) v]),
b4_locations_if([location_type l])));
#else
static
- symbol_type
+ token
make_[]_b4_symbol([$1], [id]) (b4_join(
b4_symbol_if([$1], [has_type],
[const b4_symbol([$1], [type])& v]),
@@ -375,13 +375,13 @@ m4_define([_b4_token_maker_declare],
m4_define([_b4_token_constructor_declare],
[m4_ifval(_b4_includes_tokens($@),
[#if 201103L <= YY_CPLUSPLUS
- symbol_type (b4_join(
+ token (b4_join(
[int tok],
b4_symbol_if([$1], [has_type],
[b4_symbol([$1], [type]) v]),
b4_locations_if([location_type l])));
#else
- symbol_type (b4_join(
+ token (b4_join(
[int tok],
b4_symbol_if([$1], [has_type],
[const b4_symbol([$1], [type])& v]),
@@ -406,27 +406,27 @@ m4_define([_b4_token_maker_define],
[b4_token_visible_if([$1],
[#if 201103L <= YY_CPLUSPLUS
inline
- b4_parser_class_name::symbol_type
+ b4_parser_class_name::token
b4_parser_class_name::make_[]_b4_symbol([$1], [id]) (b4_join(
b4_symbol_if([$1], [has_type],
[b4_symbol([$1], [type]) v]),
b4_locations_if([location_type l])))
{
- return symbol_type (b4_join([token::b4_symbol([$1], [id])],
- b4_symbol_if([$1], [has_type], [std::move (v)]),
- b4_locations_if([std::move (l)])));
+ return {b4_join([token::b4_symbol([$1], [id])],
+ b4_symbol_if([$1], [has_type], [std::move (v)]),
+ b4_locations_if([std::move (l)]))};
}
#else
inline
- b4_parser_class_name::symbol_type
+ b4_parser_class_name::token
b4_parser_class_name::make_[]_b4_symbol([$1], [id]) (b4_join(
b4_symbol_if([$1], [has_type],
[const b4_symbol([$1], [type])& v]),
b4_locations_if([const location_type& l])))
{
- return symbol_type (b4_join([token::b4_symbol([$1], [id])],
- b4_symbol_if([$1], [has_type], [v]),
- b4_locations_if([l])));
+ return token (b4_join([token::b4_symbol([$1], [id])],
+ b4_symbol_if([$1], [has_type], [v]),
+ b4_locations_if([l])));
}
#endif
])])
@@ -446,7 +446,7 @@ m4_define([_b4_token_constructor_define],
[m4_ifval(_b4_includes_tokens($@),
[[#if 201103L <= YY_CPLUSPLUS
inline
- ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+ ]b4_parser_class_name[::token::token (]b4_join(
[int tok],
b4_symbol_if([$1], [has_type],
[b4_symbol([$1], [type]) v]),
@@ -459,7 +459,7 @@ m4_define([_b4_token_constructor_define],
}
#else
inline
- ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+ ]b4_parser_class_name[::token::token (]b4_join(
[int tok],
b4_symbol_if([$1], [has_type],
[const b4_symbol([$1], [type])& v]),
diff --git a/tests/c++.at b/tests/c++.at
index ee2e30cd..85c1fb6d 100644
--- a/tests/c++.at
+++ b/tests/c++.at
@@ -200,6 +200,7 @@ AT_PARSER_CHECK([./list], 0, [],
AT_BISON_OPTION_POPDEFS
AT_CLEANUP
+
## --------------------------------------------------- ##
## Multiple occurrences of $n and api.value.automove. ##
## --------------------------------------------------- ##
@@ -1315,7 +1316,13 @@ int yylex (yy::parser::semantic_type *lvalp)
// bug with a macro that erroneously expanded this identifier to
// yystackp->yyval.
YYUSE (lvalp);
- return yy::parser::token::ZERO;
+
+ // Check that yy::parser::token::yytokentype works. It was never documented,
+ // but it appears people have been depending on it, instead of using
+ // yy::parser::token_type.
+ yy::parser::token::yytokentype res = yy::parser::token::ZERO;
+
+ return res;
}
void yy::parser::error (std::string const&)
commit 3d0688ac6831984ecfa3d75eddb05d31a09d8c98
Author: Akim Demaille <address@hidden>
Date: Sun Dec 23 10:11:47 2018 +0100
WIP: tests: move to token instead of symbol_type.
* #: .
* #: .
diff --git a/tests/c++.at b/tests/c++.at
index 85c1fb6d..49e48693 100644
--- a/tests/c++.at
+++ b/tests/c++.at
@@ -145,10 +145,10 @@ exp: "int" { $$.push_back ($1); }
int main()
{
using yy::parser;
- // symbol_type: construction, accessor.
+ // token: construction, accessor.
{
- parser::symbol_type s = parser::make_INT(12);
- std::cerr << s.value.as<int>() << '\n';
+ parser::token t = parser::make_INT(12);
+ std::cerr << t.value.as<int>() << '\n';
}
// stack_symbol_type: construction, accessor.
@@ -156,8 +156,8 @@ int main()
#if defined __cplusplus && 201103L <= __cplusplus
auto ss = parser::stack_symbol_type(1, parser::make_INT(123));
#else
- parser::symbol_type s = parser::make_INT(123);
- parser::stack_symbol_type ss(1, s);
+ parser::token t = parser::make_INT(123);
+ parser::stack_symbol_type ss(1, t);
#endif
std::cerr << ss.value.as<int>() << '\n';
}
@@ -175,8 +175,8 @@ int main()
st.push(parser::stack_symbol_type{int_reduction_state,
parser::make_INT (i)});
#else
- parser::symbol_type s = parser::make_INT (i);
- parser::stack_symbol_type ss (int_reduction_state, s);
+ parser::token t = parser::make_INT (i);
+ parser::stack_symbol_type ss (int_reduction_state, t);
st.push (ss);
#endif
}
@@ -568,7 +568,7 @@ AT_DATA_GRAMMAR([[input.y]],
#include <iostream>
namespace yy
{
- static yy::parser::symbol_type yylex();
+ static yy::parser::token yylex();
}
}
@@ -590,7 +590,7 @@ expr:
%%
namespace yy
{
- parser::symbol_type yylex()
+ parser::token yylex()
{
static int loc = 0;
switch (loc++)
diff --git a/tests/local.at b/tests/local.at
index 146ed47b..67ad7755 100644
--- a/tests/local.at
+++ b/tests/local.at
@@ -252,7 +252,7 @@ AT_TOKEN_CTOR_IF(
[m4_pushdef([AT_LOC], [[(]AT_NAME_PREFIX[lloc)]])
m4_pushdef([AT_VAL], [[(]AT_NAME_PREFIX[lval)]])
m4_pushdef([AT_YYLEX_FORMALS], [])
- m4_pushdef([AT_YYLEX_RETURN], [yy::parser::symbol_type])
+ m4_pushdef([AT_YYLEX_RETURN], [yy::parser::token])
m4_pushdef([AT_YYLEX_ARGS], [])
m4_pushdef([AT_USE_LEX_ARGS], [])
m4_pushdef([AT_YYLEX_PRE_FORMALS], [])
diff --git a/tests/types.at b/tests/types.at
index e41c21b1..0ac03843 100644
--- a/tests/types.at
+++ b/tests/types.at
@@ -299,11 +299,11 @@ m4_foreach([b4_skel], [[yacc.c], [glr.c], [lalr1.cc], [glr.cc]],
<< $2.first << ':' << $2.second << '\n';
}],
["12"],
- [[typedef yy::parser::symbol_type symbol;
+ [[typedef yy::parser::token token;
if (res)
- return symbol (res, std::make_pair (res - '0', res - '0' + 1));
+ return token (res, std::make_pair (res - '0', res - '0' + 1));
else
- return symbol (res)]],
+ return token (res)]],
[1:2, 2:3])
# Move-only types, and variadic emplace.