Re: Dynamic token kinds
From: Akim Demaille
Subject: Re: Dynamic token kinds
Date: Sat, 22 Dec 2018 14:59:29 +0100
Hi Frank!
> On 22 Dec 2018, at 01:14, Frank Heckenbach <address@hidden> wrote:
>
> Akim Demaille wrote:
>
>> I like this idea. I have a draft for it in my repo, as "make-symbol".
>> Please, try it and report about it.
>
> Again, sorry for the delay (still busy), but now I tried it
> (removing the "b4_parse_assert_if", see below).
Thanks for spending time on this. I really need feedback :)
> It seems to work for me. The only issue I had was due to sloppiness
> on my side. I'm only mentioning it in case others do the same.
> Basically, I had stored tokens of one specific semantic type in a
> look-up table together with tokens without semantic type (storing a
> dummy value in the table for the latter), and constructed the tokens
> for both in the same branch, exploiting the only case where a
> mismatch is inconsequential, i.e. setting a value and not using it.
> This worked before, but the stricter checks now (correctly) caught
> it.
Great news! It works!
> To actually allow this, you could have the typed constructors all
> accept the typeless tokens as well, but I don't consider that really
> necessary. Unless you want to support that for backward (bugward?)
> compatibility, I'll just change my code to make two separate
> make_symbol calls.
Yes, I prefer it this way. The whole point of my work on C++'s
symbols so far is really to be type safe(r).
>> There are a few issues:
>> - make_symbol will collide if the user has a token named symbol
>> Any idea of a better name?
>
> To avoid such collisions, I think we have to avoid the "make_"
> prefix entirely. Maybe "build_symbol"?
>
>> Or simply make them actual constructors for `symbol_type`.
>
> Yes, if they are (documented as) public. I think I'd prefer this as
> I wouldn't have to change my code from 3.2.2.
See below, I have a working draft that completely replaces
make_symbol by "merging" the assert-based type checking into
the symbol_type constructors. Since that makes the ctors safe,
I'm fine with exposing them.
I wish it required fewer changes. In particular,
it tears symbol_type and stack_symbol_type even further
apart. My CRTP might no longer fully make sense; maybe I'll
get rid of it at some point.
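To make the idea concrete, here is a minimal, self-contained sketch of
constructors that check the token against the value type with an
assertion. It is not the generated code: sketch_symbol, the TOK_*
enumerators and the plain members standing in for the variant are all
hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical token kinds; a real parser would get these from Bison.
enum token_kind { TOK_EOF = 0, TOK_ID = 258, TOK_INT = 259 };

// Hypothetical stand-in for the generated symbol_type.
struct sketch_symbol
{
  int kind;
  std::string text;  // meaningful only for TOK_ID
  int number;        // meaningful only for TOK_INT

  // Valueless tokens: end of input, or a character token such as ':'.
  explicit sketch_symbol (int tok)
    : kind (tok), text (), number (0)
  {
    assert (tok == TOK_EOF || tok == ':');
  }

  // Tokens carrying a std::string.
  sketch_symbol (int tok, std::string v)
    : kind (tok), text (std::move (v)), number (0)
  {
    assert (tok == TOK_ID);
  }

  // Tokens carrying an int.
  sketch_symbol (int tok, int v)
    : kind (tok), text (), number (v)
  {
    assert (tok == TOK_INT);
  }
};
```

With these overloads, sketch_symbol (TOK_ID, std::string ("abc")) passes
the check, whereas sketch_symbol (':', std::string ("abc")) still compiles
but aborts at run time, unless NDEBUG is defined.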
>> - In the signature of make_symbol, I've had to use int for the
>> token type, instead of the enum token_type, and then
>> to convert the int into token_type. I don't like that, but I
>> probably don't have much of a choice. (A template would be
>> overkill IMHO).
>
> Why, is it because of char tokens (like '+' in your example)?
Yes, exactly. I don't like that we accept ASCII, but we have
to.
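A small sketch of why the parameter has to be an int, assuming an
unscoped enum similar to the one Bison generates (the names token_type,
END, ID and to_token_type below are hypothetical): a character literal
such as '+' converts implicitly to int but not to an enum type, so the
conversion has to be made explicitly inside the function.

```cpp
#include <cassert>

// Hypothetical enum: named tokens are numbered above the char range.
enum token_type { END = 0, ID = 258 };

// Taking an int accepts both enumerators and character literals;
// the explicit cast to the enum happens here, in one place.
inline token_type to_token_type (int tok)
{
  return static_cast<token_type> (tok);
}
```

Had the parameter been declared as token_type, a call such as
to_token_type ('+') would not compile, since C++ has no implicit
conversion from char to an enumeration.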
Here is the patch, twice. I want to keep the previous one
(with make_symbol) in the git history, so the second patch below
shows the actual commit, relative to make_symbol. But I think
the patch relative to the state before make_symbol reads better (it's
the first one below).
Now that I have done this, I think I can merge two different
types that currently exist in yy::parser: token and symbol_type.
The first one is used _only_ to define the enum of the various
token types, and the second one, well, implements the tokens. And
'token' is actually a better name than 'symbol_type'.
Of course, I will leave a type alias for symbol_type.
WDYT?
Currently in https://github.com/akimd/bison/tree/make-symbol.
commit 5e8571708e34b3e69a8182a88057199e7bf63568
Author: Akim Demaille <address@hidden>
Date: Wed Dec 19 17:51:10 2018 +0100
c++: exhibit a safe symbol_type
Instead of introducing make_symbol (whose name, btw, somewhat
infringes on the user's "name space", if she defines a token named
"symbol"), let's make the construction of symbol_type safer, using
assertions.
For instance with:
%token ':' <std::string> ID <int> INT;
generate:
symbol_type (int token, const std::string&);
symbol_type (int token, const int&);
symbol_type (int token);
It does mean that now named token constructors (make_ID, make_INT,
etc.) go through a useless assert, but I think we can ignore this: I
assume any decent compiler will inline the symbol_type ctor inside the
make_TOKEN functions, which will show that the assert is trivially
verified, hence I expect no code will be emitted for it. And anyway,
that's an assert, NDEBUG controls it.
* data/c++.m4 (symbol_type): Turn into a subclass of
basic_symbol<by_type>.
Declare symbol constructors when variants are enabled.
* data/variant.hh (_b4_type_constructor_declare)
(_b4_type_constructor_define): Replace with...
(_b4_symbol_constructor_declare, _b4_symbol_constructor_def): these.
Generate symbol_type constructors.
* doc/bison.texi (Complete Symbols): Document.
* tests/types.at: Check.
diff --git a/NEWS b/NEWS
index 08d99f19..c67fb142 100644
--- a/NEWS
+++ b/NEWS
@@ -96,10 +96,36 @@ GNU Bison NEWS
until it sees the '='. So we notate the two possible reductions to
indicate that each conflicts in one rule.
+*** C++: Actual token constructors
+
+ When variants and token constructors are enabled, in addition to the
+ type-safe named token constructors (make_ID, make_INT, etc.), we now
+ generate genuine constructors for symbol_type.
+
+ For instance with these declarations
+
+ %token ':'
+ <std::string> ID
+ <int> INT;
+
+ you may use these constructors:
+
+ symbol_type (int token, const std::string&);
+ symbol_type (int token, const int&);
+ symbol_type (int token);
+
+ which should be used in a Flex-scanner as follows.
+
+ %%
+ [a-z]+ return yy::parser::symbol_type (ID, yytext);
+ [0-9]+ return yy::parser::symbol_type (INT, text_to_int (yytext));
+ ":" return yy::parser::symbol_type (':');
+ <<EOF>> return yy::parser::symbol_type (0);
+
*** C++: Variadic emplace
- If your application requires C++11, you may now use a variadic emplace for
- semantic values:
+ If your application requires C++11 and you don't use symbol constructors,
+ you may now use a variadic emplace for semantic values:
%define api.value.type variant
%token <std::pair<int, int>> PAIR
diff --git a/data/c++.m4 b/data/c++.m4
index b4f56add..eb5c47f0 100644
--- a/data/c++.m4
+++ b/data/c++.m4
@@ -332,7 +332,17 @@ m4_define([b4_symbol_type_declare],
};
/// "External" symbols: returned by the scanner.
- typedef basic_symbol<by_type> symbol_type;
+ struct symbol_type : basic_symbol<by_type>
+ {]b4_variant_if([[
+ /// Superclass.
+ typedef basic_symbol<by_type> super_type;
+
+ /// Empty symbol.
+ symbol_type () {};
+
+ /// Constructor for valueless symbols, and symbols from each type.
+]b4_type_foreach([_b4_symbol_constructor_declare])[
+ ]])[};
]])
diff --git a/data/variant.hh b/data/variant.hh
index 836616a6..4e036d1e 100644
--- a/data/variant.hh
+++ b/data/variant.hh
@@ -335,6 +335,16 @@ m4_define([b4_symbol_value_template],
## ------------- ##
+# _b4_includes_tokens(SYMBOL-NUM...)
+# ----------------------------------
+# Expands to non-empty iff one of the SYMBOL-NUM denotes
+# a token.
+m4_define([_b4_is_token],
+ [b4_symbol_if([$1], [is_token], [1])])
+m4_define([_b4_includes_tokens],
+ [m4_map([_b4_is_token], address@hidden)])
+
+
# _b4_token_maker_declare(SYMBOL-NUM)
# -----------------------------------
# Declare make_SYMBOL for SYMBOL-NUM. Use at class-level.
@@ -358,10 +368,31 @@ m4_define([_b4_token_maker_declare],
])])
+# _b4_symbol_constructor_declare(SYMBOL-NUM...)
+# ---------------------------------------------
+# Declare a unique make_symbol for all the SYMBOL-NUM (they
+# have the same type). Use at class-level.
+m4_define([_b4_symbol_constructor_declare],
+[m4_ifval(_b4_includes_tokens($@),
+[#if 201103L <= YY_CPLUSPLUS
+ symbol_type (b4_join(
+ [int tok],
+ b4_symbol_if([$1], [has_type],
+ [b4_symbol([$1], [type]) v]),
+ b4_locations_if([location_type l])));
+#else
+ symbol_type (b4_join(
+ [int tok],
+ b4_symbol_if([$1], [has_type],
+ [const b4_symbol([$1], [type])& v]),
+ b4_locations_if([const location_type& l])));
+#endif
+])])
+
+
# b4_symbol_constructor_declare
# -----------------------------
-# Declare symbol constructors for all the value types.
-# Use at class-level.
+# Declare symbol constructors. Use at class-level.
m4_define([b4_symbol_constructor_declare],
[ // Symbol constructors declarations.
b4_symbol_foreach([_b4_token_maker_declare])])
@@ -401,6 +432,48 @@ m4_define([_b4_token_maker_define],
])])
+# _b4_symbol_constructor_define(SYMBOL-NUM...)
+# --------------------------------------------
+# Declare a unique make_symbol for all the SYMBOL-NUM (they
+# have the same type). Use at class-level.
+m4_define([_b4_type_clause],
+[b4_symbol_if([$1], [is_token],
+ [b4_symbol_if([$1], [has_id],
+ [tok == token::b4_symbol([$1], [id])],
+ [tok == b4_symbol([$1], [user_number])])])])
+
+m4_define([_b4_symbol_constructor_define],
+[m4_ifval(_b4_includes_tokens($@),
+[[#if 201103L <= YY_CPLUSPLUS
+ inline
+ ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+ [int tok],
+ b4_symbol_if([$1], [has_type],
+ [b4_symbol([$1], [type]) v]),
+ b4_locations_if([location_type l]))[)
+ : super_type(]b4_join([token_type (tok)],
+ b4_symbol_if([$1], [has_type], [std::move (v)]),
+ b4_locations_if([std::move (l)]))[)
+ {
+ YYASSERT (]m4_join([ || ], m4_map_sep([_b4_type_clause], [, ],
address@hidden))[);
+ }
+#else
+ inline
+ ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+ [int tok],
+ b4_symbol_if([$1], [has_type],
+ [const b4_symbol([$1], [type])& v]),
+ b4_locations_if([const location_type& l]))[)
+ : super_type(]b4_join([token_type (tok)],
+ b4_symbol_if([$1], [has_type], [v]),
+ b4_locations_if([l]))[)
+ {
+ YYASSERT (]m4_join([ || ], m4_map_sep([_b4_type_clause], [, ],
address@hidden))[);
+ }
+#endif
+]])])
+
+
# b4_basic_symbol_constructor_declare(SYMBOL-NUM)
# -----------------------------------------------
# Generate a constructor declaration for basic_symbol from given type.
@@ -452,4 +525,5 @@ m4_define([b4_basic_symbol_constructor_define],
# Define the overloaded versions of make_symbol for all the value types.
m4_define([b4_symbol_constructor_define],
[ // Implementation of make_symbol for each symbol type.
+b4_type_foreach([_b4_symbol_constructor_define])
b4_symbol_foreach([_b4_token_maker_define])])
diff --git a/doc/bison.texi b/doc/bison.texi
index 89283be7..e1a5aaba 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -11500,6 +11500,57 @@ additional arguments.
For each token type, Bison generates named constructors as follows.
address@hidden {Constructor} {parser::symbol_type} {} {symbol_type} (int
@var{token}, const @var{value_type}& @var{value}, const location_type&
@var{location})
address@hidden {Constructor} {parser::symbol_type} {} {symbol_type} (int
@var{token}, const location_type& @var{location})
address@hidden {Constructor} {parser::symbol_type} {} {symbol_type} (int
@var{token}, const @var{value_type}& @var{value})
address@hidden {Constructor} {parser::symbol_type} {} {symbol_type} (int
@var{token})
+Build a complete terminal symbol for the token type @var{token} (including
+the @code{api.token.prefix}), whose semantic value, if it has one, is
address@hidden of adequate @var{value_type}. Pass the @var{location} iff
+location tracking is enabled.
+
+Consistency between @var{token} and @var{value_type} is checked via an
address@hidden
address@hidden deftypeop
+
+For instance, given the following declarations:
+
address@hidden
+%define api.token.prefix @address@hidden
+%token <std::string> IDENTIFIER;
+%token <int> INTEGER;
+%token ':';
address@hidden example
+
address@hidden
+you may use these constructors:
+
address@hidden
+symbol_type (int token, const std::string&, const location_type&);
+symbol_type (int token, const int&, const location_type&);
+symbol_type (int token, const location_type&);
address@hidden example
+
address@hidden
+which should be used in a Flex-scanner as follows.
+
address@hidden
+%%
+[a-z]+ return yy::parser::symbol_type (TOK_IDENTIFIER, yytext, loc);
+[0-9]+ return yy::parser::symbol_type (TOK_INTEGER, text_to_int (yytext),
loc);
+":" return yy::parser::symbol_type (':', loc);
+<<EOF>> return yy::parser::symbol_type (0, loc);
address@hidden example
+
address@hidden 1
+
+Note that it is possible to generate and compile type incorrect code
+(e.g. @samp{symbol_type (':', yytext, loc)}). It will fail at run time,
+provided the assertions are enabled (i.e., @option{-DNDEBUG} was not passed
+to the compiler). Bison supports an alternative that guarantees that type
+incorrect code will not even compile. Indeed, it generates @emph{named
+constructors} as follows.
+
@deftypemethod {parser} {symbol_type} address@hidden (const @var{value_type}&
@var{value}, const location_type& @var{location})
@deftypemethodx {parser} {symbol_type} address@hidden (const location_type&
@var{location})
@deftypemethodx {parser} {symbol_type} address@hidden (const @var{value_type}&
@var{value})
@@ -11531,7 +11582,7 @@ symbol_type make_EOF (const location_type&);
@end example
@noindent
-which should be used in a Flex-scanner as follows.
+which should be used in a scanner as follows.
@example
[a-z]+ return yy::parser::make_IDENTIFIER (yytext, loc);
@@ -11544,6 +11595,7 @@ Tokens that do not have an identifier are not
accessible: you cannot simply
use characters such as @code{':'}, they must be declared with @code{%token},
including the end-of-file token.
+
@node A Complete C++ Example
@subsection A Complete C++ Example
diff --git a/tests/types.at b/tests/types.at
index 2924ec18..e41c21b1 100644
--- a/tests/types.at
+++ b/tests/types.at
@@ -288,6 +288,24 @@ m4_foreach([b4_skel], [[yacc.c], [glr.c], [lalr1.cc],
[glr.cc]],
AT_VAL.build (std::pair<std::string, std::string> ("two",
"deux"));],
[10:11, two:deux])
+ # Type-based token constructors on move-only types, and types with commas.
+ AT_TEST([%skeleton "]b4_skel["
+ %define api.value.type variant
+ %define api.token.constructor],
+ [[%token <std::pair<int, int>> '1' '2';]],
+ ['1' '2'
+ {
+ std::cout << $1.first << ':' << $1.second << ", "
+ << $2.first << ':' << $2.second << '\n';
+ }],
+ ["12"],
+ [[typedef yy::parser::symbol_type symbol;
+ if (res)
+ return symbol (res, std::make_pair (res - '0', res - '0' + 1));
+ else
+ return symbol (res)]],
+ [1:2, 2:3])
+
# Move-only types, and variadic emplace.
AT_TEST([%skeleton "]b4_skel["
%code requires { #include <memory> }
@@ -325,6 +343,25 @@ m4_foreach([b4_skel], [[yacc.c], [glr.c], [lalr1.cc],
[glr.cc]],
[10, 21:22],
[AT_REQUIRE_CXX_STD(14, [echo "$at_std not supported"; continue])])
+ # Type-based token constructors on move-only types, and types with commas.
+ AT_TEST([%skeleton "]b4_skel["
+ %code requires { #include <memory> }
+ %define api.value.type variant
+ %define api.token.constructor],
+ [[%token <std::unique_ptr<int>> '1';
+ %token <std::pair<int, int>> '2';]],
+ ['1' '2' { std::cout << *$1 << ", "
+ << $2.first << ':' << $2.second << '\n'; }],
+ ["12"],
+ [[if (res == '1')
+ return {res, std::make_unique<int> (10)};
+ else if (res == '2')
+ return {res, std::make_pair (21, 22)};
+ else
+ return res]],
+ [10, 21:22],
+ [AT_REQUIRE_CXX_STD(14, [echo "$at_std not supported"; continue])])
+
])
])
commit 5e8571708e34b3e69a8182a88057199e7bf63568
Author: Akim Demaille <address@hidden>
Date: Wed Dec 19 17:51:10 2018 +0100
c++: exhibit a safe symbol_type
Instead of introducing make_symbol (whose name, btw, somewhat
infringes on the user's "name space", if she defines a token named
"symbol"), let's make the construction of symbol_type safer, using
assertions.
For instance with:
%token ':' <std::string> ID <int> INT;
generate:
symbol_type (int token, const std::string&);
symbol_type (int token, const int&);
symbol_type (int token);
It does mean that now named token constructors (make_ID, make_INT,
etc.) go through a useless assert, but I think we can ignore this: I
assume any decent compiler will inline the symbol_type ctor inside the
make_TOKEN functions, which will show that the assert is trivially
verified, hence I expect no code will be emitted for it. And anyway,
that's an assert, NDEBUG controls it.
* data/c++.m4 (symbol_type): Turn into a subclass of
basic_symbol<by_type>.
Declare symbol constructors when variants are enabled.
* data/variant.hh (_b4_type_constructor_declare)
(_b4_type_constructor_define): Replace with...
(_b4_symbol_constructor_declare, _b4_symbol_constructor_def): these.
Generate symbol_type constructors.
* doc/bison.texi (Complete Symbols): Document.
* tests/types.at: Check.
diff --git a/NEWS b/NEWS
index 08d99f19..c67fb142 100644
--- a/NEWS
+++ b/NEWS
@@ -96,10 +96,36 @@ GNU Bison NEWS
until it sees the '='. So we notate the two possible reductions to
indicate that each conflicts in one rule.
+*** C++: Actual token constructors
+
+ When variants and token constructors are enabled, in addition to the
+ type-safe named token constructors (make_ID, make_INT, etc.), we now
+ generate genuine constructors for symbol_type.
+
+ For instance with these declarations
+
+ %token ':'
+ <std::string> ID
+ <int> INT;
+
+ you may use these constructors:
+
+ symbol_type (int token, const std::string&);
+ symbol_type (int token, const int&);
+ symbol_type (int token);
+
+ which should be used in a Flex-scanner as follows.
+
+ %%
+ [a-z]+ return yy::parser::symbol_type (ID, yytext);
+ [0-9]+ return yy::parser::symbol_type (INT, text_to_int (yytext));
+ ":" return yy::parser::symbol_type (':');
+ <<EOF>> return yy::parser::symbol_type (0);
+
*** C++: Variadic emplace
- If your application requires C++11, you may now use a variadic emplace for
- semantic values:
+ If your application requires C++11 and you don't use symbol constructors,
+ you may now use a variadic emplace for semantic values:
%define api.value.type variant
%token <std::pair<int, int>> PAIR
diff --git a/data/c++.m4 b/data/c++.m4
index b4f56add..eb5c47f0 100644
--- a/data/c++.m4
+++ b/data/c++.m4
@@ -332,7 +332,17 @@ m4_define([b4_symbol_type_declare],
};
/// "External" symbols: returned by the scanner.
- typedef basic_symbol<by_type> symbol_type;
+ struct symbol_type : basic_symbol<by_type>
+ {]b4_variant_if([[
+ /// Superclass.
+ typedef basic_symbol<by_type> super_type;
+
+ /// Empty symbol.
+ symbol_type () {};
+
+ /// Constructor for valueless symbols, and symbols from each type.
+]b4_type_foreach([_b4_symbol_constructor_declare])[
+ ]])[};
]])
diff --git a/data/variant.hh b/data/variant.hh
index 22832248..4e036d1e 100644
--- a/data/variant.hh
+++ b/data/variant.hh
@@ -368,25 +368,21 @@ m4_define([_b4_token_maker_declare],
])])
-# _b4_type_constructor_declare(SYMBOL-NUM...)
-# -------------------------------------------
+# _b4_symbol_constructor_declare(SYMBOL-NUM...)
+# ---------------------------------------------
# Declare a unique make_symbol for all the SYMBOL-NUM (they
# have the same type). Use at class-level.
-m4_define([_b4_type_constructor_declare],
+m4_define([_b4_symbol_constructor_declare],
[m4_ifval(_b4_includes_tokens($@),
[#if 201103L <= YY_CPLUSPLUS
- static
- symbol_type
- make_symbol (dnl
-b4_join([int tok],
+ symbol_type (b4_join(
+ [int tok],
b4_symbol_if([$1], [has_type],
[b4_symbol([$1], [type]) v]),
b4_locations_if([location_type l])));
#else
- static
- symbol_type
- make_symbol (dnl
-b4_join([int tok],
+ symbol_type (b4_join(
+ [int tok],
b4_symbol_if([$1], [has_type],
[const b4_symbol([$1], [type])& v]),
b4_locations_if([const location_type& l])));
@@ -399,7 +395,6 @@ b4_join([int tok],
# Declare symbol constructors. Use at class-level.
m4_define([b4_symbol_constructor_declare],
[ // Symbol constructors declarations.
-b4_type_foreach([_b4_type_constructor_declare])
b4_symbol_foreach([_b4_token_maker_declare])])
@@ -437,8 +432,8 @@ m4_define([_b4_token_maker_define],
])])
-# _b4_type_constructor_define(SYMBOL-NUM...)
-# ------------------------------------------
+# _b4_symbol_constructor_define(SYMBOL-NUM...)
+# --------------------------------------------
# Declare a unique make_symbol for all the SYMBOL-NUM (they
# have the same type). Use at class-level.
m4_define([_b4_type_clause],
@@ -447,38 +442,36 @@ m4_define([_b4_type_clause],
[tok == token::b4_symbol([$1], [id])],
[tok == b4_symbol([$1], [user_number])])])])
-m4_define([_b4_type_constructor_define],
+m4_define([_b4_symbol_constructor_define],
[m4_ifval(_b4_includes_tokens($@),
-[#if 201103L <= YY_CPLUSPLUS
+[[#if 201103L <= YY_CPLUSPLUS
inline
- b4_parser_class_name::symbol_type
- b4_parser_class_name::make_symbol (dnl
-b4_join([int tok],
+ ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+ [int tok],
b4_symbol_if([$1], [has_type],
[b4_symbol([$1], [type]) v]),
- b4_locations_if([location_type l])))
- {b4_parse_assert_if([
- assert (m4_join([ || ], m4_map_sep([_b4_type_clause], [, ],
address@hidden)));])[
- return symbol_type (]b4_join([token_type (tok)],
- b4_symbol_if([$1], [has_type], [std::move
(v)]),
- b4_locations_if([std::move (l)])));
+ b4_locations_if([location_type l]))[)
+ : super_type(]b4_join([token_type (tok)],
+ b4_symbol_if([$1], [has_type], [std::move (v)]),
+ b4_locations_if([std::move (l)]))[)
+ {
+ YYASSERT (]m4_join([ || ], m4_map_sep([_b4_type_clause], [, ],
address@hidden))[);
}
#else
inline
- b4_parser_class_name::symbol_type
- b4_parser_class_name::make_symbol (dnl
-b4_join([int tok],
+ ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+ [int tok],
b4_symbol_if([$1], [has_type],
[const b4_symbol([$1], [type])& v]),
- b4_locations_if([const location_type& l])))
- {b4_parse_assert_if([
- assert (m4_join([ || ], m4_map_sep([_b4_type_clause], [, ],
address@hidden)));])[
- return symbol_type (]b4_join([token_type (tok)],
- b4_symbol_if([$1], [has_type], [v]),
- b4_locations_if([l])));
+ b4_locations_if([const location_type& l]))[)
+ : super_type(]b4_join([token_type (tok)],
+ b4_symbol_if([$1], [has_type], [v]),
+ b4_locations_if([l]))[)
+ {
+ YYASSERT (]m4_join([ || ], m4_map_sep([_b4_type_clause], [, ],
address@hidden))[);
}
#endif
-])])
+]])])
# b4_basic_symbol_constructor_declare(SYMBOL-NUM)
@@ -532,5 +525,5 @@ m4_define([b4_basic_symbol_constructor_define],
# Define the overloaded versions of make_symbol for all the value types.
m4_define([b4_symbol_constructor_define],
[ // Implementation of make_symbol for each symbol type.
-b4_type_foreach([_b4_type_constructor_define])
+b4_type_foreach([_b4_symbol_constructor_define])
b4_symbol_foreach([_b4_token_maker_define])])
diff --git a/doc/bison.texi b/doc/bison.texi
index 89283be7..e1a5aaba 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -11500,6 +11500,57 @@ additional arguments.
For each token type, Bison generates named constructors as follows.
address@hidden {Constructor} {parser::symbol_type} {} {symbol_type} (int
@var{token}, const @var{value_type}& @var{value}, const location_type&
@var{location})
address@hidden {Constructor} {parser::symbol_type} {} {symbol_type} (int
@var{token}, const location_type& @var{location})
address@hidden {Constructor} {parser::symbol_type} {} {symbol_type} (int
@var{token}, const @var{value_type}& @var{value})
address@hidden {Constructor} {parser::symbol_type} {} {symbol_type} (int
@var{token})
+Build a complete terminal symbol for the token type @var{token} (including
+the @code{api.token.prefix}), whose semantic value, if it has one, is
address@hidden of adequate @var{value_type}. Pass the @var{location} iff
+location tracking is enabled.
+
+Consistency between @var{token} and @var{value_type} is checked via an
address@hidden
address@hidden deftypeop
+
+For instance, given the following declarations:
+
address@hidden
+%define api.token.prefix @address@hidden
+%token <std::string> IDENTIFIER;
+%token <int> INTEGER;
+%token ':';
address@hidden example
+
address@hidden
+you may use these constructors:
+
address@hidden
+symbol_type (int token, const std::string&, const location_type&);
+symbol_type (int token, const int&, const location_type&);
+symbol_type (int token, const location_type&);
address@hidden example
+
address@hidden
+which should be used in a Flex-scanner as follows.
+
address@hidden
+%%
+[a-z]+ return yy::parser::symbol_type (TOK_IDENTIFIER, yytext, loc);
+[0-9]+ return yy::parser::symbol_type (TOK_INTEGER, text_to_int (yytext),
loc);
+":" return yy::parser::symbol_type (':', loc);
+<<EOF>> return yy::parser::symbol_type (0, loc);
address@hidden example
+
address@hidden 1
+
+Note that it is possible to generate and compile type incorrect code
+(e.g. @samp{symbol_type (':', yytext, loc)}). It will fail at run time,
+provided the assertions are enabled (i.e., @option{-DNDEBUG} was not passed
+to the compiler). Bison supports an alternative that guarantees that type
+incorrect code will not even compile. Indeed, it generates @emph{named
+constructors} as follows.
+
@deftypemethod {parser} {symbol_type} address@hidden (const @var{value_type}&
@var{value}, const location_type& @var{location})
@deftypemethodx {parser} {symbol_type} address@hidden (const location_type&
@var{location})
@deftypemethodx {parser} {symbol_type} address@hidden (const @var{value_type}&
@var{value})
@@ -11531,7 +11582,7 @@ symbol_type make_EOF (const location_type&);
@end example
@noindent
-which should be used in a Flex-scanner as follows.
+which should be used in a scanner as follows.
@example
[a-z]+ return yy::parser::make_IDENTIFIER (yytext, loc);
@@ -11544,6 +11595,7 @@ Tokens that do not have an identifier are not
accessible: you cannot simply
use characters such as @code{':'}, they must be declared with @code{%token},
including the end-of-file token.
+
@node A Complete C++ Example
@subsection A Complete C++ Example
diff --git a/tests/types.at b/tests/types.at
index bead23d0..e41c21b1 100644
--- a/tests/types.at
+++ b/tests/types.at
@@ -288,6 +288,24 @@ m4_foreach([b4_skel], [[yacc.c], [glr.c], [lalr1.cc],
[glr.cc]],
AT_VAL.build (std::pair<std::string, std::string> ("two",
"deux"));],
[10:11, two:deux])
+ # Type-based token constructors on move-only types, and types with commas.
+ AT_TEST([%skeleton "]b4_skel["
+ %define api.value.type variant
+ %define api.token.constructor],
+ [[%token <std::pair<int, int>> '1' '2';]],
+ ['1' '2'
+ {
+ std::cout << $1.first << ':' << $1.second << ", "
+ << $2.first << ':' << $2.second << '\n';
+ }],
+ ["12"],
+ [[typedef yy::parser::symbol_type symbol;
+ if (res)
+ return symbol (res, std::make_pair (res - '0', res - '0' + 1));
+ else
+ return symbol (res)]],
+ [1:2, 2:3])
+
# Move-only types, and variadic emplace.
AT_TEST([%skeleton "]b4_skel["
%code requires { #include <memory> }
@@ -336,11 +354,11 @@ m4_foreach([b4_skel], [[yacc.c], [glr.c], [lalr1.cc],
[glr.cc]],
<< $2.first << ':' << $2.second << '\n'; }],
["12"],
[[if (res == '1')
- return yy::parser::make_symbol ('1', std::make_unique<int>
(10));
+ return {res, std::make_unique<int> (10)};
else if (res == '2')
- return yy::parser::make_symbol ('2', std::make_pair (21, 22));
+ return {res, std::make_pair (21, 22)};
else
- return yy::parser::make_symbol (0)]],
+ return res]],
[10, 21:22],
[AT_REQUIRE_CXX_STD(14, [echo "$at_std not supported"; continue])])