Re: Rename variant and lex_symbols options
From: Akim Demaille
Subject: Re: Rename variant and lex_symbols options
Date: Thu, 23 Feb 2012 17:01:42 +0100
On 17 Feb 2012, at 03:57, Joel E. Denny wrote:
> Hi Akim.
Hi Joel!
Thanks a lot for your answer. I will try not to drop the ball
on this one this time. The more I think about it, the more
I wish we made more use of a ticketing system, either
Savannah's or the debbugs setup that the gnulib people seem
to be fond of.
>> * variant
>> The point of "variant" is to allow objects (not pointers
>> to objects) to be used to type the symbols in the C++
>> LR parser.
>>
>> So we have, for instance :
>>
>>> %token <::std::string> TEXT;
>>> %token <int> NUMBER;
>>>
>>> list:
>>> /* nothing */ { /* Generates an empty string list */ }
>>> | list item { std::swap ($$, $1); $$.push_back ($2); }
>>> ;
>>
>> It does have an influence on the API, since yylval can no
>> longer be used "simply". So it could be something like
>> api.symbols.variant, or api.values.variant...
>
> api seems reasonable given that it does affect the generated API exposed
> to the scanner, at least.
Yes, in the sense that it changes yystype. The test c++.at:variant
shows several ways to define yylval; with variants there are basically
three. The first two are equivalent as far as Bison is concerned,
but they differ for the user: either build with the default constructor
and then assign:
  yylval->build<std::string>() = yytext;
  *yylloc = location_type ();
  return token::TEXT;
or build directly from a value:
  yylval->build (yytext);
  *yylloc = location_type ();
  return token::TEXT;
If in addition you request lex_symbols, what the scanner returns is
no longer a triple in which the semantic value is completely
independent of the token kind, but a single object that binds the
type of the semantic value to the token kind:
  return yy::parser::make_TEXT (yytext, location_type ());
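
To make the first two styles concrete, here is a minimal C++ sketch of a
variant-like buffer. It is a toy stand-in, not the code Bison generates:
the names toy_value, as, destroy and the two scan_* helpers are made up
for the illustration, and the 64-byte buffer size is arbitrary. The second
helper passes a std::string explicitly so that the template parameter is
deduced as std::string rather than as a character pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Toy stand-in for a variant-based semantic value (hypothetical, not
// Bison's implementation): raw storage in which one object is built.
class toy_value
{
public:
  toy_value () = default;
  toy_value (const toy_value&) = delete;
  toy_value& operator= (const toy_value&) = delete;

  // Style 1: default-construct a T in the buffer and return it,
  // so that the caller can assign to it afterwards.
  template <typename T>
  T& build ()
  {
    static_assert (sizeof (T) <= sizeof buffer_, "buffer too small");
    return *new (buffer_) T ();
  }

  // Style 2: copy-construct a T in the buffer from an existing value.
  template <typename T>
  T& build (const T& v)
  {
    static_assert (sizeof (T) <= sizeof buffer_, "buffer too small");
    return *new (buffer_) T (v);
  }

  // Access the stored object; the caller must know its type.
  template <typename T>
  T& as () { return *reinterpret_cast<T*> (buffer_); }

  // Explicit destruction: the buffer does not record what it holds.
  template <typename T>
  void destroy () { as<T> ().~T (); }

private:
  alignas (std::max_align_t) unsigned char buffer_[64];
};

// Build with the default constructor, then assign.
inline std::string scan_default_then_assign (const char* text)
{
  toy_value v;
  v.build<std::string> () = text;
  std::string res = v.as<std::string> ();
  v.destroy<std::string> ();
  return res;
}

// Build directly from a value.
inline std::string scan_build_from_value (const char* text)
{
  toy_value v;
  v.build (std::string (text));
  std::string res = v.as<std::string> ();
  v.destroy<std::string> ();
  return res;
}
```

Both helpers end up with the same stored string; the difference is only in
how the scanner author writes the action, which matches the remark that the
two styles are equivalent for Bison.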
> I believe we had come to the conclusion that we should avoid Boolean
> variables from now on.
You are absolutely right! Thank you very much for this reminder,
I have to write it down somewhere (HACKING).
> The rationale was that we so often outgrow
> true|false with some other possibility we didn't originally think of.
> Could we have api.value = union|variant? Also, notice the use of singular
> as discussed below.
That's nice. I would also be very happy to stop
suggesting #define YYSTYPE double, and instead say
  %define api.value custom
  %define api.value.type double
or something like that. Maybe
  %define api.value <double>
I don't know. BTW, we dropped the "token" part; is this
on purpose?
  %define api.token.value?
Of course, there are not only tokens but also nterms:
  %define api.symbol.value
>> * lex_symbols
>> The point here is to provide an API to build the symbols
>> in such a way that it is not possible to return a semantic
>> value incompatible with the token kind (e.g.,
>> [0-9]+ yylval.sval = yytext; return INTEGER;):
>>
>> instead you write:
>>
>>> [0-9]+ return yy::parser::make_INTEGER(text_to_int (yytext), loc);
>>> [a-z]+ return yy::parser::make_IDENTIFIER(yytext, loc);
>>> ":" return yy::parser::make_COLON(loc);
>>
>> Again, it has an influence on the API, so maybe
>> api.tokens.constructors (we already have api.tokens.prefix
>> which probably should have been api.token.prefix),
>
> This is my fault. My logic was to use plural when there's more than one
> of something. However, I now see that always using singular is probably a
> simpler rule to remember and not really so misleading as I thought. For
> example, lr.default-reduction, lr.keep-unreachable-state, and
> api.token.prefix would all have been fine.
So I will deprecate api.tokens.prefix in favor of api.token.prefix, no worries.
I'll also check the others.
>> or api.token.object.
>>
>> Both are meant to be used together. Maybe actually I should
>> enforce this so that there are less combinations to check.
>
> So, lex_symbols can't be used without variant? Is it possible that might
> ever change? Sorry, I haven't studied the details.
You are right, nothing intrinsically forbids this. But then I would
really need more information than is currently provided to Bison.
In the example above:
> [a-z]+ return yy::parser::make_IDENTIFIER(yytext, loc);
> [0-9]+ return yy::parser::make_INTEGER(text_to_int (yytext), loc);
> ":" return yy::parser::make_COLON(loc);
all of these work because Bison can associate a value type with each token kind:
> %token <::std::string> IDENTIFIER;
> %token <int> INTEGER;
> %token COLON;
With %union, I can't. So maybe I need to introduce another
concept: what is used is not type tags (names of union fields),
but genuine types.
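
The difference between tags and types can be sketched in plain C++. This is
only an illustration under stated assumptions: legacy_value, symbol,
legacy_scan, callable_with and the simplified make_* functions (no location
argument) are hypothetical names, not what Bison emits. With tags, nothing
ties a union field to a token kind; with genuine types, each constructor's
parameter type is the declared value type, so a mismatch fails to compile.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

enum token_kind { INTEGER, IDENTIFIER };

// Tag style: <ival> and <sval> merely name fields of the union.
union legacy_value
{
  int ival;
  const char* sval;
};

// The mistake quoted above: the wrong field is set for INTEGER,
// yet the compiler accepts it.
inline token_kind legacy_scan (legacy_value& yylval, const char* yytext)
{
  yylval.sval = yytext;
  return INTEGER;
}

// Type style: one object carries both the kind and the value.
struct symbol
{
  token_kind kind;
  int ival;           // meaningful when kind == INTEGER
  std::string sval;   // meaningful when kind == IDENTIFIER
};

// One constructor per token kind, typed after "%token <int> INTEGER"
// and "%token <::std::string> IDENTIFIER".
inline symbol make_INTEGER (int v)
{
  return symbol{INTEGER, v, std::string ()};
}

inline symbol make_IDENTIFIER (const std::string& v)
{
  return symbol{IDENTIFIER, 0, v};
}

// Compile-time probe: can F be called with one argument of type Arg?
template <typename F, typename Arg, typename = void>
struct callable_with : std::false_type {};

template <typename F, typename Arg>
struct callable_with<F, Arg,
  decltype (void (std::declval<F> () (std::declval<Arg> ())))>
  : std::true_type {};

static_assert (callable_with<decltype (&make_INTEGER), int>::value,
               "an int is accepted for INTEGER");
static_assert (!callable_with<decltype (&make_INTEGER), const char*>::value,
               "raw text is rejected for INTEGER");
```

Generating such constructors requires knowing the parameter type of each
one, which a field name alone does not provide.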
Too many things for the sole api.value :(
Variants require types (not tags).
lex_symbols currently requires variants, but could require only types.
  %define api.symbol.value tag   // the default, <ival> denotes a field in YYSTYPE
  %define api.symbol.value type  // %token <int> INT
  %define api.symbol.value.type union    // the default, including for union
  %define api.symbol.value.type variant
  // Defaults to YYSTYPE
  %define api.symbol.value.type.name foo // instead of #define YYSTYPE foo.
Too many options, things start to blur :(
> If lex_symbols will only ever make sense with variant, then maybe we need
> to extend the api.value enum. variant-constructor? I'm not sure.
> If that doesn't seem right, I think api.token.constructor is fine. But
> again, should we avoid the Boolean so it can grow if necessary? Maybe
> none|variant, which would at least make it clearer that it's a companion
> for api.variant.