bug-bison

Re: [PATCH] token value 0 is ignored


From: Akim Demaille
Subject: Re: [PATCH] token value 0 is ignored
Date: 29 Oct 2001 17:01:46 +0100
User-agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Artificial Intelligence)

>>>>> "Dick" == Dick Streefland <address@hidden> writes:

Dick> I tried to build openmotif, using "bison -y" instead of yacc,
Dick> but one of the tools (uil) didn't work. The problem is that the
Dick> grammar file Uil.y contains the following token definition:

Dick>   %token UILEOF 0

Dick> Bison ignores the 0, and assigns token value 257. 

Well, I agree it should not renumber it, at least not silently, but
your patch is wrong.  You seem to have forgotten the calling convention
with yylex, which the author of the %token line above did not, judging
by the name of the token:

Calling Convention for `yylex'
------------------------------

   The value that `yylex' returns must be the numeric code for the
   type of token it has just found, or 0 for end-of-input.

   When a token is referred to in the grammar rules by a name, that name
in the parser file becomes a C macro whose definition is the proper
numeric code for that token type.  So `yylex' can use the name to
indicate that type.  *Note Symbols::.

   When a token is referred to in the grammar rules by a character
literal, the numeric code for that character is also the code for the
token type.  So `yylex' can simply return that character code.  The
null character must not be used this way, because its code is zero and
that is what signifies end-of-input.

   Here is an example showing these things:

     yylex ()
     {
       ...
       if (c == EOF)     /* Detect end of file. */
         return 0;
       ...
       if (c == '+' || c == '-')
         return c;      /* Assume token type for `+' is '+'. */
       ...
       return INT;      /* Return the type of the token. */
       ...
     }

This interface has been designed so that the output from the `lex'
utility can be used without change as the definition of `yylex'.
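
A minimal, self-contained sketch of this convention follows.  It is not
taken from the manual or from Uil.y: the `input' string, the global
`yylval' of type int, and the value 258 for `INT' are assumptions made
for illustration (a real parser gets its token codes from the file
Bison generates).

```c
#include <ctype.h>

/* Hypothetical token code; Bison normally defines this for you. */
enum { INT = 258 };

/* Assumed input source and semantic value, for illustration only. */
static const char *input = "12+3";
static int yylval;

int
yylex (void)
{
  while (*input == ' ')            /* Skip blanks. */
    input++;

  if (*input == '\0')              /* End of input: return 0. */
    return 0;

  if (isdigit ((unsigned char) *input))
    {
      yylval = 0;
      while (isdigit ((unsigned char) *input))
        yylval = yylval * 10 + (*input++ - '0');
      return INT;                  /* Named token: return its code. */
    }

  /* Character literal token: the character code is the token code. */
  return (unsigned char) *input++;
}
```

Since 0 is what signals end-of-input here, a token declared with
value 0, such as UILEOF above, is naturally read as the end-of-input
token.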

   If the grammar uses literal string tokens, there are two ways that
`yylex' can determine the token type codes for them:

   * If the grammar defines symbolic token names as aliases for the
     literal string tokens, `yylex' can use these symbolic names like
     all others.  In this case, the use of the literal string tokens in
     the grammar file has no effect on `yylex'.

   * `yylex' can find the multi-character token in the `yytname' table.
     The index of the token in the table is the token type's code.
     The name of a multi-character token is recorded in `yytname' with a
     double-quote, the token's characters, and another double-quote.
     The token's characters are not escaped in any way; they appear
     verbatim in the contents of the string in the table.

     Here's code for looking up a token in `yytname', assuming that the
     characters of the token are stored in `token_buffer'.

          for (i = 0; i < YYNTOKENS; i++)
            {
              if (yytname[i] != 0
                  && yytname[i][0] == '"'
                  && ! strncmp (yytname[i] + 1, token_buffer,
                                strlen (token_buffer))
                  && yytname[i][strlen (token_buffer) + 1] == '"'
                  && yytname[i][strlen (token_buffer) + 2] == 0)
                break;
            }

     The `yytname' table is generated only if you use the
     `%token_table' declaration.  *Note Decl Summary::.
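
The lookup can also be tried outside a generated parser.  The sketch
below is only an illustration: the `yytname' contents and the value of
YYNTOKENS are made-up stand-ins for what Bison would emit, and
`lookup_string_token' is a hypothetical helper name.  Note that
strncmp returns 0 on a match, so the test has to compare against 0.

```c
#include <string.h>

/* Made-up stand-ins for the table Bison generates. */
#define YYNTOKENS 5
static const char *const yytname[] =
{
  "$end", "error", "$undefined", "\"<=\"", "\"<\"", 0
};

/* Return the index in yytname of the literal string token whose
   characters are in TOKEN_BUFFER, or -1 if there is none.  */
int
lookup_string_token (const char *token_buffer)
{
  size_t len = strlen (token_buffer);
  int i;

  for (i = 0; i < YYNTOKENS; i++)
    {
      if (yytname[i] != 0
          && yytname[i][0] == '"'
          && strncmp (yytname[i] + 1, token_buffer, len) == 0
          && yytname[i][len + 1] == '"'
          && yytname[i][len + 2] == 0)
        return i;
    }
  return -1;
}
```

With this table, looking up `<=' gives index 3, `<' gives index 4, and
a string that is not in the table gives -1.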



