bug-bison

Re: [bug-bison] Bug in string-valued terminals


From: Tom Roberts
Subject: Re: [bug-bison] Bug in string-valued terminals
Date: Wed, 29 Dec 2010 00:19:36 -0600
User-agent: Thunderbird 2.0.0.14 (Macintosh/20080421)

Hi Joel (et al):

Ok. I decided to not use yytoknum[], as it is undocumented and also requires a funky "#define YYPRINT". Here is the code I am using inside yylex():

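        // Build the keyword map once, from the string-literal tokens
        // in the grammar (yytname[] is emitted because of %token-table).
        static map<string,int> keyword;
        if(keyword.empty()) {
                for(int i=0; i<YYNTOKENS; ++i) {
                        // string-literal tokens are stored with their quotes
                        if(yytname[i][0] != '"') continue;
                        string name(yytname[i]+1);      // drop leading quote
                        name.erase(name.size()-1,1);    // drop trailing quote
                        // yytranslate[] maps a token number returned by
                        // yylex() to an index into yytname[]; search it
                        // backwards to recover the token number for index i
                        for(int j=YYMAXUTOK; j>0; --j) {
                                if(yytranslate[j] == i) {
                                        keyword[name] = j;
                                        break;
                                }
                        }
                }
        }
        // when an identifier is found that is in keyword[],
        // return keyword[name]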
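
For readers who want to try the reverse lookup without a generated parser, here is a minimal sketch. Everything in namespace mock is a hypothetical stand-in for what bison would emit: the token names, the token numbers, and the fact that yytranslate is a std::vector rather than a static array are assumptions made only so the example compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for tables found in a bison-generated .tab.c file.
namespace mock {

const int YYNTOKENS = 6;    // number of terminal symbols (assumed)
const int YYMAXUTOK = 260;  // largest token number yylex() returns (assumed)

// Symbol names; string-literal tokens keep their double quotes.
const char *const yytname[] = {
    "$end", "error", "$undefined", "\"begin\"", "NUMBER", "\"end\"", 0
};

// Maps a token number from yylex() to an index into yytname[].
std::vector<int> make_translate() {
    std::vector<int> t(YYMAXUTOK + 1, 2);  // default: "$undefined"
    t[0] = 0;    // end of input
    t[256] = 1;  // error
    t[258] = 3;  // "begin"
    t[259] = 4;  // NUMBER
    t[260] = 5;  // "end"
    return t;
}

// Same idea as the fragment above: for each quoted name, search
// yytranslate backwards for the token number that maps to it.
std::map<std::string, int> build_keywords(const std::vector<int> &yytranslate) {
    assert(yytranslate.size() == static_cast<std::size_t>(YYMAXUTOK) + 1);
    std::map<std::string, int> keyword;
    for (int i = 0; i < YYNTOKENS; ++i) {
        const char *quoted = yytname[i];
        std::size_t len = std::strlen(quoted);
        if (len < 2 || quoted[0] != '"' || quoted[len - 1] != '"') continue;
        std::string name(quoted + 1, len - 2);  // strip both quotes
        for (int j = YYMAXUTOK; j > 0; --j) {
            if (yytranslate[j] == i) {
                keyword[name] = j;
                break;
            }
        }
    }
    return keyword;
}

}  // namespace mock
```

Calling mock::build_keywords(mock::make_translate()) yields a map holding only the two quoted names; NUMBER is skipped because its name has no quotes.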

This is a C++ file that does #include "sxfread.tab.c" (generated by bison from sxfread.y) inside a namespace. That seemed to me to be a much easier way to interface to C++ than the C++ support in bison [#].

With this approach, this bug becomes just a documentation issue, though it does depend on retaining %token-table (I believe yytranslate[] is essential).

BTW my program is working, and the grammar matched a large and complicated input file, as desired. So despite the documentation issue, I was able to figure it out. The biggest hassle was having to use pointers to C++ string and map (I considered defining YYSTYPE as a polymorphic class, but decided against it).
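
The need for pointers is presumably the usual one: before C++11, a union cannot have members with non-trivial constructors, so a union-based YYSTYPE cannot hold a std::string or std::map by value. A minimal sketch of the pointer approach follows; the type and member names are hypothetical and not taken from sxfread.y.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical semantic-value type; a real one would be named YYSTYPE.
union SemanticValue {
    double number;
    std::string *text;                     // owned by whoever consumes it
    std::map<std::string, double> *table;  // likewise
};

// What a lexer might do: allocate the string and hand over the pointer.
SemanticValue make_text(const char *s) {
    SemanticValue v;
    v.text = new std::string(s);
    return v;
}

// What a grammar action might do: copy the value out and free the pointer.
std::string take_text(SemanticValue v) {
    assert(v.text != 0);
    std::string out(*v.text);
    delete v.text;
    return out;
}
```

The cost is manual ownership: every value created by the lexer must be deleted by exactly one action, which is the hassle described above.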


Tom Roberts

[#] The bison C++ interface seems overly complex -- I am linking into a >2 million-line scientific program with a custom build system, and the many files generated by bison in C++ mode are difficult to deal with; the single .tab.c file is better for me. The build system does not know about sxfread.y, only sxfread.tab.c, so bison is used as a "text editor" rather than a build tool; fortunately the grammar won't change often, if at all (it has only 8 terminals). As it is #include-d inside a namespace, I can still add additional parsers to the program in the future.




Joel E. Denny wrote:
> Hi Tom,
>
> On Sat, 25 Dec 2010, Tom Roberts wrote:
>
> > I want to have the grammar define the keywords as literal strings, so on first
> > call my yylex() will build up its list of keywords by scanning yytname[] for
> > entries beginning with '"'.
>
> As you no doubt know, yytname is requested using %token-table. However, since 2001 (according to our vc log), Bison's TODO has described %token-table as a broken feature that might not be worth keeping. Unfortunately, %token-table originated before my time, and I have no practical experience with it, so it's hard for me to determine the best way forward.
>
> > In both bison-2.4.3/doc/bison.info and on page 84 of
> > http://www.gnu.org/software/bison/manual/bison.pdf , the example code to map a
> > string terminal to the return value from yylex() is incomplete -- it only
> > gives a loop over yytname[], without telling the user what value to return.
>
> I agree that the documentation is unclear here.
>
> > The loop variable is i, and the value that must be returned is yytoknum[i].
> > But yytoknum[] is inside that #ifdef and is not available.
>
> The manual does not document yytoknum, and that's usually a sign it wasn't intended for users.
>
> Does anyone remember exactly how yytname was originally intended to be used?


