bison-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Push parser


From: Odd Arild Olsen
Subject: Push parser
Date: Tue, 9 Mar 2004 09:43:08 +0100
User-agent: KMail/1.5.1

I have rewritten yacc.c into push.c, a push parser.

The skeleton and some examples can be found at www.oao.no/filer/

The changes from yacc.c are concentrated around some spots. Perhaps the 
changes can be merged with yacc.c or push.h structured in a way that allows 
patches for yacc.c to be applied to push.c too?

The main points of the push.c skeleton are described in the README-push file 
inserted below

Odd Arild Olsen
---------------------------------------------

README-push


push.c is a Bison skeleton file for generating a parser which can be
called by a tokenizer. push.c is based on yacc.c and the features are
mostly the same.

yacc.c based parsers only return upon error or end of
parsing (e.g. when reaching the end of an input file). These parsers
call the function yylex to get the next token when needed.

push.c based parsers return to get the next token/yylval. 
The parser must be reentrant and keep its state between calls. All
internal variables of the parser are therefore kept in an external structure.
The prototype of push.c based parsers is

void yyparse(void *p, struct yyuvars *u) 

p is a pointer to the structure of the internal variables, which
users need not know anything about, just keep it around unaltered.

u is a pointer to the user structure:

struct yyuvars {
        int instance;
        int yyresult;
        int token;
        YYSTYPE yylval;
        YYLTYPE yylloc;
};

The instance is a number which increments for each created structure,
 kind of instance serial number. yyresult is the yyparse
return value, set to GIVE_ME_TOKEN (=4) when a new
token must be supplied. Other return values have the same cause as for 
yacc.c-parsers.
yylloc is the standard location structure, only present when
%locations is specified. 

Memory blocks for the structures are allocated and initialized by the two
functions

void * yypvarsinit(void)
struct yyuvars * yyuvarsinit(void)

The user must call these two to get one set of structures for each parser
instance.

The grammar file has the same structure and content as a grammar file for
yacc.c. 

However, the main and yylex functions are different from the yacc.c case:

void yylex (struct yyuvars *u, ...) 
{
        u->token= ...
        u->yylval= ...
}
        
int main (void)
{
        void * pv;
        struct yyuvars * uv;
        pv= yypvarsinit ();
        uv= yyuvarsinit ();
        do {
                    yyparse (pv, uv);
                    if(uv->result == GIVE_ME_TOKEN) yylex (uv, ...);
                    else /* error, parser finished ... */
        } while(...);
        free (pv);
        free (uv); 
}


push.c parsers are pure only, %pure_parser has no effect. 

Several grammars may be used in the same program as usual by 
%name-prefix="...". Renames the following names: yyparse, yyerror,
yydebug, yypvars, yyuvars, yypvarsinit, yyuvarsinit. The parser knows
nothing about the lexer function, you may call it whatever you want.

Since the parser do not call the lexer there is no need for user
specified parse parametres. %lex-param / %parse-param are therefore not
effective. 

The error function prototype is

void yyerror(const char *)

However, if %locations is specified it becomes 

void yyerror(stuct yyuvars *, const char *)

to make the location structure available through yyuvars.yylloc

I have tested the push parser using Bison 1.875c on a Linux system. 

This push parser seems to work with regards to reentrancy, several
instances of the same grammar and several grammars. I have not tested
reentrancy when called from different threads, e.g. risking a call
while yyparse is already running. As yyparse do not depend on static local
or global variables I do not belive this could be a problem,
correct me if I'm wrong.

As usual: please contribute to this parser to fix errors, clean it up
and make it more usable.

RPN CALCULATOR
---------------

Here is a Bison input file for the push.c based rpn calculator. It
creates two parser instances and reads statements from two files, one
for each instance. Files are read until an 'x' appears.

/* Reverse polish notation calculator.  */

     %{
#include <math.h>
#include <stdio.h>
#include<stdlib.h>
#define YYSTYPE double
#define YYDEBUG 1
     %}

     %token NUM

     %% /* Grammar rules and actions follow.  */

     input:    /* empty */
             | input line
     ;
     
     line:     '\n'
             | exp '\n'      { fprintf (stderr,"\nResult%d: \t%.10g", 
                                        uv->instance,$1);  
                                        };
     
     exp:      NUM           { $$ = $1;           }
             | exp exp '+'   { $$ = $1 + $2;      }
             | exp exp '-'   { $$ = $1 - $2;      }
             | exp exp '*'   { $$ = $1 * $2;      }
             | exp exp '/'   { $$ = $1 / $2;      }
              /* Exponentiation */
             | exp exp '^'   { $$ = pow ($1, $2); }
              /* Unary minus    */
             | exp 'n'       { $$ = -$1;          }
             | 'x' {YYACCEPT}
     ;
     %%
     
#include <ctype.h>

void
yyerror(const char *str) 
{
} 


void
yylex (struct yyuvars *u, FILE*fp)
{
        int c;
        
        /* Skip white space.  */
        while ((c = fgetc (fp)) == ' ' || c == '\t');
        /* Process numbers.  */
        if (c == '.' || isdigit (c))
        {
                ungetc (c, fp);
                fscanf (fp,"%lf", &u->yylval);
                u->token= NUM;
                return;
        }
        /* Return end-of-input.  */
        if (c == EOF){
                u->token= 0;
                return;
        }
        /* Return a single char.  */
        u->token= c;
        return;
}

int
main (void)
{
        void *pv[2];
        struct yyuvars *uv[2];
        FILE *fp[2];
        int i;
        //yydebug=1;
        for(i=0; i<2; i++) {
                fp[i]= fopen(i==0?"1.dat":"2.dat", "r");
                pv[i]= yypvarsinit();
                uv[i]= yyuvarsinit();
        }
        do {
                for(i=0; i<2; i++) {
                        if(uv[i]->yyresult != 0) {
                                yyparse(pv[i], uv[i]);
                                
                        }
                        if(uv[i]->yyresult == 1) {
                                fprintf(stderr,"\nSyntax error%d",
                                        uv[i]->instance);
                        }
                }
                for(i=0; i<2; i++) {
                        if(uv[i]->yyresult != 0) {
                                yylex(uv[i], fp[i]);
                        }
                }
        } while (uv[0]->yyresult!=0 || uv[1]->yyresult!=0);
        for(i=0; i<2; i++) {
                free(pv[i]);
                free(uv[i]);
                fclose(fp[i]);
        }
        fprintf(stderr, "\n");
        return 0;
}


The data files:

1.dat:

1 4 +
2 4 *
3 6 /
3 6 w
4 4 -
4 4 +
x

2.dat:

1 4 +
4 4 +
4 4 -
3 6 w
3 6 /
2 4 *
1 4 +
x



FILES
-------


The push directory tree contains the skeleton file and some test cases for the
push parser. 

The example files are based on the standard bison examples taken from
the infopages.

rpcalc:
        This is a rpn-calculator. Two instances of the same grammar
        run concurrently to verify 'purity' (reentrant execution).

twograms:
        This example uses two grammars, one rpn and one infix
        calculator run concurrently.
 

INSTALL
-------

Move push.c to the bison skeleton directory, in my case 

/usr/local/share/bison

Go to an example directory and run make, possibly after editing
paths etc in Makefile.

run the executable, the results should be checked against the
dat-files the program eats.


HOW DOES IT WORK?
-----------------

The yyparse function is called once for each new symbol. The state
must therefore be maintained over calls. All local variables were
therefore moved into two structures, one for internal use and one for
user access. The parser is called with these two  structs
as arguments. When a new structure is passed, the function executes
from the top and down until it needs the first symbol. It then
returns. The next time yyparse is called it jumps down to where it
previously returned from. And so on. The return code reflects if the next
symbol is needed, parsing is finished or if a syntax error occurred.

As the parse function does not keep any internal variables from call to
call it is reentrant also in the sense that several instances of the
same grammar may coexist.

------------------------------------------------------------------------
Author: Odd Arild Olsen, oao(*at*)fibula(*dot*)no
These files are not maintained by the Bison maintainers and are not
distributed with Bison. Check http://www.oao.no/filer for the  most recent
version of push-YYYYMMDD.tar.gz







reply via email to

[Prev in Thread] Current Thread [Next in Thread]