bison-patches

Re: push parser documentation


From: Joel E. Denny
Subject: Re: push parser documentation
Date: Fri, 3 Aug 2007 21:36:47 -0400 (EDT)

On Mon, 30 Jul 2007, Bob Rossi wrote:

> BTW, I have trouble writing ChangeLog's for texinfo files. It's more
> difficult to understand what the section is than with c code. Sorry if
> this is all wrong.

I usually reverse search from my editing position for the string "@node".  
In most cases, that seems to be what developers have done here in the 
past.  Unfortunately, I've failed to be consistent on this on many 
occasions.

> +2007-07-30  Bob Rossi  <address@hidden>
> +
> +       * doc/bison.texinfo (Push Decl): Document the push parser.
> +       (Table of Symbols): Ditto.
> +       (Pure Decl): Ditto.
> +       (Declaration Summary): Ditto.

So, these days, I'd change that to "Decl Summary".  (And I just committed 
a patch to fix some of my old entries.)

You also made edits in "Multiple Parsers".

> +       (Push Parser Function, Pull Parser Function, Parser Create Function,
> +       Parser Delete Function): Add new push parser symbols.
> +       (Bison symbols): Document push-parser, push-pull-parser, yypush_parse,
> +       yypull_parse, yypstate_new and yypstate_delete.

"Table of Symbols" instead of "Bison symbols".

> @@ -4519,6 +4522,99 @@
>  You can generate either a pure parser or a nonreentrant parser from any
>  valid grammar.
>  
> address@hidden Push Decl
> address@hidden A Push Parser
> address@hidden push parser
> address@hidden push parser
> address@hidden %push-parser
> +
> +A pull parser is called once with a given amount of input and it

The amount of input isn't necessarily defined at the time of invocation.  
Imagine reading from the console.  Perhaps just drop the "with" phrase.

> takes 
> +control (blocks) until all it's input is completely parsed.  A push parser,

I see taking control as the opposite of blocking.  This explains it ok:

  http://en.wikipedia.org/wiki/Blocking_%28scheduling%29

Perhaps just drop "(blocks)".

Also, "it's" -> "its".

> +on the other hand, is called each time some new input is made available.

How about "some new input" -> "a new token"?

> +A pull parser can parse it's input faster, but it must have all the 
> +input up front and it blocks while it is parsing.  The push parser 
> +generally takes longer to parse the same amount of data, but it can
> +be told about input while the application is receiving it.
> +The push parser is typically useful when the parser is part of a 
> +main event loop and it is important for the event loop to be triggered
> +within a certain time period.  This is often the case with a GUI 
> +application.

I'm afraid I find this paragraph confusing.  Fixing the same problems I 
mentioned above would help.  Alternatively, we could drop it and assume 
your first paragraph is enough of an introduction.

> +
> +Normally, Bison generates a pull parser.  The Bison declaration 
> +@code{%push-parser} says that you want the parser to be a push parser.
> +It looks like this:
> +
> +@example
> +%push-parser
> +@end example
> +
> +When a push parser is selected, Bison will generate some new symbols in
> +the generated parser.  @code{yypstate} is a structure that the generated 
> +parser uses to store the parsers state.  @code{yypstate_new} is the 

"parsers" -> "parser's".

> +function that will create a new parser instance.  @code{yypstate_delete}
> +will free the resources associated with the corresponding parser instance.
> +Finally, @code{yypush_parse} is the function that should be called whenever a
> +token is available to provide the parser.  A trivial example
> +of using a push parser would look like this:
> +
> +@example
> +int yystatus;
> +struct yypstate *yyps = yypstate_new ();
> +do @{
> +  yychar = yylex ();
> +  yystatus = yypush_parse (yyps);
> +@} while (yystatus == YYPUSH_MORE);
> +yypstate_delete (yyps);
> +@end example

It's not necessary to use the keyword "struct" when declaring a yypstate.

> +
> +It is acceptable to have many parser instances, of the same type of parser,
> +in memory at the same time.  However, in order to ensure that the parsers
> +are reentrant, you must provide the @code{pure-parser} declaration in the
> +grammar.

It sounds like you're saying it's fine to have those multiple parsers even 
without %pure-parser.  I believe that's true except that yynerrs is shared 
and so its value couldn't be trusted in that scenario.
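
To make the yynerrs concern concrete, here is a minimal sketch, not the generated code. All demo_* names are hypothetical stand-ins; the real yypstate layout is not shown in this thread. It contrasts a single shared error counter (the impure case) with a counter stored in each parser instance (the pure case):

```c
/* Impure style: one error counter shared by every parser instance
   (stand-in for a global yynerrs).  */
static int demo_shared_nerrs = 0;

/* Pure style: each instance carries its own counter (stand-in for
   yynerrs stored in the parser state).  */
typedef struct demo_pure_pstate
{
  int nerrs;
} demo_pure_pstate;

/* Report an error in impure mode: the caller cannot tell which
   instance the error belonged to.  */
void
demo_impure_report_error (void)
{
  demo_shared_nerrs++;
}

/* Report an error in pure mode: only the given instance is affected.  */
void
demo_pure_report_error (demo_pure_pstate *ps)
{
  ps->nerrs++;
}
```

With two instances alive, errors reported through demo_impure_report_error all land in the same counter, while demo_pure_report_error keeps the counts separate.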

The only reason I implemented impure push parsers is backward 
compatibility with the Yacc pull mode interface.  Other than that, I can't 
see why anyone would want impure mode.  I think the documentation should 
strongly discourage its usage in new push parsers.  Perhaps it should even 
start with your pure example and only mention impure mode later as an 
explanation of what happens when you drop the %pure-parser directive.

Moreover, to keep users safe, I wonder if impure push mode should have a 
global variable that counts yypstate instances.  If yypstate_new detects 
more than 1 instance, it should invoke yyerror with a message about 
%pure-parser and then return NULL.
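
A minimal sketch of that guard, under stated assumptions: demo_pstate, demo_pstate_new, demo_pstate_delete, and demo_error are hypothetical stand-ins for yypstate, yypstate_new, yypstate_delete, and yyerror, and the message text is only illustrative.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the generated parser state.  */
typedef struct demo_pstate
{
  int unused;
} demo_pstate;

/* Number of live instances.  A global is acceptable here because
   impure mode already relies on shared globals.  */
static int demo_pstate_count = 0;

/* Stand-in for yyerror.  */
static void
demo_error (char const *msg)
{
  fprintf (stderr, "%s\n", msg);
}

/* Refuse to create a second instance while one is still alive.  */
demo_pstate *
demo_pstate_new (void)
{
  demo_pstate *ps;
  if (demo_pstate_count >= 1)
    {
      demo_error ("multiple push parser instances require %pure-parser");
      return NULL;
    }
  ps = malloc (sizeof *ps);
  if (ps)
    {
      ps->unused = 0;
      demo_pstate_count++;
    }
  return ps;
}

/* Release an instance and make room for a new one.  */
void
demo_pstate_delete (demo_pstate *ps)
{
  if (ps)
    {
      free (ps);
      demo_pstate_count--;
    }
}
```

The counter is decremented on delete so that sequential use (create, delete, create again) keeps working; only simultaneous instances are rejected.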

What do you think?

> @xref{Pure Decl, ,A Pure (Reentrant) Parser}.  When this
> +is done, the @code{yychar} variable becomes a local variable in the 
> address@hidden function.  In order to allow the the next token to be 
> +passed to the @code{yypush_parse} function, its signature is changed to 
> +accept the next token as a parameter.  A reentrant push parser example 
> +would thus look like this:
> +
> +@example
> +int yystatus;
> +struct yypstate *yyps = yypstate_new ();
> +do @{
> +  yystatus = yypush_parse (yyps, yylex ());
> +@} while (yystatus == YYPUSH_MORE);
> +yypstate_delete (yyps);
> +@end example
> +
> +That's it. Simply pass the next token into the @code{yypush_parse} function
> +as a parameter.
> +
> +Bison also supports both the push parser interface along with the pull parser
> +interface in the same generated parser.  In order to get this functionality,
> +you should provide the grammar with the @code{%push-pull-parser} declaration.

How about "you should provide the grammar" -> "you should replace the 
@code{%push-parser} declaration"?

> +Doing this will create all of the symbols mentioned earlier along with the 
> +two extra symbols, @code{yyparse} and @code{yypull_parse}.  @code{yyparse} 
> +can be used exactly as it normally would be used.  However, the user should
> +not that it is implemented in the generated parser by calling 

"not" -> "note".

> address@hidden  This makes the @code{yyparse} function that is generated 
> +with the @code{%push-pull-parser} declaration slower than the normal 
> address@hidden function.  If the user calls the @code{yypull_parse} function
> +it will parse the rest of the input stream.  It is possible to
> +yypush_parse tokens to select a subgrammar and then yypull_parse the rest
> +of the input stream.  If you would like to switch back and forth between
> +between parsing styles, you would have to write your own yypull_parse function
> +that knows when to quit looking for input.

Some example invocations of these functions would help too.
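
As a rough illustration of the kind of example meant here, the sketch below uses a hand-written toy in place of a Bison-generated parser. Every demo_* name and status value is hypothetical; the toy "grammar" just sums integer tokens until a 0 token ends the input. It shows one token being pushed by the caller and the rest of the stream then being pulled:

```c
#include <stddef.h>

/* Hypothetical status codes; the generated parser's values may differ.  */
enum { DEMO_DONE = 0, DEMO_PUSH_MORE = 1 };

/* Toy parser state: the running sum of the tokens seen so far.  */
typedef struct demo_pstate
{
  int sum;
} demo_pstate;

/* Toy token stream; 0 marks end of input.  */
static int const demo_input[] = { 3, 4, 5, 0 };
static size_t demo_pos = 0;

/* Stand-in for yylex: return the next token, staying on 0 at the end.  */
static int
demo_lex (void)
{
  int tok = demo_input[demo_pos];
  if (tok != 0)
    demo_pos++;
  return tok;
}

/* Stand-in for a pure yypush_parse: consume exactly one token.  */
int
demo_push_parse (demo_pstate *ps, int token)
{
  if (token == 0)
    return DEMO_DONE;
  ps->sum += token;
  return DEMO_PUSH_MORE;
}

/* Stand-in for yypull_parse: keep pushing tokens from the lexer
   until the parser no longer asks for more.  */
int
demo_pull_parse (demo_pstate *ps)
{
  int status;
  do
    status = demo_push_parse (ps, demo_lex ());
  while (status == DEMO_PUSH_MORE);
  return status;
}
```

A caller would push the first token itself with demo_push_parse (&ps, demo_lex ()), check for DEMO_PUSH_MORE, and then hand control over with demo_pull_parse (&ps) to consume the remainder.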

> +
> +Adding the @code{pure-parser} declaration does exactly the same thing to the 
> +generated parser with @code{%push-pull-parser} as it did for 
> address@hidden
> +
> +When the @code{%push-parser} or @code{%push-pull-parser} declaration is used
> +then it is important to understand that all references to the @code{yyparse}
> +function in this manual corresponds to the @code{yypush_parser} function 
> +unless otherwise stated.

I'm afraid I don't understand what you mean in that last paragraph.  Can 
we drop it?

> +
>  @node Decl Summary
>  @subsection Bison Declaration Summary
>  @cindex Bison declaration summary
> @@ -4797,10 +4893,12 @@
>  in C parsers
>  is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs},
>  @code{yylval}, @code{yychar}, @code{yydebug}, and
> -(if locations are used) @code{yylloc}.  For example, if you use
> address@hidden "c_"}, the names become @code{c_parse}, @code{c_lex},
> -and so on.  In C++ parsers, it is only the surrounding namespace which is
> -named @var{prefix} instead of @samp{yy}.
> +(if locations are used) @code{yylloc}.  If you use a push parser, 
> +yypush_parse, yypull_parse, yypstate, yypstate_new and yypstate_delete will 

Put these symbol names in @code{...}.

> address@hidden Push Parser Function
> address@hidden The Push Parser Function @code{yypush_parse}
> address@hidden yypush_parse
> +
> +You call the function @code{yypush_parse} to parse a single token.  This 
> +function is available if either the @code{%push-parser} or 
> address@hidden declaration is used.  

You dropped the "%" in "%push-pull-parser" here and in several places 
below.

> address@hidden Decl, ,A Push Parser}.
> +
> address@hidden int yypush_parse (yypstate *yyps)
> +The value returned by @code{yypush_parse} is the same as for yyparse with 
> the 
> +following exception.  @code{yypush_parse} will return YYPUSH_MORE if more 
> input
> +is required to finish parsing the grammar.
> address@hidden deftypefun
> +
> address@hidden Pull Parser Function
> address@hidden The Pull Parser Function @code{yypull_parse}
> address@hidden yypull_parse
> +
> +You call the function @code{yypull_parse} to parse the rest of the input 
> +stream.  This function is available if either the @code{%push-parser} or 
> address@hidden declaration is used.  

yypull_parse is not available for %push-parser.

> address@hidden Decl, ,A Push Parser}.
> +
> address@hidden int yypull_parse (yypstate *yyps)
> +The value returned by @code{yypull_parse} is the same as for yyparse.

"yyparse" -> "@code{yyparse}".

> address@hidden deftypefun
> +
> address@hidden Parser Create Function
> address@hidden The Parser Create Function @code{yystate_new}
> address@hidden yypstate_new
> +
> +You call the function @code{yypstate_new} to create a new parser instance.  
> +This function is available if either the @code{%push-parser} or 
> address@hidden declaration is used.  
> address@hidden Decl, ,A Push Parser}.
> +
> address@hidden yypstate *yypstate_new (void)
> +The fuction will return a valid parser instance if there was memory available
> +or NULL if no memory was avialable.

Typo in 2nd "available".

> address@hidden deftypefun
> +
> address@hidden Parser Delete Function
> address@hidden The Parser Delete Function @code{yystate_delete}
> address@hidden yypstate_delete
> +
> +You call the function @code{yypstate_delete} to delete a parser instance.  
> This 
> +function is available if either the @code{%push-parser} or 
> address@hidden declaration is used.  
> address@hidden Decl, ,A Push Parser}.
> +
> address@hidden void yypstate_delete (yypstate *yyps)
> +This function will reclaim the memory associate with a parser instance.  
> After

"associated".

> +this call, you shoul no longer attempt to use the parser instance.

"should".

> address@hidden deftypefun
>  
>  @node Lexical
>  @section The Lexical Analyzer Function @code{yylex}
> @@ -9279,6 +9439,16 @@
>  @xref{Pure Decl, ,A Pure (Reentrant) Parser}.
>  @end deffn
>  
> +@deffn {Directive} %push-parser
> +Bison declaration to request a push parser.
> +@xref{Push Decl, ,A Push Parser}.
> +@end deffn
> +
> +@deffn {Directive} %push-pull-parser
> +Bison declaration to request a push and a pull parser.
> +@xref{Push Decl, ,A Push Parser}.
> +@end deffn
> +

The section "@node Decl Summary" should probably list these too since it 
claims to list all Bison declarations.

>  @deffn {Directive} %require "@var{version}"
>  Require version @var{version} or higher of Bison.  @xref{Require Decl, ,
>  Require a Version of Bison}.
> @@ -9453,7 +9623,8 @@
>  
>  @deffn {Variable} yynerrs
>  Global variable which Bison increments each time it reports a syntax error.
> -(In a pure parser, it is a local variable within @code{yyparse}.)
> +(In a pure parser, it is a local variable within @code{yyparse}. In a 
> +push parser, it is a member of yypstate.)

yynerrs is a member of yypstate only in a pure push parser.

>  @xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
>  @end deffn
>  
> @@ -9462,6 +9633,31 @@
>  parsing.  @xref{Parser Function, ,The Parser Function @code{yyparse}}.
>  @end deffn
>  
> address@hidden {Function} yypush_parse
> +The parser function produced by Bison in push mode; call this function to 
> +parse a single token.  @xref{Push Parser Function, ,The Push Parser Function 
> address@hidden
> address@hidden deffn
> +
> address@hidden {Function} yypull_parse
> +The parser function produced by Bison in push mode; call this function to 
> +parse the rest of the input stream.  
> address@hidden Parser Function, ,The Pull Parser Function 
> address@hidden
> address@hidden deffn
> +
> address@hidden {Function} yypstate_new
> +The function to create a parser instance, produced by Bison in push mode; 
> +call this function to create a new parser.
> address@hidden Create Function, ,The Parser Create Function 
> @code{yypstate_new}}.
> address@hidden deffn
> +
> address@hidden {Function} yypstate_delete
> +The function to delete a parser instance, produced by Bison in push mode; 
> +call this function to delete the memory associate with a parser.
> address@hidden Delete Function, ,The Parser Delete Function 
> @code{yypstate_delete}}.
> address@hidden deffn
> +
>  @deffn {Macro} YYPARSE_PARAM
>  An obsolete macro for specifying the name of a parameter that
>  @code{yyparse} should accept.  The use of this macro is deprecated, and

Alphabetize these entries.

Thanks for all this work.  I hope my comments don't put you out.



