bison-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Help with benchmarks on termination


From: Akim Demaille
Subject: Help with benchmarks on termination
Date: Sun, 23 Feb 2020 19:10:56 +0100

In a branch I'm working on, the automaton may have several final states.  This 
is a problem for the parsing termination (YYACCEPT) which currently reads in 
yacc.c:

> /*--------------------------------------------------------------------.
> | yysetstate -- set current state (the top of the stack) to yystate.  |
> `--------------------------------------------------------------------*/
> yysetstate:
>   YYDPRINTF ((stderr, "Entering state %d\n", yystate));
> [...]
> 
>   if (yystate == YYFINAL)
>     YYACCEPT;
> 
>   goto yybackup;

I would need to define several YYFINAL and check them all.  It would be 
inelegant, and require many changes.

So I tried another approach: instead of specifying the list of accepting 
states, just recognize that "accepting" means "reducing to $accept" (which is 
the real technical start symbol of the grammar recognized by the automaton).  
There are several places to do that, but in order to avoid establishing a LAC 
context uselessly right before accepting, a nice place seems to be right before 
going to yyreduce:

> @@ -1732,9 +1735,11 @@ yyread_pushed_token:]])[
>    if (yyn <= 0)
>      {
>        if (yytable_value_is_error (yyn))
> -        goto yyerrlab;]b4_lac_if([[
> +        goto yyerrlab;
> +      yyn = -yyn;
> +      if (yyr1[yyn] == YYNTOKENS)
> +        YYACCEPT;]b4_lac_if([[
>        YY_LAC_ESTABLISH;]])[
> -      yyn = -yyn;
>        goto yyreduce;
>      }
>  
> @@ -1763,7 +1768,9 @@ yyread_pushed_token:]])[
>  yydefault:
>    yyn = yydefact[yystate];
>    if (yyn == 0)
> -    goto yyerrlab;
> +    goto yyerrlab;
> +  else if (yyr1[yyn] == YYNTOKENS)
> +    YYACCEPT;
>    goto yyreduce;

Technically, $accept is the first nonterminal, so its symbol number is 
YYNTOKENS.  Thus "yyr1[yyn] == YYNTOKENS" really means "the LHS of rule yyn is 
$accept".

I was surprised by the benches: it's faster than checking if yystate is YYFINAL.

    ================= final-state
    BM_parse        10655 ns      10648 ns     262867
    ================= final-state-nofinal
    BM_parse         9949 ns       9945 ns     286108

(This is Google benchmark).  It may surprise that the termination criterion has 
such an impact, but since we check it each time we enter a state, it's quite 
frequent.  And checking right before reducing is less frequent, but requires 
looking up in a table (yyr1) as opposed to just checking against a constant.

But then, for some reason I just don't understand, when using LAC, this 
approach slows down the parser:

    ================= final-state-lac
    BM_parse        13951 ns      13939 ns     203483
    ================= final-state-nofinal-lac
    BM_parse        15114 ns      15107 ns     185246

and when I repeat this, I get consistently the same results.  So I'm puzzled, 
and I don't know how to interpret this.  The benches were run with the 
traditional calculator, maybe this is not relevant enough?

Please, if you have some good benchmark somewhere, try the changes I suggested 
above, and report your mileage.  It would be great if we could speed up the 
parsers by improving this.

In any case, I am very interested by bench results.  To reproduce my 
experiments, please find attached:
- the patch that installs the alternative ending controlled by -Dnofinal 
(currently available as V1 here: https://github.com/akimd/bison/tree/multifinal)
- a benchmark case (a *.y file), and
- a Makefile to run the benches.

You'll have to edit the Makefile to fix the paths to Bison, CXX, etc.  Then 
"make clean bench".

If you have better ideas on how to deal with termination, I'm all ears!

Thanks in advance.


Attachment: nofinal.patch
Description: Binary data

Attachment: final-state.y
Description: Binary data

Attachment: Makefile
Description: Binary data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]