RFC: Introduce api.token.raw
From: Akim Demaille
Subject: RFC: Introduce api.token.raw
Date: Sun, 1 Sep 2019 18:41:13 +0200
The NEWS excerpt should be a good summary of the purpose of this
series of patches:
*** Variable api.token.raw: Optimized token numbers (all skeletons)
In the generated parsers, tokens have two numbers: the "external" token
number as returned by yylex (which starts at 257), and the "internal"
symbol number (which starts at 3). Each time yylex is called, a table
lookup maps the external token number to the internal symbol number.
When the %define variable api.token.raw is set, tokens are assigned their
internal number, which saves one table lookup per token, and also saves
the generation of the mapping table.
The gain is typically moderate, but in extreme cases (very simple user
actions), a 10% improvement can be observed.
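To make the two numberings concrete, here is a minimal sketch in C of what the lookup amounts to. It is not the code Bison generates: the names (ext_to_sym, translate_external, translate_raw) and the two-token grammar are hypothetical, and only the starting values 257 and 3 come from the NEWS entry above.

```c
/* Hypothetical numbering for a two-token grammar (NUM, PLUS).
   External numbers start at 257 and internal ones at 3, as described
   in the NEWS entry; everything else here is illustrative. */
enum { EXT_NUM = 257, EXT_PLUS = 258 };
enum { SYM_EOF = 0, SYM_UNDEF = 2, SYM_NUM = 3, SYM_PLUS = 4 };

/* Mapping table, indexed by (external number - 257). */
static const unsigned char ext_to_sym[] = { SYM_NUM, SYM_PLUS };

/* Default mode: one table lookup per token returned by the scanner. */
int
translate_external (int ext)
{
  if (ext <= 0)
    return SYM_EOF;
  if (ext < EXT_NUM || ext > EXT_PLUS)
    return SYM_UNDEF;
  return ext_to_sym[ext - EXT_NUM];
}

/* Raw mode: the scanner already returns the internal number, so
   neither the table nor the lookup is needed. */
int
translate_raw (int sym)
{
  return sym;
}
```

In this sketch, translate_external (EXT_PLUS) and translate_raw (SYM_PLUS) yield the same symbol number; the raw variant simply skips the table.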
I would really appreciate feedback on this. It is currently
available on both my GitHub account and the official Bison repo, in
the branch 'raw'.
I used "api.token.raw", but maybe another name would be better. Any
ideas? The name is based on the stillborn %raw directive.
Suggestions for improving the documentation are most welcome.
To benchmark, I used the following simple.y and Makefile, and compared
the means of './noraw --benchmark_repetitions=10' vs './raw
--benchmark_repetitions=10'.
--------------------------------------------------
%{
#include <stdio.h> /* For printf, etc. */
# include <stdlib.h> /* malloc. */
int yyparse ();
# define YYDEBUG 1
int yylex (void);
void yyerror (char const *);
#include <benchmark/benchmark.h>
%}
%union {
  int val;
}
%token <val> NUM "number"
%token
  PLUS  "+"
  MINUS "-"
  STAR  "*"
  SLASH "/"
  LPAR  "("
  RPAR  ")"
%nterm <val> exp
%left "+" "-"
%left "*" "/"
%%
exp
  : exp "+" exp { $$ = $1 + $3; }
  | exp "-" exp { $$ = $1 - $3; }
  | exp "*" exp { $$ = $1 * $3; }
  | exp "/" exp { $$ = $1 / $3; }
  | "(" exp ")" { $$ = $2; }
  | "number"    { $$ = $1; }
%%
const char* input;
int
yylex (void)
{
  int c = *input++;
  switch (c)
    {
    case '0':
    case '1':
    case '2':
    case '3':
    case '4':
    case '5':
    case '6':
    case '7':
    case '8':
    case '9':
      yylval.val = c - '0';
      return NUM;
    case '+': return PLUS;
    case '-': return MINUS;
    case '*': return STAR;
    case '/': return SLASH;
    case '(': return LPAR;
    case ')': return RPAR;
    case 0: return 0;
    default: return YYUNDEFTOK;
    }
}
void
yyerror (char const *s)
{
  fprintf (stderr, "%s\n", s);
}
// Benchmark: parse the same fixed expression on every iteration.
static void BM_parse (benchmark::State& state)
{
  while (state.KeepRunning())
    {
      input =
"1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1\0";
      yyparse ();
    }
}
BENCHMARK(BM_parse);
BENCHMARK_MAIN();
--------------------------------------------------
BISON = /Users/akim/src/gnu/bison/_build/9d/tests/bison

all: noraw raw

noraw.cc: simple.y
	$(BISON) $< -o $@

raw.cc: simple.y
	$(BISON) -Dapi.token.raw $< -o $@

%: %.cc
	$(CXX) $(CXXFLAGS) -O2 $< -o $@ -lbenchmark

clean:
	rm -f noraw.cc noraw raw.cc raw
--------------------------------------------------
Akim Demaille (10):
style: tidy yacc.c
api.token.raw: implement
api.token.raw: check it
api.token.raw: apply to the other skeletons
api.token.raw: cannot be used with character literals
api.token.raw: document it
parser: use api.token.raw
regen
d: handle eof in yytranslate
java: handle eof in yytranslate
NEWS | 13 +++
TODO | 6 +-
data/skeletons/bison.m4 | 3 +-
data/skeletons/c++.m4 | 6 +-
data/skeletons/glr.c | 6 +-
data/skeletons/lalr1.d | 37 +++----
data/skeletons/lalr1.java | 33 +++---
data/skeletons/yacc.c | 67 ++++++------
doc/bison.texi | 36 +++++++
src/parse-gram.c | 117 +++++++++------------
src/parse-gram.h | 118 ++++++++++-----------
src/parse-gram.y | 20 +++-
tests/input.at | 40 +++++++-
tests/javapush.at | 1 +
tests/local.at | 9 +-
tests/local.mk | 1 +
tests/scanner.at | 211 ++++++++++++++++++++++++++++++++++++++
tests/testsuite.at | 3 +
18 files changed, 515 insertions(+), 212 deletions(-)
create mode 100644 tests/scanner.at
--
2.23.0