bison-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Python support: new switch simulation


From: Dennis Heimbigner
Subject: Python support: new switch simulation
Date: Mon, 09 Sep 2013 14:43:28 -0600
User-agent: Thunderbird 2.0.0.24 (Windows/20100228)

Figured out a new way to simulate a switch for the user
defined actions. This replaces the current
if...elif... simulation
>From 7e05f47b804f09a5210045f3cb3729cb6d4b7429 Mon Sep 17 00:00:00 2001
From: dmh <address@hidden>
Date: Mon, 9 Sep 2013 14:30:52 -0600
Subject: [PATCH] Python: New switch simulation

Figured out a new way to simulate a switch for the user
defined actions. This replaces the current
if...elif... simulation.

The basic idea is that each user action is embedded in a
function call yycase_<n> where <n> is the associated yyn
index in yyaction(). A dictionary is setup where each
key,value pair is {<n>, yycase_<n>}.  This allows the
yyaction() procedure to use yyn to probe the dictionary and
execute the corresponding yycase_<n> function.  The
complication is that we need to pass out of the yycase_
function, the yyval value produced by the user's action
code.  This relies on the fact that passing a mutable object
into a function allows the body of that function to modify
the mutable object.  So, the yyval value in yyaction is
wrapped in a mutable class instance YYVAL with a single
field called yyval.  The class instance instance is passed
into the yycase_ procedure, extracted for use by the action
code, and then any modified yyval is re-stored into the
mutable class object.  the yyaction() procedure then
extracts it and used it as its yyval value.

Specific changes:
* data/python.m4: Modified b4_case to support new switch simulation,
* data/lalr1.py: implement the new switch simulation
* doc/bison.texi: Documented the new switch simulation.
---
 data/lalr1.py  | 40 +++++++++++++++----------
 data/python.m4 | 17 ++++++++---
 doc/bison.texi | 95 +++++++++++++++++++++++++++++++++++++++++++++++++---------
 3 files changed, 119 insertions(+), 33 deletions(-)

diff --git a/data/lalr1.py b/data/lalr1.py
index 5b14e23..66b51de 100644
--- a/data/lalr1.py
+++ b/data/lalr1.py
@@ -379,6 +379,17 @@ class Lexer :
 ]b4_python_indent(b4_percent_code_get([lexer]),2)[
 ]])[
 
+# The yyaction functions now all become global 
+# We also need a class to act as a call by reference
+# return for yyval.
+class YYVAL:
+  def __init__(self,yyval): self.yyval = yyval
+#end YYVAL
+
+yyswitch = {}
+
+]b4_user_actions[
+
 ##################################################
 # Primary Parser Class
 ##################################################
@@ -455,7 +466,7 @@ class 
]b4_parser_class_name[]b4_percent_define_get3([extends], [(], [)])[ :
   ##################################################
 
   # For python, pass in the yyerror function
-  # so simplify access so the caller does not need to prefix it.
+  # to simplify access so the caller does not need to prefix it.
   def yyaction (self, yyn, yystack, yylen, yyerror) :
     yylval = None
     ]b4_locations_if([[yyloc = self.yylocation (yystack, yylen)]])[
@@ -472,24 +483,23 @@ class 
]b4_parser_class_name[]b4_percent_define_get3([extends], [(], [)])[ :
 
     self.yy_reduce_print (yyn, yystack)
 
-    # Simulate a switch in python using if ... elif ... else .
-    # This is inefficient, but until python adds a true switch
-    # and in light of the fact that one cannot do assignments
-    # in lambda expression, this seems to be the best solution available.
-    # The performance cost is potentially horrendous.
-    # Advice on an alternative that allows for assignment would be welcome.
-    # Note that the user_actions gives no clue about if the case call
-    # is the first or not, so we need to fake an initial false call.
-
+    # Simulate a switch in python using a dictionary
+    # that maps states to functions defining the user defined
+    # actions. (See just be for the YYParser class definition
+    # above). This depends on the fact that passing mutable objects
+    # into a function allows the function to modify that object.
     # Note that the action body is indentation sensitive
 
-    if False :
+    if (yyn in yyswitch) :
+      action = yyswitch[yyn]
+      yyvalp = YYVAL(yyval)
+      status = action(yyvalp, yyn, yystack, yylen, yyerror]b4_locations_if(
+[, yyloc])[)
+      yyval = yyvalp.yyval
+      if(status != None) : return status
+    else: # no such action index; ignore
       pass
 
-]b4_user_actions[
-
-    else: pass
-
     self.yy_symbol_print ("-> $$ =",
                           yyr1_[yyn], yyval]b4_locations_if([, yyloc])[)
 
diff --git a/data/python.m4 b/data/python.m4
index 2c64a56..88872b7 100644
--- a/data/python.m4
+++ b/data/python.m4
@@ -228,13 +228,22 @@ m4_define([b4_integral_parser_table_define],
 # -----------------
 # This is complicated for python because
 # each line of the body of the action 
-# must be indented more than the 4 blanks
-# that precede the elif line.
+# must be indented more than the 2 blanks
+# because it is now inside a try statement.
 # Before that, however, any braces must be
 # elided.
 
-m4_define([b4_case], [    elif yyn == [$1] :
-b4_python_indent(b4_debrace($2),6)])
+m4_define([b4_case],
+[[def yycase_$1 (yyvalp, yyn, yystack, yylen, yyerror]b4_locations_if(
+[, yyloc])[):
+  yyval = yyvalp.yyval
+  try :
+]b4_python_indent(b4_debrace($2),4)[
+  finally :
+    yyvalp.yyval = yyval
+#end yycase_$1
+
+yyswitch[$1] = yycase_$1]])
 
 ## ------------------------- ##
 ## Assigning token numbers.  ##
diff --git a/doc/bison.texi b/doc/bison.texi
index 1932c48..47d2aaa 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -12242,7 +12242,7 @@ There are, however, some heuristics that can help.
 is at indent level zero (0).
 @item Single line actions (e.g. @address@hidden@}}) should cause no problems.
 @item Multiple line actions can be indented as desired, but note
-that every line of the action will automatically  be indented by 6 blanks.
+that every line of the action will automatically  be indented by 4 blanks.
 @end itemize
 
 In python code, line breaks can also have syntactic and semantic
@@ -12664,12 +12664,9 @@ appear in an action.  The actual definition of these 
symbols is
 opaque to the Bison grammar, and it might change in the future.  The
 only meaningful operation that you can do, is to return them.
 @xref{Python Action Features}.
-
 Note that of these three symbols, only @code{YYACCEPT} and
 @code{YYABORT} will cause a return from the @code{yyparse}
address@hidden parsers include the actions in a separate
-method than @code{yyparse} in order to have an intuitive syntax that
-corresponds to these C macros.}.
+method.
 
 @item
 Python is dynamically typed, so @code{%union}, @code{%type},
@@ -12680,15 +12677,6 @@ do some type checking.  See @ref{Python Semantic 
Values} and
 @ref{Python Action Features}.
 
 @item
-Python does not (yet) contain a @code{switch} statement.
-Instead, a sequence of @code{if ... elif...else} statements
-is used to simulate a @code{switch}.  This can have significant
-performance consequences for large grammars with many states.
-It should be noted that a dictionary mapping states to
address@hidden statments does not work because the action code
-modified variables outside the action.
-
address@hidden
 Python supports exceptions, but there is no way to specify the exceptions
 raised by a function. So, any exception related directive is ignored.
 
@@ -12723,6 +12711,85 @@ be used to define other classes used by the parser.
 The epilogue code is the last code in the produced python parser file.
 @end itemize
 
+The most important difference concerns how user defined
+actions are handled. In C,C++, and Java, a switch statement
+is used.
+Python, however,  does not (yet) contain a @code{switch} statement.
+It is simulated in a somewhat clumsy fashion as follows.
+The basic idea is that each user action is embedded in a function
+called yycase_<n>, for example, where <n> is the associated yyn index
+in @code{yyaction()}. A dictionary, @code{yyswitch},
+is setup where each key,value pair is
address@hidden<n>, yycase_<n>@}.  This allows the yyaction() procedure to use 
yyn to
+probe the dictionary (in o(1) time)
+and execute the corresponding yycase_<n> function.
+
+The complication is that the user-defined action can modify the
+$$ value in its code. The $$ value is actually a reference to the
+variable @code{yyval}. So, somehow, the final value of $$ must be
+passed out of the user's action code.
+
+In order to make this work, it is necessary to rely on a
+``feature'' of python function calls.  This feature is that
+passing a mutable object into a function as an argument allows
+the body of that function to modify that mutable argument.
+
+So, the solution is to define a python helper class, YYLVAL,
+defined as follows:
address@hidden
+class YYVAL:
+  def __init__(self, yyval) : self.yyval = yyval
address@hidden example
+
+Assuming the yyvalp argument is an instance of  YYVAL,
+we get the following code sequence for a user action.
address@hidden
+def yycase_<n> (yyvalp, yyn, yystack, yylen, yyerror, yyloc) :
+  yyval = yyvalp.yyval
+  try :
+    <user's action code with braces removed and indented 4 spaces>
+  finally :
+    yyvalp.yyval = yyval
+#end yycase_<n>
+
+yyswitch[<n>] = yycase_<n>
address@hidden example
+
+Remember that this function may return something like YYABORT
+if the user code does such a return statement. For this reason,
+the user's code is wrapped in a @code{try:...finally:...}
+to ensure that yyval is always properly stored.
+Also note that if @code{%locations} is not defined,
+the the @code{yyloc} argument will not be present.
+
+The corresponding generic code in @code{yyaction ()}
+in the parser looks like this.
address@hidden
+    if (yyn in yyswitch) :
+      action = yyswitch[yyn]
+      yyvalp = YYVAL(yyval)
+      status = action(yyvalp, yyn, yystack, yylen, yyloc)
+      yyval = yyvalp.yyval
+      if(status != None) : return status
+    else: # no such action index; ignore
+      pass
address@hidden example
+
+Here, the @code{yyswitch} dictionary is probed to see if the
+specified @code{yyn} is defined. Assuming so, then the
+corresponding user code procedure is extracted.
+The current value of yyval is stored in an instance of YYVAL,
+and the user action procedure is invoked with
+all the necessary arguments. The return result from the user action
+procedure is captured. If it is @code{None}, then no return was
+executed and yyaction can continue on. If the return status is
+something like YYABORT, then an immediate return is executed.
+
+One last point, since the user code is in its own procedure
+and is defined at the module level, the code may need to use
+python @code{global} statements to access other module level
+declarations.
+
 @node Python Declarations Summary
 @subsection Python Declarations Summary
 
-- 
1.8.4.rc0.1.g8f6a3e5


reply via email to

[Prev in Thread] Current Thread [Next in Thread]