Re: [RFC] Methods and functions

poke-devel

From:	Jose E. Marchesi
Subject:	Re: [RFC] Methods and functions
Date:	Mon, 04 May 2020 22:17:38 +0200
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
    
    Hmm, allowing to optionally use defmethod instead of defun is an
    interesting idea.
    
Actually, last night I settled my mind and came to what I think is a
much better solution:

Methods are indeed not like functions.
Functions are actually useful inside struct types.
We should allow both.
We should use `method' and not `defmethod'.
The restrictions can be reduced to only three, and very intuitive ones!

The little writeup below summarizes the new approach.
Please let me know what you think.

PS: Of course everything described below is by now fully implemented in
    master ;)

** Understanding Poke Methods

*** The Packet

    First we need to define some structure to use as an example.  Let's say we
    are interested in poking Packets, as defined by the Packet Specification
    1.2 published by the Packet Foundation (no less).

    In a nutshell, each Packet starts with a byte whose value is always 0xab,
    followed by a byte that defines the size of the payload.  A stream of
    bytes making up the payload follows, itself followed by another stream
    of the same number of bytes with "control" values.

    We could translate this description into the following Poke struct type
    definition:

    deftype Packet =
      struct
      {
        byte magic = 0xab;
        byte size;
        byte[size] payload;
        byte[size] control;
      };

    See the Poke manual for details on types, initialization values,
    constraint expressions etc.

    There are some details described in the Packet Specification 1.2 that
    are not covered in this simple definition, but we will attend to those
    later in this article.
      
*** The Process of Building Structs

    Given the definition of a struct type like Packet, there are only two ways
    to build a struct value in Poke.

    One is to _map_ it from some IO space.  This is achieved using the map
    operator:

    (poke) Packet @ 12#B
    Packet {
      magic = 0xab,
      size = 2,
      payload = [0x12UB,0x30UB],
      control = [0x1UB,0x1UB]
    }
   
    The expression above maps a Packet starting at offset 12 bytes, in the
    current IO space.  See the Poke manual for more details on using the map
    operator.

    The second way to build a struct value is to _construct_ one, specifying
    values for some, all or none of its fields.  It looks like this:

    (poke) Packet {size = 2, payload = [1UB,2UB]}
    Packet {
      magic = 0xab,
      size = 2,
      payload = [0x1UB,0x2UB],
      control = [0x0UB,0x0UB]
    }

    In either case, building a struct involves determining the value of all
    the fields of the struct, one by one.  The order in which the struct
    fields are built is determined by the order of appearance of the fields in
    the type description.

    In our example, the value of `magic' is determined first, then `size',
    `payload' and finally `control'.  This is the reason why we can refer to
    the values of previous fields when defining fields, such as in the size of
    the `payload' array above, but not the other way around: by the time
    `payload' is mapped or constructed, the value of `size' has already been
    mapped or constructed.

    What happens behind the curtains is that when poke finds the definition of
    a struct type, like Packet, it compiles two functions from it: a mapper
    function, and a constructor function.  The mapper function gets as
    arguments the IO space and the offset from which to map the struct value,
    whereas the constructor function gets the template specifying the initial
    values for some, or all of the fields; reasonable default values (like
    zeroes) are used for fields for which no initial values have been
    specified.

    These functions, mapper and constructor, are invoked to create fresh
    values when a map operator @ or a struct constructor is used in a Poke
    program, or at the poke prompt.
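
    To make this a bit more tangible, here is a rough model of the two
    functions for the simple Packet type above, written in Python rather
    than Poke.  It is only a sketch under assumptions: the names
    `packet_mapper' and `packet_constructor', the use of a `bytes' object
    as the IO space and of a dict as the struct value are all made up for
    illustration, and are not what poke actually generates.

```python
MAGIC = 0xAB  # value required by the constraint on `magic'


def packet_mapper(ios, offset):
    """Model of the mapper: read the fields, in order, from `ios`."""
    magic = ios[offset]
    if magic != MAGIC:
        raise ValueError("constraint on magic violated")
    size = ios[offset + 1]
    # `size' is already known here, so later fields can depend on it.
    start = offset + 2
    payload = list(ios[start:start + size])
    control = list(ios[start + size:start + 2 * size])
    return {"magic": magic, "size": size,
            "payload": payload, "control": control}


def packet_constructor(template):
    """Model of the constructor: take values from `template`, else defaults."""
    magic = template.get("magic", MAGIC)
    size = template.get("size", 0)
    # Fields missing from the template default to zeroes.
    payload = list(template.get("payload", [0] * size))
    control = list(template.get("control", [0] * size))
    return {"magic": magic, "size": size,
            "payload": payload, "control": control}
```

    In this model, Packet @ 12#B corresponds to packet_mapper (ios, 12)
    and Packet {size = 2, payload = [1UB,2UB]} to a call to
    packet_constructor with a template holding those two fields.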

*** Variables in Struct Types    

    Fields are not the only entity that can appear in the definition
    of a struct type.

    Suppose that after reading more carefully the Packet Specification 1.2
    (which spans several thousand pages) we realize that the field
    `size' doesn't really store the number of elements of the payload and
    control arrays, like we thought initially.  Or not exactly: the Packet
    Foundation says that if `size' has the special value 0xff, then the size
    is zero.

    We could of course do something like this:

    deftype Packet =
      struct
      {
        byte magic = 0xab;
        byte size;

        byte[size == 0xff ? 0 : size] payload;
        byte[size == 0xff ? 0 : size] control;
      };

     However, we can avoid replicating code by using a variable instead:

     deftype Packet =
       struct
       {
         byte magic = 0xab;
         byte size;

         defvar real_size = (size == 0xff ? 0 : size);

         byte[real_size] payload;
         byte[real_size] control;
       };

     Note how the variable can be used after it gets defined.  In the
     underlying process of mapping or constructing the struct, the variable is
     incorporated into the lexical environment.  Once defined, it can be used
     in constraint expressions, array sizes, etc.  We will see more about this
     later.

     Incidentally, it is of course possible to use global variables as well.
     For example:

     defvar packet_special = 0xff;
     deftype Packet =
       struct
       {
         byte magic = 0xab;
         byte size;

         defvar real_size = (size == packet_special ? 0 : size);

         byte[real_size] payload;
         byte[real_size] control;
       };

     In this case, the global `packet_special' gets captured in the lexical
     environment of the struct type (in reality in the lexical environment of
     the implicitly created mapper and constructor functions) in a way that if
     you later modify `packet_special' the new value will be used when
     mapping/constructing _new_ values of type Packet.  Which is really cool,
     but let's not get distracted from the main topic... :)
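
     The capture behaviour can be imitated with an ordinary Python
     function that reads a module-level variable.  Again this is just a
     model under assumptions: `packet_mapper' and `set_packet_special'
     are hypothetical names, and a `bytes' object plays the role of the
     IO space.

```python
packet_special = 0xFF  # plays the role of the Poke global


def set_packet_special(value):
    """Change the global, as a later assignment at the prompt would."""
    global packet_special
    packet_special = value


def packet_mapper(ios, offset):
    """Model of the mapper; looks up `packet_special` on every call."""
    magic = ios[offset]
    size = ios[offset + 1]
    # The global is read when the mapper runs, not when it was defined.
    real_size = 0 if size == packet_special else size
    start = offset + 2
    payload = list(ios[start:start + real_size])
    control = list(ios[start + real_size:start + 2 * real_size])
    return {"magic": magic, "size": size,
            "payload": payload, "control": control}
```

     A value mapped before calling set_packet_special keeps its
     contents; only values mapped afterwards see the new special value.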

*** Functions in Struct Types

    Further reading of the Packet Specification 1.2 reveals that each Packet
    has an additional `crc' field.  The content of this field is derived from
    both the payload bytes and the control bytes.

    But this is no normal CRC we are talking about.  Instead, it is a special
    function developed by the CRC Foundation in partnership with the Packet
    Foundation, called superCRC (patented, TM).

    Fortunately, the CRC Foundation distributes a pickle `supercrc.pk', that
    provides a `calculate_crc' function with the following spec:

      defun calculate_crc = (byte[] data, byte[] control) int:

    So let's use the function like this in our type, after loading the
    supercrc pickle:

      load supercrc;

      deftype Packet =
        struct
        {
          byte magic = 0xab;
          byte size;

          defvar real_size = (size == 0xff ? 0 : size);
 
          byte[real_size] payload;
          byte[real_size] control;

          int crc = calculate_crc (payload, control);
        };
     
     However, there is a caveat: it happens that the calculation of the CRC
     may involve arithmetic and division, so the CRC Foundation warns us that
     the `calculate_crc' function may raise E_div_by_zero.  However, the
     Packet 1.2 Specification tells us that in these situations, the `crc'
     field of the packet should contain zero.  If we used the type above, any
     exception raised by `calculate_crc' would be raised by the
     mapper/constructor, too bad.

     A solution is to use a function that takes care of the extra needed
     logic, wrapping calculate_crc:

     load supercrc;

     deftype Packet =
       struct
       {
         byte magic = 0xab;
         byte size;

         defvar real_size = (size == 0xff ? 0 : size);

         byte[real_size] payload;
         byte[real_size] control;

         defun corrected_crc = int:
         {
           try return calculate_crc (payload, control);
           catch if E_div_by_zero { return 0; }
         }
         
         int crc = corrected_crc;
       };

     Again, note how the function is accessible after its definition.  Note as
     well how fields, variables and other functions can all be used in the
     function body.  Defining variables and functions in struct types is no
     different from defining them inside other functions or in the
     top-level environment: the same lexical rules apply.

*** Methods

     At this point you may be thinking something along the lines of "hey,
     since variables and functions are also members of the struct, I should
     be able to access them the same way as fields, right?".

     So you will want to do:

     (poke) defvar p = Packet @ 12#B
     (poke) p.real_size
     (poke) p.corrected_crc

     But sorry, this won't work.

     To understand why, think about the struct building process we sketched
     above.  The mapper and constructor functions are derived/compiled from
     the struct type.  You can imagine them to have prototypes like:

       Packet_mapper (IOspace, offset) -> Packet value
       Packet_constructor (template)   -> Packet value

     You can also picture the fields, variables and functions in the struct
     type specification as being defined inside the bodies of Packet_mapper
     and Packet_constructor, as their contents get mapped/constructed.  For
     example, let's see what the mapper does:

       Packet_mapper:

         . Map a byte, put it in a local `magic'.
         . Map a byte, put it in a local `size'.
         . Calculate the real size, put it in a local `real_size'.
         . Map an array of real_size bytes, put it in a local `payload'.
         . Map an array of real_size bytes, put it in a local `control'.
         . Compile a function, put it in a local `corrected_crc'.
         . Map an int, call the function in the local `corrected_crc',
           complain if the values are not the same, otherwise put the
           mapped int in a local `crc'.
         . Build a struct value with the values from the locals `magic',
           `size', `payload', `control' and `crc', and return it.

     The pseudo-code for the constructor would be almost identical.  Just
     replace "map" with "construct".

     So you see, both the values for the mapped fields and the values for the
     variables and functions defined inside the struct type end as locals of
     the mapping process, but only the values of the fields are actually put
     in the struct value that is returned in the last step.
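
     The same steps can be written down as a small Python model, which
     makes it easy to see what ends up in the returned value and what
     does not.  This is a sketch under assumptions, not poke's real
     mapper: the body of `calculate_crc' is a made-up stand-in for the
     superCRC pickle, the IO space is a `bytes' object, the struct value
     is a dict, and `crc' is assumed to be a 4-byte big-endian integer.

```python
def calculate_crc(payload, control):
    # Hypothetical stand-in for superCRC; may raise ZeroDivisionError.
    return sum(payload) // sum(control)


def packet_mapper(ios, offset):
    magic = ios[offset]                        # local `magic'
    size = ios[offset + 1]                     # local `size'
    real_size = 0 if size == 0xFF else size    # variable: just a local
    start = offset + 2
    payload = list(ios[start:start + real_size])
    control = list(ios[start + real_size:start + 2 * real_size])

    def corrected_crc():                       # function: also just a local
        try:
            return calculate_crc(payload, control)
        except ZeroDivisionError:
            return 0

    crc_start = start + 2 * real_size
    crc = int.from_bytes(ios[crc_start:crc_start + 4], "big")
    if crc != corrected_crc():
        raise ValueError("constraint on crc violated")

    # Only the fields go into the value that is returned; `real_size'
    # and `corrected_crc' vanish when the mapper finishes.
    return {"magic": magic, "size": size, "payload": payload,
            "control": control, "crc": crc}
```

     Looking up "real_size" or "corrected_crc" in the returned dict
     fails, much like p.real_size and p.corrected_crc fail at the poke
     prompt.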

     This is where methods come into the picture.  A method looks very
     similar to a function, but it is not quite the same thing.  Let me show
     you an example:

     load supercrc;

     deftype Packet =
       struct
       {
         byte magic = 0xab;
         byte size;

         defvar real_size = (size == 0xff ? 0 : size);

         byte[real_size] payload;
         byte[real_size] control;

         defun corrected_crc = int:
         {
           try return calculate_crc (payload, control);
           catch if E_div_by_zero { return 0; }
         }
         
         int crc = corrected_crc;

         method c_crc = int:
         {
           return corrected_crc;
         }
       };
     
     We have added a method `c_crc' to our Packet struct type, that just
     returns the corrected superCRC (patented, TM) of a packet.  This can be
     invoked using dot-notation, after a Packet value is mapped/constructed:

     (poke) defvar p = Packet @ 12#B
     (poke) p.c_crc
     0xdeadbeef

     Now, the important bit here is that the method returns the corrected crc
     _of a packet_.  That is, it actually operates on a packet value.  This
     packet value gets implicitly passed as an argument whenever a method is
     invoked.
  
     We can visualize this with the following "pseudo Poke":

     method c_crc = (Packet SELF) int:
     {
        return SELF.corrected_crc;
     }

     Fortunately, poke takes care to recognize when you are referring to
     fields of this implicit struct value, and does The Right Thing(TM) for
     you.  This includes calling other methods:

     method foo = void: { ... }
     method bar = void:
     {
      [...]
      foo;
     }

     The corresponding "pseudo-poke" being:

     method bar = (Packet SELF) void:
     {
      [...]
      SELF.foo ();
     }

     It is also possible to define methods that modify the contents of struct
     fields, no problemo:

     defvar packet_special = 0xff;

     deftype Packet =
       struct
       {
         byte magic = 0xab;
         byte size;
         [...]

         method set_size = (byte s) void:
         {
           if (s == 0)
             size = packet_special;
           else         
             size = s;
         }
       };

      This is what is commonly known as a "setter".  Note, incidentally, how a
      method can also use regular variables.  The Poke compiler knows when to
      generate an access to a normal variable such as `packet_special', and
      when to generate a set to a SELF field.
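
      For readers who know Python, the distinction is the one Python
      forces you to spell out by hand.  The class below is only an
      analogy under assumptions (a hypothetical `Packet' class with just
      two fields), not a description of how poke implements methods.

```python
packet_special = 0xFF  # an ordinary variable, outside the struct


class Packet:
    def __init__(self, magic=0xAB, size=0):
        self.magic = magic
        self.size = size

    def set_size(self, s):
        # `packet_special' is a plain variable access, whereas `size'
        # is set through the struct value passed in as `self'.  Poke
        # infers the second case; Python makes you write `self.'.
        if s == 0:
            self.size = packet_special
        else:
            self.size = s
```

      Here p.set_size (0) stores 0xff in the `size' field of p, and any
      other argument is stored unchanged.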

*** Three Restrictions

    Given the different nature of the variables, functions and methods, there
    are three restrictions:

    a) Methods can't access variables and functions defined in the struct type.

       This is not gratuitous: when you think about the mapping/construction
       process, you will note these variables and functions are locals to the
       mapper/constructor, which has already been executed at the time we are
       in the position to invoke a method!

       If you try to access a variable or function defined in a struct type
       from within a method, you will get a nice "invalid reference to struct
       {variable,function}" error message from the compiler.

    b) Functions can't set fields defined in the struct type.  This will be
       rejected by the compiler:

       deftype Foo =
        struct
        {
          int field;
          defun wrong = void: { field = 10; }
        };

       Again, remember the construction/mapping process.  When a function
       accesses a field of the struct type like in the example above, it is
       not doing one of these pseudo `SELF.field = 10'.  Instead, it is simply
       updating the value of the local created in this step in Foo_mapper:
   
       Foo_mapper:
       
        . Map an int, put it in a local `field'.
        [...]

       Setting that local would impact the mapping of the subsequent fields if
       they refer to `field' (for example, in their constraint expression) but
       it wouldn't actually alter the value of the field `field' in the struct
       value that is created and returned from the mapper!

       This is very confusing, so we just disallow this with a compiler error
       "invalid assignment to struct field", for your own sanity :)

    c) Methods can't be used in field constraint expressions, nor in variables
       or functions defined in a struct type.

       How could they be?  The field constraint expressions, the
       initialization expressions of variables, and the functions defined in
       struct types are all executed as part of the mapper/constructor and, at
       that time, there is no struct value yet to pass to the method.

       If you try to do this, the compiler will greet you with an "invalid
       reference to struct method" message.