Re: [RFC] Methods and functions
From: Jose E. Marchesi
Subject: Re: [RFC] Methods and functions
Date: Mon, 04 May 2020 22:17:38 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Hmm, optionally allowing defmethod in place of defun is an
interesting idea.
Actually, last night I made up my mind and arrived at what I think is a
much better solution:
- Methods are indeed not like functions.
- Functions are actually useful inside struct types.
- We should allow both.
- We should use `method' and not `defmethod'.
- The restrictions can be reduced to only three, and very intuitive ones!
The little writeup below summarizes the new approach.
Please let me know what you think.
PS: Of course everything described below is by now fully implemented in
master ;)
** Understanding Poke Methods
*** The Packet
First we need to define some structure to use as an example.  Let's say we
are interested in poking Packets, as defined by the Packet Specification
1.2 published by the Packet Foundation (no less).
In a nutshell, each Packet starts with a byte whose value is always 0xab,
followed by a byte that defines the size of the payload.  A stream of
bytes making up the payload follows, itself followed by another
stream of the same number of bytes with "control" values.
We could translate this description into the following Poke struct type
definition:
  deftype Packet =
    struct
    {
      byte magic = 0xab;
      byte size;
      byte[size] payload;
      byte[size] control;
    };
See the Poke manual for details on types, initialization values,
constraint expressions etc.
There are some details described in the Packet Specification 1.2 that are
not covered in this simple definition, but we will attend to those
later in this article.
*** The Process of Building Structs
Given the definition of a struct type like Packet, there are only two ways
to build a struct value in Poke.
One is to _map_ it from some IO space. This is achieved using the map
operator:
  (poke) Packet @ 12#B
  Packet {
    magic = 0xab,
    size = 2,
    payload = [0x12UB,0x30UB],
    control = [0x1UB,0x1UB]
  }
The expression above maps a Packet starting at offset 12 bytes, in the
current IO space. See the Poke manual for more details on using the map
operator.
The second way to build a struct value is to _construct_ one, specifying
values for some, all or none of its fields.  It looks like this:
  (poke) Packet {size = 2, payload = [1UB,2UB]}
  Packet {
    magic = 0xab,
    size = 2,
    payload = [0x1UB,0x2UB],
    control = [0x0UB,0x0UB]
  }
In either case, building a struct involves determining the value of all
the fields of the struct, one by one.  The order in which the struct
fields are built is determined by the order of appearance of the fields in
the type description.
In our example, the value of `magic' is determined first, then `size',
`payload' and finally `control'.  This is the reason why we can refer to
the values of previous fields when defining fields, such as in the size of
the `payload' array above, but not the other way around: by the time
`payload' is mapped or constructed, the value of `size' has already been
mapped or constructed.
What happens behind the curtains is that when poke finds the definition of
a struct type, like Packet, it compiles two functions from it: a mapper
function, and a constructor function. The mapper function gets as
arguments the IO space and the offset from which to map the struct value,
whereas the constructor function gets the template specifying the initial
values for some, or all of the fields; reasonable default values (like
zeroes) are used for fields for which no initial values have been
specified.
These functions, mapper and constructor, are invoked to create fresh
values when a map operator @ or a struct constructor is used in a Poke
program, or at the poke prompt.
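
As a rough mental model only (this is Python, not what poke actually generates), the two functions can be pictured as below.  The names `packet_mapper` and `packet_constructor` are hypothetical, the IO space is simplified to a `bytes` object, and the check on `magic` is left out:

```python
def packet_mapper(ios: bytes, offset: int) -> dict:
    """Build a Packet by reading its fields, in order, from `ios`."""
    magic = ios[offset]
    size = ios[offset + 1]          # must be known before the arrays
    start = offset + 2
    payload = list(ios[start:start + size])
    control = list(ios[start + size:start + 2 * size])
    return {"magic": magic, "size": size,
            "payload": payload, "control": control}


def packet_constructor(template: dict) -> dict:
    """Build a Packet from a template, defaulting unspecified fields."""
    magic = template.get("magic", 0xAB)
    size = template.get("size", 0)
    payload = template.get("payload", [0] * size)   # zeroes by default
    control = template.get("control", [0] * size)
    return {"magic": magic, "size": size,
            "payload": payload, "control": control}


# Twelve filler bytes, then one Packet at offset 12.
ios = bytes(12) + bytes([0xAB, 2, 0x12, 0x30, 0x01, 0x01])
mapped = packet_mapper(ios, 12)
constructed = packet_constructor({"size": 2, "payload": [1, 2]})
```

In both functions `size` is computed before `payload` and `control`, which is why the arrays can depend on it.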
*** Variables in Struct Types
Fields are not the only entity that can appear in the definition
of a struct type.
Suppose that after reading the Packet Specification 1.2 more carefully
(it spans several thousand pages) we realize that the field
`size' doesn't really store the number of elements of the payload and
control arrays, as we initially thought.  Or not exactly: the Packet
Foundation says that if `size' has the special value 0xff, then the size
is zero.
We could of course do something like this:
  deftype Packet =
    struct
    {
      byte magic = 0xab;
      byte size;
      byte[size == 0xff ? 0 : size] payload;
      byte[size == 0xff ? 0 : size] control;
    };
However, we can avoid replicating code by using a variable instead:
  deftype Packet =
    struct
    {
      byte magic = 0xab;
      byte size;
      defvar real_size = (size == 0xff ? 0 : size);
      byte[real_size] payload;
      byte[real_size] control;
    };
Note how the variable can be used after it gets defined. In the
underlying process of mapping or constructing the struct, the variable is
incorporated into the lexical environment. Once defined, it can be used
in constraint expressions, array sizes, etc. We will see more about this
later.
Incidentally, it is of course possible to use global variables as well.
For example:
  defvar packet_special = 0xff;
  deftype Packet =
    struct
    {
      byte magic = 0xab;
      byte size;
      defvar real_size = (size == packet_special ? 0 : size);
      byte[real_size] payload;
      byte[real_size] control;
    };
In this case, the global `packet_special' gets captured in the lexical
environment of the struct type (in reality, in the lexical environment of
the implicitly created mapper and constructor functions), so that if
you later modify `packet_special' the new value will be used when
mapping/constructing _new_ values of type Packet.  Which is really cool,
but let's not get distracted from the main topic... :)
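
The same capture behaviour can be seen in a small Python analogy.  This is a sketch, not poke internals: `real_size` is a hypothetical helper standing in for the computation done inside the mapper/constructor.

```python
packet_special = 0xFF        # plays the role of the global defvar


def real_size(size: int) -> int:
    # `packet_special` is looked up every time the function runs,
    # not once when the function is defined.
    return 0 if size == packet_special else size


before = real_size(0xFF)     # 0xff is the special value at this point
packet_special = 0xFE        # modify the global afterwards
after = real_size(0xFF)      # 0xff is now treated as an ordinary size
```

Values computed before the change are not affected; only later calls see the new `packet_special`.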
*** Functions in Struct Types
Further reading of the Packet Specification 1.2 reveals that each Packet
has an additional `crc' field. The content of this field is derived from
both the payload bytes and the control bytes.
But this is no normal CRC we are talking about. Instead, it is a special
function developed by the CRC Foundation in partnership with the Packet
Foundation, called superCRC (patented, TM).
Fortunately, the CRC Foundation distributes a pickle `supercrc.pk', that
provides a `calculate_crc' function with the following spec:
  defun calculate_crc = (byte[] data, byte[] control) int:
So let's use the function like this in our type, after loading the
supercrc pickle:
  load supercrc;
  deftype Packet =
    struct
    {
      byte magic = 0xab;
      byte size;
      defvar real_size = (size == 0xff ? 0 : size);
      byte[real_size] payload;
      byte[real_size] control;
      int crc = calculate_crc (payload, control);
    };
However, there is a caveat: the calculation of the CRC may involve
division, so the CRC Foundation warns us that the `calculate_crc'
function may raise E_div_by_zero.  The Packet Specification 1.2 tells us
that in these situations the `crc' field of the packet should contain
zero.  If we used the type above, any exception raised by
`calculate_crc' would propagate out of the mapper/constructor.  Too bad.
A solution is to use a function that takes care of the extra needed
logic, wrapping calculate_crc:
  load supercrc;
  deftype Packet =
    struct
    {
      byte magic = 0xab;
      byte size;
      defvar real_size = (size == 0xff ? 0 : size);
      byte[real_size] payload;
      byte[real_size] control;
      defun corrected_crc = int:
      {
        try return calculate_crc (payload, control);
        catch if E_div_by_zero { return 0; }
      }
      int crc = corrected_crc;
    };
Again, note how the function is accessible after its definition.  Note as
well how fields, variables and other functions can all be used in the
function body.  Defining variables and functions in struct types is no
different from defining them inside other functions or in the
top-level environment: the same lexical rules apply.
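
For comparison, here is the same wrapping pattern as a Python closure.  It is only a sketch: `calculate_crc` below is a made-up stand-in that happens to divide (it is NOT the superCRC algorithm), and `make_corrected_crc` is a hypothetical name.

```python
def calculate_crc(data, control):
    """Made-up stand-in: divides, so it can fail like the real one."""
    return sum(data) // sum(control)   # ZeroDivisionError if sum is 0


def make_corrected_crc(payload, control):
    # The inner function sees `payload` and `control` through the
    # enclosing scope, much like a function defined in a struct type
    # sees the fields defined before it.
    def corrected_crc() -> int:
        try:
            return calculate_crc(payload, control)
        except ZeroDivisionError:
            return 0                   # the value the spec asks for
    return corrected_crc


ok = make_corrected_crc([4, 6], [1, 1])()
fallback = make_corrected_crc([4, 6], [0, 0])()
```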
*** Methods
At this point you may be thinking something along the lines of "hey, since
variables and functions are also members of the struct, I should be able
to access them the same way as fields, right?".
So you will want to do:
  (poke) defvar p = Packet @ 12#B
  (poke) p.real_size
  (poke) p.corrected_crc
But sorry, this won't work.
To understand why, think about the struct building process we sketched
above. The mapper and constructor functions are derived/compiled from
the struct type. You can imagine them to have prototypes like:
  Packet_mapper (IOspace, offset) -> Packet value
  Packet_constructor (template) -> Packet value
You can also picture the fields, variables and functions in the struct
type specification as being defined inside the bodies of Packet_mapper
and Packet_constructor, as their contents get mapped/constructed. For
example, let's see what the mapper does:
  Packet_mapper:
  . Map a byte, put it in a local `magic'.
  . Map a byte, put it in a local `size'.
  . Calculate the real size, put it in a local `real_size'.
  . Map an array of real_size bytes, put it in a local `payload'.
  . Map an array of real_size bytes, put it in a local `control'.
  . Compile a function, put it in a local `corrected_crc'.
  . Map an int, call the function in the local `corrected_crc',
    complain if the values are not the same, otherwise put the
    mapped int in a local `crc'.
  . Build a struct value with the values from the locals `magic',
    `size', `payload', `control' and `crc', and return it.
The pseudo-code for the constructor would be almost identical.  Just
replace "map" with "construct".
So you see, both the values for the mapped fields and the values for the
variables and functions defined inside the struct type end up as locals of
the mapping process, but only the values of the fields are actually put
in the struct value that is returned in the last step.
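
A Python rendering of that idea, as a sketch under simplifying assumptions: the IO space is a `bytes` object, the `crc` field is left out to keep it short, and `payload_sum` is a hypothetical local function standing in for `corrected_crc`.

```python
def packet_mapper(ios: bytes, offset: int) -> dict:
    magic = ios[offset]
    size = ios[offset + 1]
    real_size = 0 if size == 0xFF else size      # local variable
    start = offset + 2
    payload = list(ios[start:start + real_size])
    control = list(ios[start + real_size:start + 2 * real_size])

    def payload_sum() -> int:                    # local function
        return sum(payload)

    # Only the fields go into the returned value.  `real_size` and
    # `payload_sum` are gone once the function returns.
    return {"magic": magic, "size": size,
            "payload": payload, "control": control}


p = packet_mapper(bytes([0xAB, 2, 0x12, 0x30, 0x01, 0x01]), 0)
```

Looking up `p["real_size"]` fails for the same reason `p.real_size` does in poke: the name only ever existed inside the mapper.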
This is where methods come into the picture.  A method looks very similar
to a function, but it is not quite the same thing.  Let me show you an
example:
  load supercrc;
  deftype Packet =
    struct
    {
      byte magic = 0xab;
      byte size;
      defvar real_size = (size == 0xff ? 0 : size);
      byte[real_size] payload;
      byte[real_size] control;
      defun corrected_crc = int:
      {
        try return calculate_crc (payload, control);
        catch if E_div_by_zero { return 0; }
      }
      int crc = corrected_crc;
      method c_crc = int:
      {
        return corrected_crc;
      }
    };
We have added a method `c_crc' to our Packet struct type, that just
returns the corrected superCRC (patented, TM) of a packet. This can be
invoked using dot-notation, after a Packet value is mapped/constructed:
  (poke) defvar p = Packet @ 12#B
  (poke) p.c_crc
  0xdeadbeef
Now, the important bit here is that the method returns the corrected crc
_of a packet_.  That is, it actually operates on a packet value.  This
packet value gets implicitly passed as an argument whenever a method is
invoked.
We can visualize this with the following "pseudo Poke":
  method c_crc = (Packet SELF) int:
  {
    return SELF.corrected_crc;
  }
Fortunately, poke takes care to recognize when you are referring to
fields of this implicit struct value, and does The Right Thing(TM) for
you. This includes calling other methods:
  method foo = void: { ... }
  method bar = void:
  {
    [...]
    foo;
  }
The corresponding "pseudo Poke" is:
  method bar = (Packet SELF) void:
  {
    [...]
    SELF.foo ();
  }
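
Readers coming from Python may recognize this: there the SELF argument is simply written out by hand.  The class below is an analogy only, with hypothetical method bodies, not a description of how poke implements methods.

```python
class Packet:
    def __init__(self, payload, control):
        self.payload = payload
        self.control = control

    def foo(self) -> int:
        # Reads a field of the value the method was invoked on.
        return len(self.payload)

    def bar(self) -> int:
        # Calls another method on the same value: SELF.foo ()
        return self.foo() + len(self.control)


p = Packet([0x12, 0x30], [0x01, 0x01])
total = p.bar()          # `p` is passed implicitly as `self`
```

The difference is that in Poke you write `foo' and the compiler supplies the `SELF.' part for you.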
It is also possible to define methods that modify the contents of struct
fields, no problemo:
  defvar packet_special = 0xff;
  deftype Packet =
    struct
    {
      byte magic = 0xab;
      byte size;
      [...]
      method set_size = (byte s) void:
      {
        if (s == 0)
          size = packet_special;
        else
          size = s;
      }
    };
This is what is commonly known as a "setter".  Note, incidentally, how a
method can also use regular variables.  The Poke compiler knows when to
generate a reference to a normal variable such as `packet_special', and
when to generate a set to a SELF field.
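
In the Python analogy the two kinds of access are spelled differently, which makes the distinction visible.  Again this is just a sketch with a cut-down, hypothetical `Packet` class:

```python
packet_special = 0xFF


class Packet:
    def __init__(self, size: int):
        self.size = size

    def set_size(self, s: int) -> None:
        if s == 0:
            # Read of an ordinary global, store to a field of self.
            self.size = packet_special
        else:
            self.size = s


p = Packet(2)
p.set_size(0)      # p.size is now the special value
q = Packet(2)
q.set_size(7)      # q.size is now 7
```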
*** Three Restrictions
Given the different nature of variables, functions and methods, there
are three restrictions:
a) Methods can't access variables and functions defined in the struct type.
This is not gratuitous: when you think about the mapping/construction
process, you will note these variables and functions are locals of the
mapper/constructor, which has already finished executing by the time we
are in a position to invoke a method!
If a method tries to access a variable or function defined inside the
struct type, you will get a nice "invalid reference to struct
{variable,function}" error message from the compiler.
b) Functions can't set fields defined in the struct type. This will be
rejected by the compiler:
  deftype Foo =
    struct
    {
      int field;
      defun wrong = void: { field = 10; }
    };
Again, remember the construction/mapping process.  When a function
accesses a field of the struct type like in the example above, it is
not doing one of these pseudo `SELF.field = 10'.  Instead, it is simply
updating the value of the local created in this step of Foo_mapper:
  Foo_mapper:
  . Map an int, put it in a local `field'.
  [...]
Setting that local would impact the mapping of the subsequent fields if
they refer to `field' (for example, in their constraint expression) but
it wouldn't actually alter the value of the field `field' in the struct
value that is created and returned from the mapper!
This is very confusing, so we just disallow this with a compiler error
"invalid assignment to struct field", for your own sanity :)
c) Methods can't be used in field constraint expressions, nor in variables
or functions defined in a struct type.
How could they be? The field constraint expressions, the
initialization expressions of variables, and the functions defined in
struct types are all executed as part of the mapper/constructor and, at
that time, there is no struct value yet to pass to the method.
If you try to do this, the compiler will greet you with an "invalid
reference to struct method" message.
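
The Python analogy shows the same problem from the other side: a method needs a value to operate on, and when there is none the call cannot proceed.  This is a sketch with a hypothetical cut-down `Packet` class; Python reports the problem at run time, whereas poke rejects it at compile time.

```python
class Packet:
    def __init__(self, size: int):
        self.size = size

    def real_size(self) -> int:
        return 0 if self.size == 0xFF else self.size


try:
    Packet.real_size()       # no Packet value exists to pass as self
    called_without_value = True
except TypeError:
    called_without_value = False

with_value = Packet(0xFF).real_size()   # fine once a value exists
```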