Good post!
I first like to second the need for our own gc.
Regarding mark procedures. I prefer that we collect the use cases and make sure that the framework can handle them in
a robust and effective way. I don't like to write mark procedures at all and your fixes seam to address quite a lot of my own
hack using mark procedures.
So let's see if we can support guile-log's logical programs. I can run quite advanced programs from the
prolog world like constraint programming in finite domains. To run these programs you need gc'ing of
logical variables. The problem with an assoc datastructure to represent prolog variables are that they
keep a reference to all assigned variables and you loose the ability to gc them. kanren has a few trick
that can help but they are of the kind "if your lucky you will not blow the stack" so I can't use that. What
I do is to maintain all logical asignement in a global stack and work with that hands on.
One of the essential features is to detect when a variable in the stack is ready for garbage collection. How to solve this?
This is my plan
finalizers:
if v is in the stack, then I would mark them as in the stack. In the finilizer I would look if the variable is in the stack
and then mark it as ready for reclaim, and then keep them as referenced and also clear the data. It would then be good
if we could get hook to the sweep phase so that we can scan the prolog stack, and remove the variables not reachable but in the stack.
This means that we could benefit from a secondary reclaim feature of memory else the variables would stay in memory until
the next gc.
The problem is that these stacks needs to mark some fields and not other fields. This can be fixed by protecting added stuff and finaliziers
but it will slow down the handling. But this pattern can be coded what you need is a function that tells if a field should be marked or not
we could for example add a pair of a vector and a bitvector where the bitvector is telling about gc or not similarly to your struct suggestion
Another interesting gc feature is that when you set a variable multiple times you can clear all data untill the last one after
a backtracking point until the next one, but the backtracking point can be garbage collected as well. So I need to scan the stack to see if there is data
that can be reclaimed at each sweep phase.This means it can take multiple gc's to reclaim all potential memory and I'm not happy
as it is now - When working on this I would probably introduce a doubly linked list between the backtracking points and maintain them
so essentially when you mark the backtracking point I need to scan the stack to the next backtracking point and mark all the last reseted
values. Doing this will be a good solution that does not need multiple gc.
Another issue is that a backtracking point can be located in a state tree. Here again I need to maintain links correctly so that I can scan
the list and mark reseted values just as explained for the stack.
One way around this issue is maintain a library in guile that implements a general datastructure that can be used as a prolog stack. Or some
other infrastructure that can be used. The pattern is indeed interesting to try to generalize.
What I do fear though is that the gc of prolog programs with lot's of state stored will lead to quite a big rucksack of finilizer logic. The state is stored
in trees and I need to maintain them on the basis of special gc treatment. A solution again would be to store a bit tree that tells about the marking of the
elements in the tree so we would have a third data-structure where we would like to conditionally mark elements. This means that we could use a finalizer
on the backtracking object.
Finally a question.
If in a finilizer I mark objects. Would I then be able mark what they reference in turn and so on?
Thinking about this maybe we could use structs to shield of gc but it can have an awful overhead. I would love to have a bit in SCM
values that tell if they need special gc or no special gc e.g. if this bit is marked you do your work in the finilizer where you can force
a mark onto the variable although the bit is set.
Regards
Stefan