I've looked at one benchmark for the einstein case. (some numbers are from my memory)
compiled prolog 12-14ms
guile-log, stack-based, with added vm ops 30-40ms
kanren, compiled chicken
~250ms
guile-log, stack based, 250ms
guile-log, assq-based 500ms
kanren on guile 2000ms
kanren on guile-log assq based 2000ms
kanren on guile-log replacing the assq whith a functional lookup treee
1500ms
kanren on guile-log stack based 300ms
Note 1. kanren implementation on guile-log does not contain all of plain kanrens optimisations for assq state. This is on my TODO.
2. If we add vm ops then assq or tree based state management can be brough down
to around chickens 250ms but I would expect the stack based version to go down to 20-30ms and naitvie compilation or JIT would bring it down to 12-16ms if we usa a stack for lambdas as well assq based one would stay an 250ms cause it is inherent in the datastructures.
3. Note that this is numbers for just one use case. As discussed earlier other use-cases may lead to other conclusions.
4. I also look forward to test Marks HAMT implementation in the future.
On my TODO list now are.
1. Make sure to be able to use multiple stacks and make them resizable 2. With this I can start implementing multithreading support. 3. A retake on VM operations. 4. Making it possible to allocate a continuation lambda or a failure lambda on the stack.
for 4, I could introduce my idea. The main problem here is that you may want to move the continuation from the stack to the heap in order to be able to resore a state. In order to do this I would change the representation of sk
functions and fk functions to be pointers to the actial functional object and make sure to never store the direct function inside for example a lambda. This means that at a unwind where we know that the lambda maight be referenced via a state storage variable we copy the function, and set the reference object
to the new location. The reference object is unisized and small and could probably be allocated from the heap or be preaalocated and referensed on a stack and possible be replaced by a freshly allocated variable or cons
cell fro the heap. This means that for example the einstein example will never allocate memory from the heap just use the internal stacks. The question is if it's worth it. My experience is that this will gain you maybe a factor
of 2 or 3. Not especially much and to the price of a lot of complexity in order to make it safe.