guile-devel

Re: Pausable continuations


From: Stefan Israelsson Tampe
Subject: Re: Pausable continuations
Date: Sun, 13 Feb 2022 10:34:43 +0100

The simple loop of 20M operations is now down to 0.3 s, which is almost a 20x improvement over
the best delimited continuation example (6 s). CPython takes 0.5 s!

On Fri, Feb 11, 2022 at 1:10 PM Stefan Israelsson Tampe <stefan.itampe@gmail.com> wrote:
Hmm, I can improve the delimited continuation speed slightly with the code below:


(define prompt (list 1))
(define (f2)
  (let lp ((i 0))
    (when (< i 20000000)
      (begin
        (abort-to-prompt prompt)
        (lp (+ i 1)))))
  #f)

 ; 5.906402s real time, 12.297234s run time.  8.894807s spent in GC.

So the generator construct is actually around 12x faster than delimited continuations.

(define (test2)
  (let lp ((k f2))
    (let ((k (call-with-prompt prompt k (lambda (k) k))))
      (if k (lp k) #f))))
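
To reproduce this kind of measurement outside the REPL, something like the following sketch could be used. The names tag, make-worker, drive and seconds-for are hypothetical helpers, not part of the original benchmark, and the iteration count is a parameter so that a small run finishes quickly. It only times the prompt/abort loop; it does not reproduce the run-time and GC figures quoted above.

```scheme
;; Hypothetical benchmark helpers (an illustration, not the original code).
(define tag (make-prompt-tag "bench"))

;; Build a thunk that aborts to the prompt N times and then returns #f.
(define (make-worker n)
  (lambda ()
    (let lp ((i 0))
      (when (< i n)
        (abort-to-prompt tag)
        (lp (+ i 1))))
    #f))

;; Resume the worker until it finishes; return how often the handler ran.
(define (drive n)
  (let lp ((k (make-worker n)) (count 0))
    (let ((k (call-with-prompt tag k (lambda (k) k))))
      (if k (lp k (+ count 1)) count))))

;; Wall-clock seconds spent in THUNK.
(define (seconds-for thunk)
  (let ((start (get-internal-real-time)))
    (thunk)
    (exact->inexact
     (/ (- (get-internal-real-time) start)
        internal-time-units-per-second))))

(display (seconds-for (lambda () (drive 100000))))
(newline)
```

Raising the argument of drive to 20000000 would correspond to the loop size used in this thread.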


On Fri, Feb 11, 2022 at 1:06 PM Stefan Israelsson Tampe <stefan.itampe@gmail.com> wrote:
I managed to make JIT-compiled code work for an example, which speeds the code up 2x. So the generator construct
now runs at about 40M ops per second, which is essentially 4x slower than the fastest a very simple loop can go. That matches
Python's generators and is 15x faster than the delimited continuation example code I posted earlier.

On Thu, Feb 10, 2022 at 4:19 PM Stefan Israelsson Tampe <stefan.itampe@gmail.com> wrote:
I did some benchmarks; consider the code below. With the JIT turned off,
a 20M-iteration loop using normal delimited continuations yields,

;; 7.866898s real time, 14.809225s run time.  9.652291s spent in GC

With a pausing continuation or generator we end up with,
;; 0.965947s real time, 0.965588s run time.  0.000000s spent in GC.

Python 3's similar generator example executes in 0.5 s for the same loop,
so using delimited continuations to model Python's generators we have an overhead of around 15x.

With jit,
;; 6.678504s real time, 13.589789s run time.  9.560317s spent in GC.

So we can't really get any speedup from Guile's JIT here. A JIT-compiled version of the pausing construct is not available, as I have not figured out how to do this yet.

(define prompt (list 1))
(define (f)
  (let lp ((i 0))
    (when (< i 20000000)
      (begin
        (abort-to-prompt prompt)
        (lp (+ i 1))))))
 
(define (test2)
  (let lp ((k f))
    (call-with-prompt prompt k lp)))
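
The benchmark above only passes control back and forth. For comparison, a generator that also yields values, built on the same delimited continuation primitives, might be sketched as follows. make-generator and its yield argument are hypothetical names introduced for illustration; this is not the pausable construct under discussion, only the baseline it is compared against.

```scheme
;; Sketch of a value-yielding generator on top of prompts.
;; PROC receives a `yield` procedure; calling the generator returns the
;; next yielded value, or the symbol done once PROC has returned.
(define (make-generator proc)
  (define tag (make-prompt-tag "generator"))
  (define done? #f)
  ;; Initially, resuming means starting PROC from the beginning.
  (define resume
    (lambda ()
      (proc (lambda (v) (abort-to-prompt tag v)))
      (set! done? #t)
      'done))
  (lambda ()
    (if done?
        'done
        (call-with-prompt tag
          resume
          (lambda (k v)
            ;; Remember where to continue, then hand V to the caller.
            (set! resume (lambda () (k #f)))
            v)))))

(define gen
  (make-generator
   (lambda (yield)
     (let lp ((i 0))
       (when (< i 3)
         (yield i)
         (lp (+ i 1)))))))

(define first-value (gen))
(define second-value (gen))
(define third-value (gen))
(define end-marker (gen))
```

Each call to gen re-enters the captured continuation, which is the per-iteration cost the timings above measure.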



On Thu, Feb 10, 2022 at 2:07 PM Stefan Israelsson Tampe <stefan.itampe@gmail.com> wrote:
Consider a memory barrier idiom constructed from 
0. (mk-stack)
1. (enter x)
2. (pause x)
3. (leave x)

The idea is that we create a separate stack object. When entering it, we swap the current stack with the one in the argument, saving the current stack in x, and end up in the 'child state (or move to the paused position if the stack was paused). When pausing stack x, we return to just after the point where it was entered, saving the current position (stack and ip) in x, and end up in the 'pause state. When we leave, we end up in the 'leave state and move back to the old stack, using the current ip.
At the first encounter the function's stack frame is copied over, hence there will be a fork limited to the function only.

This means that we essentially can define a generator as
(define (g x)
  (let lp ((n 0))
    (if (< n 10)
        (begin
           (pause x)
           (lp (+ n 1))))))

And use it as
(define (test)
  (let ((x (mk-stack)))
    (let lp ()
      (case (enter x)
        ((pause)
         (pk 'pause)
         (lp))
        ((child)
         (g x)
         (leave x))))))

A paused or left stack cannot be paused, an entered stack cannot be entered, and one cannot leave a paused stack, but one can enter a left stack.
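
These rules can be read as a small state machine. Below is a minimal sketch of just the allowed transitions, assuming a hypothetical initial state named new for a freshly made stack (the description does not name it) and a hypothetical helper next-state. It models only which operations are legal, not the actual stack switching.

```scheme
;; Return the state a stack is in after applying OP in STATE,
;; or #f when the operation is not allowed.
;; States: new (assumed name), child (entered), pause, leave.
(define (next-state state op)
  (case op
    ;; A fresh, paused or left stack can be entered.
    ((enter) (if (memq state '(new pause leave)) 'child #f))
    ;; Only an entered stack can be paused.
    ((pause) (if (eq? state 'child) 'pause #f))
    ;; Only an entered stack can be left.
    ((leave) (if (eq? state 'child) 'leave #f))
    (else #f)))

;; Apply a list of operations, stopping with #f at the first illegal one.
(define (run-ops state ops)
  (cond ((not state) #f)
        ((null? ops) state)
        (else (run-ops (next-state state (car ops)) (cdr ops)))))
```

For example, the sequence enter, pause, enter, leave that the test procedure goes through is accepted, while pausing a stack twice in a row is rejected.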

Anyhow, this idea is modeled on a fork command rather than on a functional construct. Its benefit over delimited continuations is that one does not need to copy the whole stack, which can potentially speed up generator-like constructs. But not only this: writing efficient Prolog code becomes possible as well. We could simplify a lot of the generation of Prolog code, speed it up, and also significantly improve the compile speed of Prolog code.

How would we approach the Prolog code? The simplest system is to return the
alternate pause stack when succeeding; then things become very simple,

x   = stack to pause to in case of failure
cc = the continuation

(<and> (x cc) goal1 goal2)
    :: (cc (goal1 (goal2 x)))

(<or> (x cc) goal1 goal2)
    :: (let ((xx (mk-stack)))
         (case (enter xx)
           ((child)
            (cc (goal1 xx)))

           ((pause)
            (cc (goal2 x)))))

Very elegant. We can also use some heuristics to store already made stacks when
leaving a stack and reuse them at the next enter, which is a common theme in Prolog.

Anyhow, we have an issue: consider the case where everything succeeds forever. Then we will blow the stack, since there is no concept of tail calls here. So what you can do is the following for an <and>,

(let ((xx (mk-stack)))
  (case (enter xx)
    ((child)
     (goal1 x (lambda (xxx) (pause xx xxx))))

    ((pause xxx)
     (goal2 xxx cc))))

This enables cuts, so that an and with a cut (and! in Kanren lingo) will use
(goal2 x cc)

And we have tail calls!


I have a non-JIT version of Guile working as a proof of concept.

The drawback is that if a function uses a lot of stack, it will be a memory hog.

WDYT?










