Lexical Variable Allocation

guile-devel
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Lexical Variable Allocation

From:	Noah Lavine
Subject:	Lexical Variable Allocation
Date:	Thu, 5 Apr 2012 22:14:56 -0400
Hello everyone,

There was some discussion a while back about getting Guile a register
virtual machine. I was thinking about how I could help with this, and
I thought that register allocation would be a good place to start. The
great thing about it is that we already do allocation - of lexical
variable slots on the stack - so we can build and an allocator very
similar to what we'll want in a register VM before changing anything
else. That should make it easier when we do switch over. According to
a comment in module/language/tree-il/analyze.scm, the allocator isn't
very good, so it would be nice to have a better one. I'm not done yet,
but I've gotten far enough that I'd like some feedback to see if I'm
going in a productive direction or if I should change.

First of all, in order to run the code in this email, you'll want to do this:

(use-modules (srfi srfi-1) (srfi srfi-9) (srfi srfi-11) (srfi srfi-26)
(ice-9 vlist) (ice-9 match) (system base syntax) (system base message)
(system vm program) (language tree-il) (system base pmatch) (system
base compile))

That's probably too many modules, but I haven't gone through and pruned it yet.

My plan is pretty simple: at every place in the Tree-IL tree, I want
to know what lexical variables are going to be used later in the
program. When a variable disappears from this list, its spot in the
stack can be freed. This means we need a notion of what "later" and
"earlier" mean, which basically means we get to annotate our Tree-IL
tree with continuations. Here's some code to do that:

;; fill forward with the continuations of the elements of tree, and
;; reverse with the reverse continuations. both are hashes.
;; a continuation is a list of tree-il nodes.
;; if there may be more continuations we can't detect, the car of the
;; list will be the symbol 'incomplete
;; the forward hash will map the symbol 'start to the initial
;; continuations, which come from no element of tree. the reverse hash
;; will map the symbol 'end to the final continuations
(define (record-tree-continuations! tree forward reverse)
  ;; recurse returns a list of the continuations of the parent of tree
  (define (recurse tree implied)
    ;; 'implied' holds the implied continuation for tree nodes. this is
    ;; usually their parent.
    ;; c is for 'continuation'. It records continuations.
    (define (c . args)
      (hashq-set! forward tree args)
      (for-each
       (lambda (x) (hashq-set! reverse x
                          (append (hashq-ref reverse x '())
                                  (list tree))))
       args))
    ;; pre is for things that are done *before* tree, which therefore
    ;; continue to tree
    (define (pre next) (recurse next tree))
    ;; post is for things in tail position, done *after* tree
    (define (post next) (recurse next implied))
    ;; there are no subroutines in non-tail position, because those
    ;; subroutines are complete before the main node is executed
    (record-case tree
      ((<void>) (c implied) (list tree))
      ((<const>) (c implied) (list tree))
      ((<primitive-ref>) (c implied) (list tree))
      ((<lexical-ref>) (c implied) (list tree))
      ((<lexical-set> exp) (c implied) (pre exp))
      ((<module-ref>) (c implied) (list tree))
      ((<module-set> exp) (c implied) (pre exp))
      ((<toplevel-ref>) (c implied) (list tree))
      ((<toplevel-set> exp) (c implied) (pre exp))
      ((<toplevel-define> exp) (c implied) (pre exp))
      ((<conditional> test consequent alternate)
       (let ((cons-c (post consequent))
             (alt-c  (post alternate)))
         (apply c (append cons-c alt-c))
         (pre test)))
      ((<call> proc args)
       (let ((arg-c (apply append (map pre args)))
             (proc-c (pre proc)))
         (c implied)
         (append proc-c arg-c)))
      ((<primcall> args)
       (c implied)
       (apply append (map pre args)))
      ((<seq> head tail)
       (apply c (post tail))
       (pre head))
      ;; fixme: we need to handle when lambda and lambda-case are called
      ;; the easiest thing to do is just start over the
      ;; continuation-annotation at any lambda form
      ((<lambda> body)
       (c implied)
       (recurse body 'incomplete)
       (list tree))
      ((<lambda-case> body alternate)
       (c implied)
       ;; fixme: these lambda-case bodies need to be referenced by a start
       ;; tag to be traversed
       (recurse body tree)
       #f) ;; a lambda-case clause has nothing to receive its pre-continuations
      ((<let> vals body)
       ;; todo: it may be possible to return some of the body
       ;; continuations. however, you have to filter out lexical refs to
       ;; the variables defined by the let
       (apply c (post body))
       (apply append (map pre vals)))
      ((<letrec> vals body)
       (let ((body-c (post body)))
         (apply c body-c)
         (apply append (map pre vals))))
      ((<fix> vals body)
       (let ((body-c (post body)))
         (apply c body-c)
         (apply append (map pre vals))))
      ((<let-values> exp body)
       (let ((body-c (post body)))
         (apply c body-c)
         (pre exp)))))

  (let ((tree-c (recurse tree 'end)))
    (hashq-set! forward 'start tree-c)
    (for-each (lambda (x) (hashq-set! reverse x (list 'start))) tree-c)))

It doesn't handle all node types yet, but it works on a simple
example. I decided that it's actually better not to use
continuation-passing style for this. The reason is function calls - in
Scheme, the arguments to a function are evaluated in undefined order.
But in CPS (at least how I've thought of it), you have to choose an
order for the evaluation. That's clearly bad, because you're giving up
some freedom in how you generate code. So what this actually does is
annotate nodes with the next thing that must come after it. The next
thing after a function call argument is the call itself, because the
other arguments don't have to come after it. Reverse continuations are
the reverse of this - the thing before a function call argument is
whatever came before the call, not any of the other arguments.

In order to test this, you'll want a simple example. Here's one:

(define tree-1
  (compile '(let ((x 1) (y 2)) (+ x y)) #:to 'tree-il))

You can record its continuations like this:

(define forward (make-hash-table))
(define backward (make-hash-table))
(record-tree-continuations! tree-1 forward backward)

You'll also want the ability to walk a tree forwards or backwards by
continuation. Here's code for that:

;; apply func to each node N in tree, but only after func is applied to
;; every forward continuation of N
(define (traverse-backwards tree func forward reverse)
  (let iter ((already-seen '(end))
             (looking-at (hashq-ref reverse 'end '())))
    (define (ready? x)
      (lset<= eq? (hashq-ref forward x '()) already-seen))
    (unless (equal? looking-at '(start))
      ;; (simple-format #t "already-seen: ~S\nlooking-at: ~S\n"
      ;;                already-seen looking-at)
      (let*-values (((next lst) (find-remove! looking-at ready?))
                    ((new-nodes) (hashq-ref reverse next '())))
        (when (not next)
          (error "Continuation chain ends after" already-seen "looking
at" looking-at))
        (func next)
        (iter (cons next already-seen)
              (lset-union! eq? lst new-nodes))))))

(define (traverse-forwards tree func forward reverse)
  (let iter ((already-seen '(start))
             (looking-at (hashq-ref forward 'start '())))
    (define (ready? x)
      (lset<= eq? (hashq-ref reverse x '()) already-seen))
    (unless (equal? looking-at '(end))
      ;; (simple-format #t "already-seen: ~S\nlooking-at: ~S\n"
      ;;                already-seen looking-at)
      (let*-values (((next lst) (find-remove! looking-at ready?))
                    ((new-nodes) (hashq-ref forward next '())))
        (when (not next)
          (error "Continuation chain ends after" already-seen "looking
at" looking-at))
        (func next)
        (iter (cons next already-seen)
              (lset-union! eq? lst new-nodes))))))

And now you can see what the continuation chain looks like:

(traverse-forwards tree-1 (lambda (x) (display x) (newline)) forward backward)

Finally, as a step towards register allocation, I implemented the
function I originally wanted. This walks a tree and records what
lexical variables are needed after each point:

;; this is a utility function that lets us treat the two
;; lexical-variable nodes the same way
(define (tree-il-lexicals-used tree)
  (record-case tree
    ((<lexical-ref> gensym) (list gensym))
    ((<lexical-set> gensym) (list gensym))
    (else '())))

;; annotate a tree with the lexical variables that are used after (in
;; the continuation sense) every point
(define (record-lexicals-used-after! tree hash forwards backwards)
  (define (propagate-back node)
    (let ((forward-variables
           (hashq-ref hash node '())))
      (if (not (let? node))
          ;; at any node, we propagate backwards the union of the
          ;; variables from every following node, plus any variables
          ;; this node uses
          forward-variables
          ;; unless this node defined some variables, in which case they
          ;; should not propagate back further than it
          (lset-difference eq? forward-variables
                           (let-gensyms node)))))
  (define (record-node! node)
    (hashq-set! hash node
                (apply lset-union eq?
                       (tree-il-lexicals-used node)
                       (map propagate-back
                            (hashq-ref forwards node '())))))
  (traverse-backwards tree record-node! forwards backwards))

You can call it like this:

(define lexicals-needed (make-hash-table))
(record-lexicals-used-after! tree-1 lexicals-needed forward backward)

And see that it produces correct results:

(traverse-forward tree-1 (lambda (x) (display (hashq-ref
lexicals-needed x)) (newline)) forward backward)

And that's as far as I've gotten. So what do you think? Is this a
reasonable approach to allocation?

One issue I ran into is that, after looking more at analyze.scm, I'm
not sure the rest of the compiler is set up to handle an allocation
where the variables on the stack might change at any point in the
code. It seems to only want to change at lambdas. Is that accurate? If
so, how easy would it be to change that?

Also, do we want to aggressively reclaim stack space if it makes
debugging harder? We might need a way to indicate that some functions
shouldn't be compiled like this so they will be easier to debug.

Thanks,
Noah
[Prev in Thread]
Current Thread
[Next in Thread]
Lexical Variable Allocation, Noah Lavine <=
Prev by Date: Re: [Code for fun] Announce my generic server Ragnarok
Next by Date: Re: [Code for fun] Announce my generic server Ragnarok
Previous by thread: Inclusion of guile-log II
Next by thread: Re: wip-threads-and-fork
Index(es):
- Date
- Thread