
data-crunching in guile


From: Andy Wingo
Subject: data-crunching in guile
Date: Wed, 24 Jun 2009 14:03:10 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.92 (gnu/linux)

Hey all,

I'm pretty displeased with the way that the brainfuck compilation
works. Check this:

    brainfuck@(guile-user)> ,c ++]

(The trailing ] is to note the end of the program. It's kind of a hack,
but it works fine.)

    Disassembly of #<objcode b719c8e0>:

       0    (make-int8:0)                   ;; 0
       1    (load-symbol "make-vector")     ;; make-vector
      16    (link-now)                      
      17    (variable-ref)                  
      18    (make-int16 117 48)             ;; 30000
      21    (make-int8:0)                   ;; 0
      22    (call 2)                        
      24    (local-set 1)                   
      26    (local-set 0)                   

OK, up to here we're good: it's a generic prologue that makes the tape
and the pointer.

      28    (load-symbol "vector-set!")     ;; vector-set!
      43    (link-now)                      
      44    (variable-ref)                  
      45    (local-ref 1)                   
      47    (local-ref 0)                   
      49    (load-symbol "vector-ref")      ;; vector-ref
      63    (link-now)                      
      64    (variable-ref)                  
      65    (local-ref 1)                   
      67    (local-ref 0)                   
      69    (call 2)                        
      71    (make-int8:1)                   ;; 1
      72    (add)                           
      73    (mv-call 3 :L32)                ;; MV -> 81
      77    (drop)                          
      78    (br :L33)                       ;; -> 84
      81    (truncate-values 0 0)           
      84    (load-symbol "vector-set!")     ;; vector-set!
      99    (link-now)                      
     100    (variable-ref)                  
     101    (local-ref 1)                   
     103    (local-ref 0)                   
     105    (load-symbol "vector-ref")      ;; vector-ref
     119    (link-now)                      
     120    (variable-ref)                  
     121    (local-ref 1)                   
     123    (local-ref 0)                   
     125    (call 2)                        
     127    (make-int8:1)                   ;; 1
     128    (add)                           
     129    (goto/args 3)      

Then all of this to add 2 to a cell of a vector!!!

We have to resolve vector-set! and vector-ref afresh at every `+'
because the code is not wrapped in a lambda, and thus has no place to
cache the looked-up variable locations. See toplevel-ref in vm.texi for
more info.
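
For illustration, the difference is roughly this (hand-written Scheme,
not actual compiler output; `tape' and `ptr' stand in for the two
locals):

    ;; At the top level, every mention of vector-ref and vector-set!
    ;; re-resolves the variable:
    (vector-set! tape ptr (+ (vector-ref tape ptr) 1))

    ;; Inside a procedure, toplevel-ref can cache the variable object
    ;; in the procedure's object table after the first lookup:
    (define (bump-cell! tape ptr)
      (vector-set! tape ptr (+ (vector-ref tape ptr) 1)))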

But then at every call to vector-set! (except the last one, of course),
we have to set up a multiple-values context, to handle the possibility
that vector-set! returns multiple values. Brainfuck being an imperative
language, most calls are for effect -- not tail calls or calls for
value -- so these mv calls take up proportionally many instructions.

And all this for vector-set!, which only returns a value because it has
to (in the C API).
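
In Scheme terms, each of those calls for effect behaves roughly as if
it were written like this (a sketch of the semantics only, with
made-up variable names; the compiler emits the mv-call directly rather
than going through call-with-values):

    (call-with-values
        (lambda () (vector-set! tape ptr new-value))
      (lambda discarded   ; accept however many values come back,
        #f))              ; then throw them all away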

But I was thinking: the brainfuck pattern (random-access reads and
writes to a vector) is really the pattern of data crunching in general.
The compiled output of a brainfuck program has the same shape as the
compiled output of a hashing algorithm or an image synthesizer. And
these data-crunching operations are precisely the ones closest to the
underlying hardware, for better or for worse.
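
For instance, the inner loop of just about any such algorithm is a
handful of arithmetic ops between a vector-ref and a vector-set!. A toy
example (made up here, just to show the shape):

    (define (scale-all! v k)
      (do ((i 0 (+ i 1)))
          ((= i (vector-length v)))
        (vector-set! v i (* k (vector-ref v i)))))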

What I'm getting at is that I think we should have VM ops for working
on vectors -- ops for generic vectors, specific ops for bytevectors,
probably an op for string-ref as well, and possibly one for
string-set!. Then a native-code backend could effectively be
implemented at the GLIL or assembly level, relying on the Tree-IL
compiler having already resolved high-level operations (e.g.,
vector-set!) to low-level instructions. I think we have the space in
the VM, and if we group all of the vector instructions at the end, we
shouldn't affect the instruction cache too much.
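
To make that concrete, each `+' of the program above might then
compile down to something like this (a hypothetical listing -- the op
names and operand order are invented for illustration):

    (local-ref 1)                   ;; tape
    (local-ref 0)                   ;; pointer
    (local-ref 1)
    (local-ref 0)
    (vector-ref)
    (make-int8:1)                   ;; 1
    (add)
    (vector-set)

Eight short ops, and no link-now, variable-ref, or mv-call in sight.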

So that's my plan. Thoughts?

Cheers,

Andy
-- 
http://wingolog.org/



