[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: thoughts on targetting the web

From: Christine Lemmer-Webber
Subject: Re: thoughts on targetting the web
Date: Wed, 06 Oct 2021 10:53:21 -0400
User-agent: mu4e 1.6.6; emacs 27.2

Chris Lemmer-Webber <> writes:

> O M G!  So excited to read this email.  Yes!

So, re-poiking this thread.  I'm still just as excited as I was when I
sent this some time ago!

Any updates?  How can we help?  Can we test things?  Poke at something?
Contribute something?'

I'm excited for this future!

> Andy Wingo writes:
>> Hi :)
>> A brain-dump tonight.  I was thinking about what it would mean for Guile
>> to target the web.  The goal would be for Guile to be a useful language
>> for programming client-side interaction on web pages.  Now, there are a
>> number of takes for Schemes of various stripes to target JS.  I haven't
>> made a full survey but my starting point is that in a Guile context, we
>> basically want the Guile Scheme language, which is to say:
>>  1. Tail calls.  The solution must include proper tail calls.
>>  2. Multiple values.  The solution should have the same semantics that
>>     Guile has for multiple-value returns, including implicit
>>     truncation.
>>  3. Varargs.  The solution should allow for rest arguments, keyword
>>     arguments, and the like.
>>  4. Delimited continuations.  The solution should allow non-local exit
>>     from any function, and should allow for capturing stack slices to
>>     partial continuations, and should allow those continuations to be
>>     instantiated multiple times if desired.  We should be able to target
>>     fibers to the web.  Otherwise, why bother?
>>  5. Garbage collection.  *We should re-use the host GC*.  Although it
>>     would be possible to manage a heap in linear memory, that has
>>     retention problems due to cycles between the Guile heap and the JS
>>     heap.
> It sounds like from when we spoke not long ago that progress towards the
> WASM-GC proposal has been happening I think?  So this seems feasible?
> However, I do think that a "proof of concept" GC would still be an
> advancement.  Like, even lifting the dead-simple version from SICP as a
> stopgap.  We'll all know it's not ideal, but it'll still be progress.
> Getting somethign going and showing it is helpful.
> That said, I also know how meh such a proposed approach can feel. ;)
>>  6. Numbers.  We should support Scheme's unbounded integers.  Requires
>>     bignums in the limit case, but thankfully JS has bignums now.  We
>>     will still be able to unbox in many cases though.
>>  7. Target existing web.  We shouldn't depend on some future WebAssembly
>>     or JS proposal -- the solution should work in the here and now and
>>     only get better as features are added to the web platform.
> Yes.  Though, to jump back again to when you and I talked offline, I
> think what I recommended was that as an early proof of concept, a good
> idea might be to just have a small number of APIs, or even bet, make a
> mini "fantasy console".  Here's an interesting example (and the one we
> talked about, off-list):
> What's interesting about this first is that it's just a Racket program
> that generates quasiquoted s-expression representation WASM.  Now,
> obviously your compiler will do more interesting things than that.  But
> the second interesting thing is the "fantasy console" component... a set
> of custom interfaces to render a simple 8-bit aesthetic console.
> My argument for this is: supporting "the existing web" is a big step.
> Targeting a much smaller number of APIs, and being able to show the
> world a demo that Guile's WASM compiler can do something cool... that's
> very motivating, I think.
> But an even simpler hello world would be to simply have access to
> console.log, and write a scheme program that does ye olde Fibonacci
> sequence and spits out the results on the page.  Not as exciting, but
> that's the reasonable (and classic scheme) first step I think.
> But yes, target the existing web... eventually!  Just don't get so hung
> up on worrying about those interfaces that it distracts from making
> something that can generate some WASM, is my thoughts.
>> From a UX perspective, I would expect we would generally want
>> whole-program compilation with aggressive DCE / tree-shaking, producing
>> a single JS or WebAssembly artifact at build-time.  But that's a later
>> issue.
> Yes, focus on things working first... major optimizations can come later!
>> I have thought about this off and on over the years but in the end was
>> flummoxed about how to meet all requirements.  However recently I think
>> I have come up with a solution for most of these:
>>  (1) In JS, tail calls are part of ES2015, but not implemented in all
>>      browsers.  In WebAssembly, they are a future plan, but hard for
>>      various reasons.  So in practice the solution for today's web is to
>>      use a universal trampoline and make all calls tail calls --
>>      i.e. call all functions via:
>>        while true:
>>          call(pop())
>>      Whenever we can target proper tail calls, this will only get
>>      faster.
>>  (2) Neither JS nor WebAssembly have the truncation semantics we want.
>>      Therefore we will return values via an explicit array, and then the
>>      continuation will be responsible for consuming those values and
>>      binding any needed variables.
>>  (3) Although JS does have varargs support, WebAssembly does not.  But
>>      we can actually use the same solution here as we use for (1) and
>>      (2) -- pass arguments on an explicit array + nvalues, and relying
>>      on the function to parse them appropriately.  In this way we can
>>      get Guile keyword arguments.  This also makes the type of Scheme
>>      functions uniform, which is important in WebAssembly contexts.
>>  (4) This is the interesting bit!  As hinted in (1), we will transform
>>      the program such that all calls are tail calls.  This is a form of
>>      minimal CPS conversion -- minimal in the sense that functions are
>>      broken up into the minimum number of pieces to ensure the
>>      all-tail-calls property.  Non-tail calls transform to tail calls by
>>      saving their continuation's state on a stack, which is the same as
>>      stack allocation for these continuations.  The continuation of the
>>      call pops the saved state from the stack.  Because the stack is
>>      explicit, we can manipulate it as data: slice it to capture a
>>      delimited continuation, drop values to make a non-local exit, splat
>>      a saved slice back on to compose a captured delimited continuation
>>      with the current continuation, and so on.  Therefore we have the
>>      necessary primitives to implement delimited continuations as a
>>      library.
> I think I understand 15% of this, but the amount I could absorb in the
> limited time I spent stewing on it sounds cool anyway.  Neat!  Most
> importantly, you have a good idea that you think will work, so yay!
>>  (5) Scheme needs a representation that can store any value.  In JS this
>>      is easy because JS is untyped.  For WebAssembly, I think I would
>>      lean on externref for this purpose, which effectively boxes all
>>      values.  There are no fixnums in the current WebAssembly spec, so
>>      this is suboptimal, and we have to call out to JS to evaluate type
>>      predicates and access fields.  But, support for structured
>>      GC-managed types is coming to WebAssembly, which will speed up
>>      object access.
>>  (6) The easy solution here is to make JS numbers, which are doubles at
>>      heart, represent flonums, and use JS bignums for Scheme integers.
>>      Fine.
>>  (7) This principle has been taken into account in (1)-(6).
>> Now, a note on the transformation described in (4), which I call
>> "tailification".
>> The first step of tailification computes the set of "tails" in a
>> function.  The function entry starts a tail, as does each return point
>> from non-tail calls.  Join points between different tails also start
>> tails.
>> In the residual program, there are four ways that a continuation exits:
>>   - Tail calls in the source program are tail calls in the residual
>>     program; no change.
>>   - For non-tail calls in the source program, the caller saves the state
>>     of the continuation (the live variables flowing into the
>>     continuation) on an explicit stack, and saves the label of the
>>     continuation.  The return continuation will be converted into a
>>     arity-checking function entry, to handle multi-value returns; when
>>     it is invoked, it will pop its incoming live variables from the
>>     continuation stack.
>>   - Terms that continue to a join continuation are converted to label
>>     calls in tail position, passing the state of the continuation as
>>     arguments.
>>   - Returning values from a continuation pops the return label from the
>>     stack and does an indirect tail label call on that label, with the
>>     given return values.
>> I expect that a tailified program will probably be slower than a
>> non-tailified program.  However a tailified program has a few
>> interesting properties: the stack is packed and only contains live data;
>> the stack can be traversed in a portable way, allowing for
>> implementation of prompts on systems that don't support them natively;
>> and as all calls are tail calls, the whole system can be implemented
>> naturally with a driver trampoline on targets that don't support tail
>> calls (e.g. JavaScript and WebAssembly).
>> This algorithm is implemented in the wip-tailify branch in Guile's git.
>> I have this little driver program:
>>     (use-modules (system base compile)
>>                  (system base language)
>>                  (language cps tailify)
>>                  (language cps verify)
>>                  (language cps renumber)
>>                  (language cps dce)
>>                  (language cps simplify)
>>                  (language cps dump))
>>     (define (tailify* form)
>>       (let ((from 'scheme)
>>             (make-lower (language-lowerer (lookup-language 'cps)))
>>             (optimization-level 2)
>>             (warning-level 2)
>>             (opts '()))
>>         (define cps
>>           ((make-lower optimization-level opts)
>>            (compile form #:from 'scheme #:to 'cps #:opts opts
>>                     #:optimization-level optimization-level
>>                     #:warning-level warning-level
>>                     #:env (current-module))
>>            (current-module)))
>>         (format #t "Original:\n")
>>         (dump cps)
>>         (format #t "\n\nTailified:\n")
>>         (let* ((cps (tailify cps))
>>                (cps (verify cps))
>>                (cps (eliminate-dead-code cps))
>>                (cps (simplify cps))
>>                (cps (renumber cps)))
>>           (dump cps))))
>> If I run this on the following lambda:
>>   (lambda (x)
>>     (pk x
>>         (if x
>>             (pk 12)
>>             (pk 42))))
>> Then first it prints the dump of the optimized CPS:
>>   Original:
>>   L0:
>>     v0 := self
>>     L1(...)
>>   L1:
>>     receive()
>>     v1 := current-module()                      ; module 
>>     cache-set![0](v1)
>>     v2 := const-fun L7                          ; _ 
>>     return v2
>>   L7:
>>     v3 := self
>>     L8(...)
>>   L8:
>>     v4 := receive(x)                            ; x 
>>     v5 := cache-ref[(0 . pk)]()                 ; cached 
>>     heap-object?(v5) ? L16() : L11()
>>   L16():
>>     L17(v5)
>>   L11():
>>     v6 := cache-ref[0]()                        ; mod 
>>     v7 := const pk                              ; name 
>>     v8 := lookup-bound(v6, v7)                  ; var 
>>     cache-set![(0 . pk)](v8)
>>     L17(v8)
>>   L17(v9):                                      ; box 
>>     v10 := scm-ref/immediate[(box . 1)](v9)     ; arg 
>>     false?(v4) ? L22() : L19()
>>   L22():
>>     v12 := const 42                             ; arg 
>>     call v10(v12)
>>     L25(receive(arg, rest...))
>>   L19():
>>     v11 := const 12                             ; arg 
>>     call v10(v11)
>>     L25(receive(arg, rest...))
>>   L25(v13, v14):                                ; tmp, tmp 
>>     tail call v10(v4, v13)
>> Then it prints the tailified version:
>>   L0:
>>     v0 := self
>>     L1(...)
>>   L1:
>>     receive()
>>     v1 := current-module()                      ; module 
>>     cache-set![0](v1)
>>     v2 := const-fun L8                          ; _ 
>>     v3 := restore1[ptr]()                       ; ret 
>>     tail calli v3(v2)
>>   L8:
>>     v4 := self
>>     L9(...)
>>   L9:
>>     v5 := receive(x)                            ; x 
>>     v6 := cache-ref[(0 . pk)]()                 ; cached 
>>     heap-object?(v6) ? L17() : L12()
>>   L17():
>>     L18(v6)
>>   L12():
>>     v7 := cache-ref[0]()                        ; mod 
>>     v8 := const pk                              ; name 
>>     v9 := lookup-bound(v7, v8)                  ; var 
>>     cache-set![(0 . pk)](v9)
>>     L18(v9)
>>   L18(v10):                                     ; box 
>>     v11 := scm-ref/immediate[(box . 1)](v10)    ; arg 
>>     false?(v5) ? L24() : L20()
>>   L24():
>>     v14 := const 42                             ; arg 
>>     v15 := code L37                             ; cont 
>>     save[(scm scm ptr)](v5, v11, v15)
>>     tail call v11(v14)
>>   L20():
>>     v12 := const 12                             ; arg 
>>     v13 := code L29                             ; cont 
>>     save[(scm scm ptr)](v5, v11, v13)
>>     tail call v11(v12)
>>   L29:
>>     L30(...)
>>   L30:
>>     v16, v17 := receive(arg, rest...)           ; tmp, tmp 
>>     v18, v19 := restore[(scm scm)]()            ; restored, restored 
>>     tail callk L34(_, v18, v19, v16)
>>   L34:
>>     meta: arg-representations: (scm scm scm)
>>     L35(...)
>>   L35(v20, v21, v22):                           ; _, _, _ 
>>     tail call v21(v20, v22)
>>   L37:
>>     L38(...)
>>   L38:
>>     v23, v24 := receive(arg, rest...)           ; tmp, tmp 
>>     v25, v26 := restore[(scm scm)]()            ; restored, restored 
>>     tail callk L34(_, v25, v26, v23)
>> I think I would like to implement the "save" and "restore" primcalls on
>> stock Guile, to be able to adequately test tailification.  But then I'd
>> look at targetting WebAssembly (just to make it interesting; run-time
>> performance would be faster no doubt if I targetted JS, due to GC
>> concerns).
>> Anyway, thoughts welcome.  Happy hacking :)
>> Andy
> This is all great.  I think getting something that works well enough to
> write some proof-of-concept demos in is the real goal.  If you can make
> something that can compile just enough to WASM where me adding the
> fantasy console stuff would help (I might just lift wasm-adventure's),
> then I think we could make some pretty cool demos for the world.  Maybe
> we could even have a special guile-fantasy-console gamejam? ;)
> Anyway exciting, exciting, exciting!  My main advice in all of it: don't
> let great be the enemy of good-enough, and all that.  Getting to the
> point where we're able to do *something* is still useful, if the hooks
> are in place to iterate from good-enough to great over time.
> Yay!
>  - Chris

reply via email to

[Prev in Thread] Current Thread [Next in Thread]