for example


From: Tom Lord
Subject: for example
Date: Sat, 3 Aug 2002 19:20:13 -0700 (PDT)

ok -- these are input to simple awk scripts (couple k lines) -- though
I'm probably sending you versions that don't currently compile:

And yes --- I'm being cryptic (a side effect of trying to compress a
lot of info down to a short message) --- but the basic message is
"bye", so don't waste time complaining.

-t


!       Register Types

  This file is part of the source code for the Hackerlab C library.
  It's translated to C by the awk program "./register-tags.awk".

  VM registers have small _external tags_ so they can hold a 
  limited selection of unboxed values.

  This file declares the register tags and the union type
  for (various kinds of) register.

*> core-registers register-type scm_register 2
**> holds scm s
**> holds scm_u u
**> holds scm_i i
**> holds scm_f f



--------------


!       The Bit Tag Spec File

  A popular misconception is that a tagging system simply maps values
  to a set of densely-packed, small integer tags, each tag representing
  a type.  You'll see people write:

<<<
        struct generic_object_header
        {
          t_uint tag;
          ...
        };
>>>

  but that's really _oversimplified._  A good tagging system does
  much more than that.

  For example, here is an outline that is processed automatically to
  produce `enum' declarations for a system of "staggered tags" (see
  SCM, for example), predicate functions, and case labels.

  Since scheme is important enough that we should (at least casually)
  worry about the length and complexity of the bootstrap path from
  front-panel toggle-switches to full hosting environment, this file
  is designed to be processed by a small `awk' script (i.e., a script for
  a little language with hash tables, loops, conditionals, and regexps
  but not much more).

  CLS note: I'm not sure how clear this will be to people who aren't
  looking at the rest of its context.  I hope the notation is clear
  enough to be puzzled out....


*> scm-tags tags scm
**: decodes-to scm_u

  We're going to define tags with a basename "scm" for values
  of type "scm".

**> split-tag val 2

  The smallest in-line tag will be two bits.

***> tag bibop_object                            (00)
****: tags-by-mask
****: decodes-to t_scm_sptr

  Bibop objects are the lightest weight in terms of meta-data overhead
  (e.g., they don't necessarily have reference counts) and (if a
  direct pointer representation is used) alignment requirements (they
  are 4-byte aligned).

  Bibop objects share storage with page objects (see below).

***> tag cow_bibop_object                        (..)
****: tags-by-mask
****: decodes-to t_scm_cow_sptr

  Lazy-linear bibop objects.

  There is a one bit reference count for each bibop object.
  When the first cow reference to an object is formed, that
  reference count is 0.  If the cow reference is copied (to 
  produce a second cow reference), the count is 1.  If 
  yet another cow-copy is made, the new copy is in fact a 
  shallow-copy of the object with reference count 0 (references
  in the shallow copy are cow references).  When fetching a 
  possibly cow field, programs can request a non-cow reference
  to a stable object which the field will continue to hold
  with a cow reference: if, before the fetch, the field held
  a cow reference to an object with (possibly) more than one
  cow reference, then a shallow copy is made and the field updated
  before the fetch returns.
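
  The one-bit counting rule above can be sketched in C roughly as
  follows.  This is a simplified model under stated assumptions: the
  object layout, the names `cow_new', `cow_copy_ref' and
  `cow_fetch_stable', and the use of plain `long' fields (rather than
  fields that are themselves cow references) are all hypothetical, and
  reclamation is ignored entirely.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object: a one-bit count plus two plain fields. */
struct cow_obj
{
  unsigned int shared : 1;  /* 0: one cow reference; 1: two */
  long fields[2];
};

static struct cow_obj *
cow_new (long a, long b)
{
  struct cow_obj *obj = malloc (sizeof *obj);
  if (!obj)
    abort ();
  obj->shared = 0;
  obj->fields[0] = a;
  obj->fields[1] = b;
  return obj;
}

/* Copy a cow reference.  The first copy just sets the bit; any
 * further copy yields a fresh shallow copy whose count is 0. */
static struct cow_obj *
cow_copy_ref (struct cow_obj *obj)
{
  if (!obj->shared)
    {
      obj->shared = 1;
      return obj;
    }
  return cow_new (obj->fields[0], obj->fields[1]);
}

/* Fetch a stable, non-cow reference from a field.  If the object
 * may have more than one cow reference, copy it and update the
 * field before returning. */
static struct cow_obj *
cow_fetch_stable (struct cow_obj **field)
{
  if ((*field)->shared)
    *field = cow_new ((*field)->fields[0], (*field)->fields[1]);
  return *field;
}
```

  Note that the bit is conservative: once set it only says the object
  is "possibly" shared, which is why the fetch copies without trying
  to find out whether the other reference still exists.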


***> split-tag heavy_pointer 2                  (..)

  Bibop pointers have the nice property of being small (if implemented
  as direct pointers) but the drawback that reclamation of objects
  weakly held by bibop pointers can not occur until a scan has updated
  all pointers to the object.

  At the opposite extreme are object-table pointers and fat pointers: 
  more or less interchangeable ways to obtain cheap (in time) 
  weak references and even cheaply destroyable objects.
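
  To make the contrast concrete, here is a small C sketch of the
  object-table idea: references are indexes into a table, so
  destroying an object means clearing one slot, and every outstanding
  reference sees the change at once with no scan.  The table size, the
  `obj_ref' type, and the function names are hypothetical and are not
  taken from the Hackerlab sources.

```c
#include <assert.h>
#include <stddef.h>

#define OBJ_TABLE_SIZE 8

struct vm_obj
{
  int payload;
};

/* Every reference goes through this table (assumed layout). */
static struct vm_obj *obj_table[OBJ_TABLE_SIZE];

typedef unsigned int obj_ref;  /* an index into obj_table */

/* Dereference: one extra memory fetch compared with a direct pointer. */
static struct vm_obj *
obj_deref (obj_ref r)
{
  return r < OBJ_TABLE_SIZE ? obj_table[r] : NULL;
}

/* Destroy: clear the slot; no reference needs to be rewritten. */
static void
obj_destroy (obj_ref r)
{
  if (r < OBJ_TABLE_SIZE)
    obj_table[r] = NULL;
}
```

  The price is the extra indirection on every access, which is the
  trade-off against small, direct bibop pointers.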


****> tag vm_object                             (....)
*****: tags-by-mask
*****: decodes-to t_scm_obj

  The heap format of an object is quite complicated and is documented
  in other files.

****> tag cow_vm_object                         (....)
*****: tags-by-mask
*****: decodes-to t_scm_cow_obj

  Lazy-linear vm objects.  Similar to cow bibop objects,
  except that the reference count is larger.

****> tag vm_object_promise                      (....)
*****: tags-by-mask
*****: decodes-to t_scm_promise_obj

  Lazy, memoized, and referencer-memoized vm objects.

****> split-tag vm_page 1                       (....)
*****: tags-by-mask
*****: decodes-to t_scm_page

  A modest pool of very-large-alignment (256 bytes) types.

*****> split-tag vm_direct_page 2                (....)

******> tag vm_page16                           (......)
******> tag vm_page128                          (......)
******> tag vm_page512                          (......)
******> tag vm_page1024                         (......)

*****> split-tag vm_cow_page 2                   (....)

******> tag vm_cow_page16                       (......)
******> tag vm_cow_page128                      (......)
******> tag vm_cow_page512                      (......)
******> tag vm_cow_page1024                     (......)


***> split-tag immediate 1                       (..)


  Characters want to be "unicode+bucky bits" which adds up to _at
  least_ 24 bits and more comfortably to 29.

  Numbers are weird.  Do we want one or two big-as-possible immediate
  integer types?  or do we want to cram in lots of little types
  for tiny immediate rationals and complex numbers?  How much
  of it should make sense in 16-bit environments?

  Atomic values: I don't care much about them.  `nil' is the
  0 non-immediate value.  I wouldn't horribly miss `#t' 
  or seeing it become a non-immediate -- almost nothing low-level
  ever dispatches on #t specifically.  Indeed -- it's easy for an
  allocator to create disjoint, immutable, non-referencing objects
  that can be re-used across all VM instances and have well-known
  fixed addresses per-process.  Use values of that sort
  for atomics: one extra memory fetch for eq? test (to look up
  the well-known address) but otherwise just as good.

  So it's a two way battle: numbers v. characters.  Characters have
  the stricter data-size demands: let's give them half the remaining
  values:


****> tag character                              (...)
*****: decodes-to t_unicode
*****: decodes-exp              (t_unicode)(   ((val >> scm_tag_width_character) & (((scm)1 << scm_char_code_bits) - 1)) \
                                            | (val & ((((scm)1 << scm_bucky_bits) - 1) << (scm_bits - scm_bucky_bits))))

  In a 32-bit or larger environment, we get 29 bits for
  immediate characters -- enough for 21 bits of 
  Unicode plus 8 bucky bits {left,right}x{shift,ctl,meta,alt}.

  Sweet.

  In the expanded (at least 32-bit) form, we keep the bucky bits in
  the high-order 8 bits.
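
  The `decodes-exp' above can be exercised with a small C sketch.  It
  assumes a 32-bit `scm', the 3-bit tag width, 21 code bits and 8
  bucky bits described in the text, and a tag pattern of 011 for
  characters.  That pattern, the constructor `scm_make_character', and
  the name `scm_decode_character' are assumptions made for
  illustration; only the decode expression itself comes from the spec.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t scm;        /* assumed: 32-bit tagged value */
typedef uint32_t t_unicode;  /* assumed: expanded character form */

enum
{
  scm_bits = 32,
  scm_tag_width_character = 3,
  scm_char_code_bits = 21,
  scm_bucky_bits = 8,
  scm_tag_character = 0x3    /* assumed bit pattern 011 */
};

/* Hypothetical constructor: bucky bits on top, code in the middle,
 * tag in the low three bits. */
static scm
scm_make_character (t_unicode code, unsigned int bucky)
{
  scm code_mask = ((scm)1 << scm_char_code_bits) - 1;
  return ((scm)(bucky & 0xffu) << (scm_bits - scm_bucky_bits))
         | ((scm)(code & code_mask) << scm_tag_width_character)
         | (scm)scm_tag_character;
}

/* The decode expression from the spec, wrapped in a function:
 * shift the code down past the tag, leave the bucky bits in the
 * high-order 8 bits. */
static t_unicode
scm_decode_character (scm val)
{
  return (t_unicode)(((val >> scm_tag_width_character)
                      & (((scm)1 << scm_char_code_bits) - 1))
                     | (val & ((((scm)1 << scm_bucky_bits) - 1)
                               << (scm_bits - scm_bucky_bits))));
}
```

  With these widths, 8 + 21 + 3 accounts for all 32 bits, so the
  largest code point (0x10FFFF) and a full set of bucky bits fit side
  by side without overlapping.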


****> tag immediate_signed                       (111)          signed!
*****: decodes-to scm_i
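
  Read as a whole, the tag tree above is "staggered": a decoder looks
  at the low two bits first and inspects further bits only on the
  branches that split again.  A minimal C sketch of that first stage
  follows.  The spec only fixes the patterns (00) for bibop objects
  and (111) for signed immediates; the other patterns used here (01,
  10, and 011) and the `scm_kind' names are assumptions, and the
  heavy-pointer branch is left undecoded.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t scm;  /* assumed representation of a tagged value */

enum scm_kind
{
  scm_kind_bibop_object,
  scm_kind_cow_bibop_object,
  scm_kind_heavy_pointer,
  scm_kind_character,
  scm_kind_immediate_signed
};

static enum scm_kind
scm_classify (scm val)
{
  switch (val & 0x3u)              /* stage 1: the 2-bit split */
    {
    case 0x0:
      return scm_kind_bibop_object;      /* (00), from the spec */
    case 0x1:
      return scm_kind_cow_bibop_object;  /* assumed pattern */
    case 0x2:
      /* assumed pattern; a fuller decoder would look at two more
       * bits here to pick a vm object or page kind */
      return scm_kind_heavy_pointer;
    default:
      /* immediates: stage 2 looks at one more bit */
      return (val & 0x4u) ? scm_kind_immediate_signed  /* (111) */
                          : scm_kind_character;        /* assumed (011) */
    }
}
```

  Frequently used kinds get short tags and leave more bits for the
  payload, while the rarer kinds pay for longer tags.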


*> scm-fast-dispatchers scm

**> dispatcher is_bibop
**> return 0 for bibop_object cow_bibop_object
**> return 1 otherwise

**> dispatcher is_vm_object
**> return 1 for vm_object cow_vm_object vm_object_promise
**> return 0 otherwise

**> dispatcher vm_obj_discipline ?
***> return cow for vm_cow_page16 vm_cow_page128 vm_cow_page512 vm_cow_page1024
***> return cow for cow_vm_object cow_bibop_object
***> return promise for vm_object_promise
***> return immediate for immediate
***> return regular otherwise

        this should generate

                [extern]enum scm_vm_obj_discipline scm_vm_obj_discipline(scm? obj) { switch (scm_tag(..)) ... }

        and

                inline t_uint8 scm_vm_obj_discipline_switch(scm? obj) { return scm_tag (...); }
                #define SCM_VM_OBJ_DISCIPLINE_COW_CASE ...
                #define SCM_VM_OBJ_DISCIPLINE_PROMISE_CASE ...
                #define SCM_VM_OBJ_DISCIPLINE_IMMEDIATE_CASE ...
                #define SCM_VM_OBJ_DISCIPLINE_REGULAR_CASE ...
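
        A hedged sketch of what that generated code could look like in
        C.  The tag numbering, the `enum scm_tag_name' type, the
        function name `scm_vm_obj_discipline_of_tag', and the choice
        to expand "immediate" into its two leaf tags are assumptions;
        the sketch also dispatches on an already-extracted tag rather
        than on an `scm' value, since `scm_tag' is not shown here.

```c
#include <assert.h>

/* Hypothetical numbering; real values would come from the tag spec. */
enum scm_tag_name
{
  scm_tag_bibop_object,
  scm_tag_cow_bibop_object,
  scm_tag_vm_object,
  scm_tag_cow_vm_object,
  scm_tag_vm_object_promise,
  scm_tag_vm_cow_page16,
  scm_tag_vm_cow_page128,
  scm_tag_vm_cow_page512,
  scm_tag_vm_cow_page1024,
  scm_tag_character,
  scm_tag_immediate_signed
};

enum scm_vm_obj_discipline
{
  scm_vm_obj_discipline_cow,
  scm_vm_obj_discipline_promise,
  scm_vm_obj_discipline_immediate,
  scm_vm_obj_discipline_regular
};

/* Case-label macros: the caller supplies the final colon. */
#define SCM_VM_OBJ_DISCIPLINE_COW_CASE \
  case scm_tag_vm_cow_page16: case scm_tag_vm_cow_page128: \
  case scm_tag_vm_cow_page512: case scm_tag_vm_cow_page1024: \
  case scm_tag_cow_vm_object: case scm_tag_cow_bibop_object
#define SCM_VM_OBJ_DISCIPLINE_PROMISE_CASE \
  case scm_tag_vm_object_promise
#define SCM_VM_OBJ_DISCIPLINE_IMMEDIATE_CASE \
  case scm_tag_character: case scm_tag_immediate_signed

static enum scm_vm_obj_discipline
scm_vm_obj_discipline_of_tag (enum scm_tag_name tag)
{
  switch (tag)
    {
    SCM_VM_OBJ_DISCIPLINE_COW_CASE:
      return scm_vm_obj_discipline_cow;
    SCM_VM_OBJ_DISCIPLINE_PROMISE_CASE:
      return scm_vm_obj_discipline_promise;
    SCM_VM_OBJ_DISCIPLINE_IMMEDIATE_CASE:
      return scm_vm_obj_discipline_immediate;
    default:  /* "return regular otherwise" */
      return scm_vm_obj_discipline_regular;
    }
}
```

        The case macros let hand-written code switch on the raw tag
        and still group labels by discipline, without a second
        function call.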


add a way to make pointer types for tags (e.g. vm_object) and for any
binary dispatcher + a conversion function that returns nil on "wrong
type".  (`scm_as_vm_obj(scm val)' => `t_scm_vm_obj').
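
One possible shape for such a conversion function, as a C sketch.  It
assumes a vm_object is marked by a hypothetical 4-bit pattern (0010)
in the low bits of the value, that the pointer is recovered by masking
those bits off, and that a NULL pointer stands in for nil; none of
these details is fixed by the spec above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t scm;  /* assumed representation of a tagged value */

struct scm_vm_obj
{
  int placeholder;      /* the real heap format is documented elsewhere */
};
typedef struct scm_vm_obj *t_scm_vm_obj;

enum
{
  scm_tag_mask_vm_object = 0xf,  /* assumed: 2 + 2 tag bits */
  scm_tag_bits_vm_object = 0x2   /* assumed bit pattern 0010 */
};

/* Binary dispatcher: 1 for vm_object, 0 otherwise. */
static int
scm_is_vm_object (scm val)
{
  return (val & scm_tag_mask_vm_object) == scm_tag_bits_vm_object;
}

/* Conversion: the untagged pointer, or NULL on "wrong type". */
static t_scm_vm_obj
scm_as_vm_obj (scm val)
{
  if (!scm_is_vm_object (val))
    return NULL;
  return (t_scm_vm_obj)(val & ~(scm)scm_tag_mask_vm_object);
}
```

Callers can then test the result once instead of pairing every cast
with a separate predicate call.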




