gcl-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gcl-devel] Two word cons


From: Camm Maguire
Subject: [Gcl-devel] Two word cons
Date: 18 May 2005 11:02:34 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

[ background for gcl-devel readers -- I';ve recently returned from a
visit to UT/Austin where there are quite a few serious GCL users,
mostly as a base lisp behind ACL2.  I'll try to make a full report on
this visit soon.  One of the suggestions raised were the excessive
memory requirements of GCL, esp. its current 3 word cons. ]

Greetings!

Found a little time, and managed to get a two word cons working on
version 2.7.0 (CVS head).  Appears to be passing all tests with all GC
options, e.g. SGC, optimize-maximum-pages, etc.   There are doubtless
a few issues remaining somewhere.  But it looks quite doable.

I'm not sure the approach I took is best, so would like to consult.
I've taken advantage of the fact that even on 32bit machines, all our
structures are at least 2 words long, and can therefore be 8 byte
aligned, giving three mark bits.  The least significant indicates that
the whole word is a traditional type word, the next is the GC mark
bit, and the the next is a bit indicating the object is free.  This
allows basically all pointer indirection to proceed without masking as
pointers to be indirected in a cons only have bits set in the GC --
not that this is important from a performance point of view, but it
would take considerable work to rewrite the compiler to put the masks
in everywhere.  The only explicit masking-prior-to-indirection that
needs doing is in the GC (e.g. mark_cons), which is thankfully
well-localized.  This requires, however, that all odd word structures
on 32bit machines be padded by one word.  This is most wasteful for
the other three word structs (complex, ratio, ...) which are now 4
words long (on 32bit only), but my (completely unsubstantiated) hunch
is that this wastage is dwarfed by the cons savings.

Here is what (room) looks like (p means struct has been padded by one
word, m means trimmed by one word):

(2 words)                    m
   800/1352   30.2%         CONS FIXNUM SHORT-FLOAT CHARACTER RANDOM-STATE 
READTABLE SPICE
(10 words)                   p
   102/301    87.3%         SYMBOL
(14 words)                   p
     1/2      21.9%         PACKAGE
(8 words)                    p        p        p       p                p       
 p      
   106/306    49.3%         ARRAY HASH-TABLE VECTOR BIT-VECTOR STREAM PATHNAME 
CCLOSURE CLOSURE
(4 words)                    p                 p       p        p
   267/321     9.0%         STRUCTURE BIGNUM RATIO LONG-FLOAT COMPLEX CFUN
(6 words)                    p     p     p     p   p     p
    56/276    88.2%         SFUN STRING GFUN VFUN AFUN CFDATA

   612/768                1 contiguous (150 blocks)
       13107                hole
       5242    0.0%         relocatable

      1332 pages for cells
     20293 total pages
     99672 pages available
     11107 pages in heap but not gc'd + pages needed for gc marking
    131072 maximum pages


While some of these might be compressed further, there is definitely
no compression room for the 4 word structs, the most wasteful of which
is likely the structure structs.  In some 4 word cases, the compile of
course can inline the variables and pass them around on the stack.
Either I figure out how to live without a free object bit, or we
conclude that the cons load greatly dominates in all real world
situations, and that 64bit is the medium term future, where there is
only savings and no waste.

There is also a minor consequence for SGC.  SGC basically selects a
subset of pages to work with and marks the rest read-only.  All old
objects on the working set at the time sgc is turned on can never be
freed, as the mark might have to proceed via a read-only page, which
the algorithm skips for efficiency.  These SGC_NORMAL vs. SGC_RECENT
objects were designated by yet another bit in the type word.  16 byte
alignment is definitely too wasteful IMHO, so there is no room for
this on cons (only), in which case we only claim totally free pages for
sgc, and use the sgc page flag to effectively determine SGC_RECENT
cons from SGC_NORMAL.  

Secondly, and perhaps more importantly, we discussed how ld.so puts
the shared libraries at 0x40000000 on Linux for example, limiting or
corrupting a big heap depending on the robustness of GCL's
algorithms.  The way around this appears to be via a linker script,
using a PT_LOAD entry in the program header to make a section taking
no ram or disk space but occupying and effectively reserving the
desired area.  I should have more information on this soon.

Take care, 


"Warren A. Hunt Jr." <address@hidden> writes:

> Hi Camm,
> 
> Here are some of the things we discussed.
> 
> Cheers,
> 
> Warren
> ++++++
> 
>                          Items Discussed
> 
>  1.   Hash CONS (HONS).
>    a. Weak Hash
>    b. Randomize FIXNUM hashing
>  2.   Clear understanding of (ROOM T)
>  3.   Unbox FIXNUM or CONS for GC
>  4.   Threads
>  5.   Complier Emit Function Signatures and Boxing
>  6.   Upon function redefintion, flush all function properties
>  7.   Mutual Recursion
>  8.   Bigger FIXNUM, si::allocate-bigger-fixnum ?
>  9.   Bigger PageSize
> 10.   Fold xgcl into the standard build
> 11.   Replace #n# tables with dynamic tables
> 12.   Eliminate the compile-time maximum pages
> 13.   Place shared libraries elsewhere in memory
> 14.   Fix gethash and sethash code
> 
> 
> enum httest {                 /*  hash table key test function  */
>       htt_eq,                 /*  eq  */
>       htt_eql,                /*  eql  */
>       htt_equal               /*  equal  */
> };
> 
> struct htent {                        /*  hash table entry  */
>       object  hte_key;        /*  key  */
>       object  hte_value;      /*  value  */
> };
> 
> struct hashtable {            /*  hash table header  */
>               FIRSTWORD;
>       struct htent
>               *ht_self;       /*  pointer to the hash table  */
>       object  ht_rhsize;      /*  rehash size  */
>       object  ht_rhthresh;    /*  rehash threshold  */
>       // WAH,Jr. -- At creation and extension, recomput this next number.
>       int     ht_int_thres    /*  Interger number of maxium entries */
>       int     ht_nent;        /*  number of entries  */
>       int     ht_size;        /*  hash table size  */
>       short   ht_test;        /*  key test function  */
>                               /*  of enum httest  */
> };
> 
> 
> struct htent *
> gethash(key, hashtable)
> object key;
> object hashtable;
> {
>       enum httest htest;
>       int hsize;
>       struct htent *e;
>       object hkey;
>       int i=0, j = -1, k; /* k added by chou */
>       bool b=FALSE;
> 
>       htest = (enum httest)hashtable->ht.ht_test;
>       hsize = hashtable->ht.ht_size;
> 
>       // WAH,Jr. --  Make "/ 4" into ">> WORDSIZE_IN_BYTES"
>       if (htest == htt_eq)
>               i = (long)key / 4;
>       // WAH,Jr. --  Pull out FIXNUM and CHARACTER tests
>       else if (htest == htt_eql)
>               i = hash_eql(key);
>       else if (htest == htt_equal)
>               i = ihash_equal(key,0);
>       // WAH,Jr. --  Fix constant below.
>       i &= 0x7fffffff;
>       // WAH,Jr. --  Restructure with two simple loops, don't use MOD, don't 
> need k
>       for (i %= hsize, k = 0; k < hsize;  i = (i + 1) % hsize, k++) { /* k 
> added by chou */
>               e = &hashtable->ht.ht_self[i];
>               hkey = e->hte_key;
>               if (hkey == OBJNULL) {
>                       if (e->hte_value == OBJNULL)
>                               if (j < 0)
>                                       return(e);
>                               else
>                                       return(&hashtable->ht.ht_self[j]);
>                       else
>                               if (j < 0)
>                                       j = i;
>                               else if (j==i)
>                                 /* this was never returning --wfs
>                                    but looping around with j=0 */
>                                 return(e) 
>                                       ;
>                       continue;
>               }
>       // WAH,Jr. --  Eliminate these tests each time around the loop.
>               if (htest == htt_eq)
>                       b = key == hkey;
>               else if (htest == htt_eql)
>                       b = eql(key, hkey);
>               else if (htest == htt_equal)
>                       b = equal(key, hkey);
>               if (b)
>                       return(&hashtable->ht.ht_self[i]);
>       }
>       return(&hashtable->ht.ht_self[j]);      /* added by chou */
> }
> 
> 
> static void
> extend_hashtable(object);
> 
> void
> sethash(key, hashtable, value)
> object key, hashtable, value;
> {
>       int i;
>       bool over=FALSE;
>       struct htent *e;
>       
>       i = hashtable->ht.ht_nent + 1;
>       // WAH,Jr.  Test for excess size by simple integer comparison.
>       if (type_of(hashtable->ht.ht_rhthresh) == t_fixnum)
>               over = i >= fix(hashtable->ht.ht_rhthresh);
>       else if (type_of(hashtable->ht.ht_rhthresh) == t_shortfloat)
>               over =
>               i >= hashtable->ht.ht_size * sf(hashtable->ht.ht_rhthresh);
>       else if (type_of(hashtable->ht.ht_rhthresh) == t_longfloat)
>               over =
>               i >= hashtable->ht.ht_size * lf(hashtable->ht.ht_rhthresh);
>       if (over)
>               extend_hashtable(hashtable);
>       e = gethash(key, hashtable);
>       if (e->hte_key == OBJNULL)
>               hashtable->ht.ht_nent++;
>       e->hte_key = key;
>       e->hte_value = value;
> }
>       
> static void
> extend_hashtable(hashtable)
> object hashtable;
> {
>       object old;
>       int new_size=0, i;
> 
>       if (type_of(hashtable->ht.ht_rhsize) == t_fixnum)
>               new_size = 
>               hashtable->ht.ht_size + fix(hashtable->ht.ht_rhsize);
>       else if (type_of(hashtable->ht.ht_rhsize) == t_shortfloat)
>               new_size = 
>               hashtable->ht.ht_size * sf(hashtable->ht.ht_rhsize);
>       else if (type_of(hashtable->ht.ht_rhsize) == t_longfloat)
>               new_size = 
>               hashtable->ht.ht_size * lf(hashtable->ht.ht_rhsize);
>       {BEGIN_NO_INTERRUPT;    
>       old = alloc_object(t_hashtable);
>       old->ht = hashtable->ht;
>       vs_push(old);
>       hashtable->ht.ht_self = NULL;
>       hashtable->ht.ht_size = new_size;
>       if (type_of(hashtable->ht.ht_rhthresh) == t_fixnum)
>               hashtable->ht.ht_rhthresh =
>               make_fixnum(fix(hashtable->ht.ht_rhthresh) +
>                           (new_size - old->ht.ht_size));
>       hashtable->ht.ht_self =
>       (struct htent *)alloc_relblock(new_size * sizeof(struct htent));
>       for (i = 0;  i < new_size;  i++) {
>               hashtable->ht.ht_self[i].hte_key = OBJNULL;
>               hashtable->ht.ht_self[i].hte_value = OBJNULL;
>       }
>       for (i = 0;  i < old->ht.ht_size;  i++) {
>               if (old->ht.ht_self[i].hte_key != OBJNULL)
>                       sethash(old->ht.ht_self[i].hte_key,
>                               hashtable,
>                               old->ht.ht_self[i].hte_value);
>       }
>       hashtable->ht.ht_nent = old->ht.ht_nent;
>       vs_popp;
>       END_NO_INTERRUPT;}
> }
> 
> 
> 

-- 
Camm Maguire                                            address@hidden
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah




reply via email to

[Prev in Thread] Current Thread [Next in Thread]