guile-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Immediate doubles (up to 2^256) and rationals coming to Guile 3


From: Mark H Weaver
Subject: Re: Immediate doubles (up to 2^256) and rationals coming to Guile 3
Date: Tue, 11 Jun 2019 20:03:26 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux)

Hi Arne,

Arne Babenhauserheide <address@hidden> writes:

> I don’t understand the implications of this, but I realized that I would
> love to have info about the sizes of different datastructures in Guile.

Sure.

> I recently started measuring how much memory is required for a record
> compared to a list, so I could give people concrete guidelines which
> datastructures to use for which algorithm, and for that such low-level
> details would be great to have! (or just to know where to read them up —
> or which keywords to look for)

For core data structures implemented in C, the place to look is in the
corresponding *.h and *.c files in libguile, for example struct.[ch],
strings.[ch], numbers.[ch], etc.  However, pairs receive special
handling, and only part of their implementation is in pairs.[ch].

I will summarize the costs of some important structures below:

In the following, a "word" is the size of a pointer (4 bytes on a 32-bit
system, or 8 bytes on a 64-bit system).

I will _not_ include the SCM word itself in the accounting of the size
of the object represented by that SCM.  That's because for
heap-allocated objects, the SCM is simply a pointer to the object, and
there can be many pointers to the same object.  The cost of SCM words
will instead be included in the cost of the aggregate objects that
contain them.

In this method of accounting, immediate objects cost nothing, because
they are entirely stored within the SCM itself.

Pairs cost 2 words.  Lists cost 2 words per element.  The empty list is
immediate, and therefore costs nothing.

In current 'master', records cost 1 word per field, plus another word,
rounded up to the nearest multiple of 2 words.  In 2.0 and 2.2, they
cost 1 word per field, plus another _2_ words, rounded up

Vectors cost 1 word per element, plus another word, rounded up to the
nearest multiple of 2 words.

Bytevectors cost 4 words plus the data itself, rounded up to the nearest
multiple of 2 words.  As a special case, empty bytevectors cost nothing.

Non-immediate inexact real numbers cost 16 bytes.  Non-immediate complex
numbers cost 24 bytes on 32-bit systems, or 32 bytes on 64-bit systems.

In the common case, strings currently cost 6 words plus the character
data itself, rounded up to the nearest multiple of 2 words.  The
character data costs either 1 byte per character (if all code points are
less than 256, i.e. the string is latin-1), or 4 bytes per character.
(However, I may propose a new string representation for 3.0).

As a special case, the empty string costs nothing.

Finally, I should mention that the ability to share substructure in
lists and pairs make them potentially more memory efficient than
vectors, depending on the use case.  For example, if you need a
structure with 4 fields, and a common operation in your application will
involve copying the structure except for one field which is changed,
then you might be able save memory by using lists, even though a
4-element list requires more memory than a 4-element vector or struct.
If you put the commonly-changed field in the CAR, then you can update
that field by allocating only a single pair (2 words) and sharing the
rest of the structure with the old version.  This is not possible with
vectors or structs, where you must allocate a fresh new object of 5 or 6
words to perform the same operation.

     Regards,
       Mark



reply via email to

[Prev in Thread] Current Thread [Next in Thread]