guile-devel

Re: The relationship between SCM and scm_t_bits.


From: Marius Vollmer
Subject: Re: The relationship between SCM and scm_t_bits.
Date: Sat, 21 Aug 2004 18:16:45 +0200
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.3 (gnu/linux)

Dirk Herrmann <address@hidden> writes:

>>  The reason is that there exists code that does essentially this:
>>
>>  scm_t_bits heap_field;
>>
>>  SCM value = whatever ();
>>  SCM *ptr = (SCM *)&heap_field;
>>  *ptr = value;
>
> I assume that you mean that heap_field is actually an element of the heap.

Yes.

> We already discussed that I suggest discouraging this style of
> coding, since it violates a potential write barrier and will lead
> to problems if we ever switch to generational garbage
> collection.

Yes, that is the bigger issue.  What we are discussing here are quite
minor points, I'd say.  There might be a time when we do want to have
a write-barrier and then we can revisit whether to provide the *LOC
accessors or not.  Right now, removing them is not necessary.  We
should only remove them when there is an immediate benefit.
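
To make the write-barrier concern concrete, here is a minimal sketch in C.
Everything in it is hypothetical: toy_cell, toy_cell_set and toy_cell_loc
are illustration-only names, not Guile's API, and the 'dirty' flag merely
stands in for whatever bookkeeping a generational collector would need.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not Guile's definitions. */
typedef uintptr_t toy_word;

typedef struct {
  toy_word words[2];
  bool dirty;   /* stand-in for a card table or remembered-set entry */
} toy_cell;

/* Barrier-aware store: the (imaginary) collector is told about the write. */
static void
toy_cell_set (toy_cell *c, int n, toy_word v)
{
  c->words[n] = v;
  c->dirty = true;
}

/* LOC-style accessor: hands out the raw address of a heap word.  Any
   store through the returned pointer bypasses the bookkeeping above. */
static toy_word *
toy_cell_loc (toy_cell *c, int n)
{
  return &c->words[n];
}
```

A store such as `*toy_cell_loc (&cell, 1) = v;` updates the word but leaves
`dirty` untouched, which is the kind of unrecorded write a barrier-based
collector could not tolerate.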

> In particular, I have a problem with the following lines of code.
>
>   In gc.h:
>
>     #define SCM_GC_CELL_WORD(x, n)   (SCM_UNPACK (SCM_GC_CELL_OBJECT ((x), (n))))
>
>     This expression has a SCM value as an intermediate result, which
> is definitely unclean, since the SCM value might (in contrast to the
> definition of SCM) not represent a valid scheme object.

Yes, that troubles me a bit as well.  But I get over it by realizing that
we only really have one type, the type 'machine word', and SCM and
scm_t_bits are essentially this same type, used to provide markup for
different uses of the basic type 'machine word'.  (In my view, it is
essential that Scheme values are represented as a machine word.  Using
some other type that doesn't fit into a machine register, for example,
would not be good enough.)

As far as the ordinary user is concerned, we only have one type to
represent a Scheme value, SCM.  We don't say what a SCM is (whether it
is a pointer, an integer, a struct, etc), only that you can assign it
with '='.

The internals of Guile, and unfortunately also a user that works with
smobs, need to know more about SCM: that it really is a machine word
and can be treated as an integral type.  To treat it as such, a SCM is
reinterpreted as a scm_t_bits.

I think we need to make the following guarantees:

  - a SCM and a scm_t_bits have the same size in the sense that they
    can store exactly the same things.  We always have

       SCM scm;
       scm_is_eq (SCM_PACK (SCM_UNPACK (scm)), scm)

    and

       scm_t_bits bits;
       SCM_UNPACK (SCM_PACK (bits)) == bits                     (*)

  - a size_t can be cast to scm_t_bits and back without losing
    information.  (This is for storing integers in heap words.)

  - a void* can be cast to scm_t_bits and back without losing
    information.  (This is for storing pointers in heap words.)

  - a scm_t_bits can be cast to void* and back without losing
    information.  (This is for storing SCMs in void* locations
    provided by external code.)

This is not as elegant and clean as dropping the guarantee (*), but it
allows heap words to be declared as type SCM which is desirable since
local variables and function arguments are also declared to be of type
SCM.
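
The guarantees above can be sketched as compile-time and run-time checks.
This is only an illustration under stated assumptions: toy_scm, toy_bits,
TOY_PACK and TOY_UNPACK are hypothetical stand-ins for SCM, scm_t_bits,
SCM_PACK and SCM_UNPACK, not Guile's actual definitions, and the
integer-to-pointer round trips rely on implementation-defined behaviour
that holds on common flat-address-space platforms.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not Guile's definitions. */
typedef struct toy_scm_opaque *toy_scm;   /* plays the role of SCM */
typedef uintptr_t toy_bits;               /* plays the role of scm_t_bits */

#define TOY_PACK(b)    ((toy_scm) (b))
#define TOY_UNPACK(s)  ((toy_bits) (s))

/* Size relationships that the guarantees depend on (C11). */
_Static_assert (sizeof (toy_scm) == sizeof (toy_bits),
                "toy_scm and toy_bits must have the same size");
_Static_assert (sizeof (size_t) <= sizeof (toy_bits),
                "size_t must fit into toy_bits");
_Static_assert (sizeof (void *) == sizeof (toy_bits),
                "void* and toy_bits must have the same size");

/* Guarantee (*): unpacking a packed word yields the same word. */
static int
toy_bits_roundtrip (toy_bits bits)
{
  return TOY_UNPACK (TOY_PACK (bits)) == bits;
}

/* A size_t survives a trip through toy_bits. */
static int
toy_size_roundtrip (size_t n)
{
  return (size_t) (toy_bits) n == n;
}

/* A void* survives a trip through toy_bits. */
static int
toy_pointer_roundtrip (void *p)
{
  return (void *) (toy_bits) p == p;
}

/* A toy_bits survives a trip through void*. */
static int
toy_bits_via_pointer_roundtrip (toy_bits bits)
{
  return (toy_bits) (void *) bits == bits;
}
```

Each function returns nonzero when the corresponding round trip preserves
the value; the static assertions fail at compile time on a platform where
the size assumptions do not hold.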


The reason that SCM is distinct from scm_t_bits at all is to get some
help from the C compiler in type checking.
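
As an illustration of that help, consider the sketch below.  It assumes
the Scheme value type is declared as a pointer to an incomplete struct;
toy_scm, toy_bits and the helper are hypothetical names, and the low-bit
operation is an arbitrary example, not Guile's tagging scheme.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not Guile's definitions. */
typedef struct toy_scm_opaque *toy_scm;   /* plays the role of SCM */
typedef uintptr_t toy_bits;               /* plays the role of scm_t_bits */

#define TOY_PACK(b)    ((toy_scm) (b))
#define TOY_UNPACK(s)  ((toy_bits) (s))

/* Low-level code has to opt in to integer operations by unpacking. */
static toy_scm
toy_set_low_bit (toy_scm x)
{
  /* return x | 1;
     would be rejected by the compiler: '|' does not accept a pointer
     operand.  The explicit unpack/pack pair marks the low-level step. */
  return TOY_PACK (TOY_UNPACK (x) | 1);
}
```

Ordinary code that only passes toy_scm values around and assigns them with
'=' never needs the unpack step, while accidental integer arithmetic on a
Scheme value is caught at compile time.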

>   In numbers.h:
>
>     #define SCM_I_BIG_MPZ(x) (*((mpz_t *) (SCM_CELL_OBJECT_LOC((x),1))))
>
>     This expression has a SCM* as an intermediate result, although in
> this case we _know_ that we are actually pointing to a scm_t_bits
> value.

No, we point at an array of three SCMs... ;) This is actually a
separate issue: the memory used by SCM_I_BIG_MPZ is always used as
only one type, as an mpz_t.

The reason that I changed all heap words to be declared as SCM was
that previously some heap words would be written as a SCM and then
read as a scm_t_bits.  This is also the reason why I think that a
union does not help at all: with such a union, we would write into one
member and then read from the other.  This is just as unclean as
casting a pointer to scm_t_bits to a pointer to SCM.
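
A small sketch of the two alternatives may help here.  The names
(toy_scm, toy_bits, toy_heap_word and both helpers) are hypothetical and
not taken from Guile, and the union read assumes that a pointer and a
uintptr_t share the same representation, as they do on common platforms.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not Guile's definitions. */
typedef struct toy_scm_opaque *toy_scm;   /* plays the role of SCM */
typedef uintptr_t toy_bits;               /* plays the role of scm_t_bits */

/* The union variant of a heap word. */
typedef union {
  toy_scm  object;
  toy_bits word;
} toy_heap_word;

/* With the union, a value stored as an object and later inspected as
   bits is still written through one member and read through the other. */
static toy_bits
toy_union_store_then_read (toy_scm value)
{
  toy_heap_word w;
  w.object = value;
  return w.word;
}

/* With every heap word declared as toy_scm, the word is written and
   read as the same type; the conversion to bits is an explicit cast
   on the value, not a reinterpretation of the storage. */
static toy_bits
toy_scm_store_then_unpack (toy_scm value)
{
  toy_scm w = value;
  return (toy_bits) w;
}
```

Under the stated assumption both helpers return the same bits, so the
union changes where the reinterpretation happens but does not remove it.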

> Thus, I would just go ahead and apply it within the next couple of
> days.

Please do not apply it.  We are not completely clean, true, but I
doubt that we can attain perfect cleanliness anyway.  Using a union
would just complicate the issue without giving any benefit (that I
can see).


Things started out simple, and got more complicated with the
introduction of scm_t_bits as an alias of SCM.  Let's not continue
this trend by pretending that SCM and scm_t_bits are actually separate
types.  They are not; they are essentially the same type, but one
allows certain low-level operations that the other prevents.

-- 
GPG: D5D4E405 - 2F9B BCCC 8527 692A 04E3  331E FAF8 226A D5D4 E405



