guile-devel

Re: The relationship between SCM and scm_t_bits.


From: Dirk Herrmann
Subject: Re: The relationship between SCM and scm_t_bits.
Date: Fri, 21 May 2004 21:37:36 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.2) Gecko/20040220

Marius Vollmer wrote:

 Dirk Herrmann <address@hidden> writes:

> I have not yet given it a try, but I found the suggestion to use a
> union quite appaling:

 [ I think you mean "appealing". :-) I used to mix up the adjectives
 "content" and "contempt"... [ And its "appalling" with double-el. I
 hope you don't mind this little public correction. [ I think I can
 get away with it since I make a ton of mistakes myself... ] ] ]

:-) No problem, thanks for the hint. In fact, I did not even know the
word "appalling". Reminds me of a situation when I read the word
"hostile" and thought it came from "host". Think about someone
thanking his host for their "hostility" :-)

> [...]
>
> typedef struct scm_t_cell { union { scm_t_bits word; SCM object; }
> elements[]; } scm_t_cell;

 Yes, but consider how we use the heap: we fetch a word and then must
 decide whether it is a SCM or a scm_t_bits, we don't know this in
 advance in every case. This is not really supported by a union: I
 don't think you can store into one member and then (portably) assume
 anything about the value read from a different member. This is very
 much like storing into one memory location thru one pointer and
 reading that same location through a differently-typed pointer. I
 therefore don't think that using a union is an improvement.

I don't see a problem here: the rule is, if you don't know better in advance,
always access your memory as a scm_t_bits variable. This is exactly how we
determine whether a cell really holds a pair: as long as it is just a cell,
we check the bits. Only once we know it's a pair do we dare to access it
as a pair of SCM values.
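
To make the rule concrete, here is a minimal sketch. It does not use Guile's
real headers: toy_bits, toy_obj, toy_cell and the tag value are hypothetical
stand-ins for scm_t_bits, SCM, scm_t_cell and the real tagging scheme, and
the "low bit clear means pair" convention is an assumption made only for
this example. The cell is stored and inspected as raw words; a word is
converted to the object type only after the check has succeeded.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT Guile's definitions. */
typedef uintptr_t toy_bits;                      /* role of scm_t_bits */
typedef struct toy_obj { toy_bits n; } toy_obj;  /* role of SCM; the struct
                                                    wrapper keeps the compiler
                                                    from mixing it with bits */
typedef struct toy_cell { toy_bits words[2]; } toy_cell;

/* Assumed convention for this sketch: non-pair cells set the low bit
   of word 0, pairs leave it clear. */
#define TOY_NONPAIR_BIT ((toy_bits) 1)

/* Explicit conversions, in the spirit of SCM_PACK / SCM_UNPACK. */
static toy_obj toy_pack (toy_bits b) { toy_obj o = { b }; return o; }
static toy_bits toy_unpack (toy_obj o) { return o.n; }

/* While we only know "it is a cell", we look at the bits. */
static int toy_cell_is_pair (const toy_cell *c)
{
  return (c->words[0] & TOY_NONPAIR_BIT) == 0;
}

/* Only after the check do we hand out word 0 as an object. */
static toy_obj toy_car (const toy_cell *c)
{
  assert (toy_cell_is_pair (c));
  return toy_pack (c->words[0]);
}
```

Calling toy_car on a cell whose first word has the low bit set trips the
assertion, which is the point: the object view is only legal after the bit
check.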

 Thus, I think we are better off by just declaring the heap words to
 be of type SCM and always accessing them as this type. Converting
 between SCM and scm_t_bits will happen with SCM_PACK and SCM_UNPACK.
 That way, we don't need to assume that a SCM and a scm_t_bits are
 stored identically in memory.

Then again, we would have to stay on the safe side and assume that the heap
holds only scm_t_bits variables: if a variable of type SCM and a variable of
type scm_t_bits _really_ looked different, then the heap _must_ hold
elements of type scm_t_bits, since all non-pair objects can store arbitrary
data in their cells. In such a case, accessing the heap via SCM pointers
would be plain wrong.
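
As an illustration of "arbitrary data in their cells", consider a non-pair
cell whose second word is a plain C pointer. This is only a sketch under
assumptions: toy_bits and toy_cell are hypothetical stand-ins for
scm_t_bits and scm_t_cell, and TOY_TAG_FOREIGN is an invented tag value.
The second word here never was an SCM, so only the raw-word view of the
cell makes sense for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT Guile's definitions. */
typedef uintptr_t toy_bits;                      /* role of scm_t_bits */
typedef struct toy_cell { toy_bits words[2]; } toy_cell;

/* Invented tag for this sketch; the set low bit marks "not a pair". */
#define TOY_TAG_FOREIGN ((toy_bits) 0x7f)

/* Store a type tag in word 0 and an arbitrary C pointer in word 1. */
static void toy_init_foreign (toy_cell *c, void *data)
{
  assert (c != NULL);
  c->words[0] = TOY_TAG_FOREIGN;
  c->words[1] = (toy_bits) data;
}

/* Read the payload back through the raw-word view of the cell. */
static void *toy_foreign_data (const toy_cell *c)
{
  assert (c->words[0] == TOY_TAG_FOREIGN);
  return (void *) c->words[1];
}
```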

However, I would not be too restrictive:
I don't think that the distinction between SCM and scm_t_bits should go so
far that SCM and scm_t_bits might be represented in completely different
ways. It was introduced as a means to provide better type checking in
guile. Along the way it brought (almost coincidentally) a nice distinction
between code that operates on higher levels and code that doesn't. The fact
that some code does not yet respect that abstraction barrier (this may be
the case for scm_mark_locations, which you gave as an example) could also
mean that this code needs to be fixed.

Another, more general note:

The whole discussion only came up because there are places in guile or in
client code where people want to access the heap via pointers. Before we
adapt one of our central structures for such uses, we should first consider
whether that usage is correct at all. In the context of generational gc,
I think we should be very careful about such uses. Let's rather try to get
rid of such code, and encourage users to do the same.

Note that ..._WORD_LOC write accesses may be perfectly safe: if the
data being pointed to holds neither scheme objects nor other data that
introduces gc-relevant dependencies, you can safely write to the heap in
this way. The access in numbers.h that I modified in my patch, for
example, is no problem: the heap holds only references to gmp-data, no
references back into the heap.

In contrast, ..._OBJECT_LOC write accesses are always a problem
with respect to generational gc.
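
A minimal sketch of why, assuming a collector that relies on a write
barrier to remember old cells that point at young ones. That is a common
design, but it is my assumption here, not something guile implements
today. All names (toy_cell, toy_set_object, toy_object_loc, the old and
remembered flags) are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT Guile's definitions. */
typedef uintptr_t toy_bits;
typedef struct toy_cell {
  toy_bits words[2];
  int old;         /* 1 if the cell lives in the old generation */
  int remembered;  /* 1 if the cell is in the remembered set */
} toy_cell;

/* Barriered store: the collector learns about old -> young references. */
static void toy_set_object (toy_cell *c, size_t i, toy_cell *target)
{
  assert (i < 2);
  c->words[i] = (toy_bits) target;
  if (c->old && !target->old)
    c->remembered = 1;
}

/* Raw location, in the spirit of an ..._OBJECT_LOC accessor: whoever
   writes through the returned pointer bypasses the barrier above. */
static toy_bits *toy_object_loc (toy_cell *c, size_t i)
{
  assert (i < 2);
  return &c->words[i];
}
```

A store through toy_object_loc leaves the remembered flag untouched, so a
collection of the young generation alone would not see the reference from
the old cell. A ..._WORD_LOC style write of non-object data has no such
issue, because there is nothing for the collector to trace.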

Best regards
Dirk




