guile-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] Efficient Gensym Hack (v2)


From: Mark H Weaver
Subject: Re: [PATCH] Efficient Gensym Hack (v2)
Date: Wed, 07 Mar 2012 11:43:23 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.92 (gnu/linux)

Andy Wingo <address@hidden> writes:

> On Tue 06 Mar 2012 10:55, Mark H Weaver <address@hidden> writes:
>
>> +static void
>> +set_stringbuf_shared (SCM buf)
>> +{
>> +  /* Don't modify BUF if it's already marked as shared since it
>> +     might be a read-only, statically allocated stringbuf.  */
>> +  if (!STRINGBUF_SHARED (buf))
>> +    {
>> +      scm_i_pthread_mutex_lock (&stringbuf_write_mutex);
>> +      SCM_SET_CELL_WORD_0 (buf, SCM_CELL_WORD_0 (buf) | STRINGBUF_F_SHARED);
>> +      scm_i_pthread_mutex_unlock (&stringbuf_write_mutex);
>> +    }
>> +}
>> +
>
> Does this work, with C's memory model?  It seems that if thread A sets
> the shared flag on stringbuf S, a concurrent call to
> set_stringbuf_shared(S) from thread B has no guarantee as to what value
> to see, as the initial flag check is outside the lock (a synchronization
> point).

That's true.  However, in that case, the shared flag is already being
set by another thread, so it doesn't matter, because the only
requirement of this function is to make sure the flag gets set.  We
could do it unconditionally if not for the existence of read-only
statically allocated stringbufs.  The flag is _never_ cleared.  Also
note that 'set_stringbuf_shared' does not return the value of the flag,
so nobody else sees this undefined value.

In fact, there are only two places where the flag is accessed:

1. %symbol-dump, which only prints it out for debugging purposes.
2. scm_i_string_start_writing, which accesses it while the
   'stringbuf_write_mutex' is locked.

> Perhaps it doesn't matter.  This is the only place that the SHARED flag
> is accessed outside of a mutex, yes?

The only other place is '%symbol-dump', which merely prints the flag's
value for debugging purposes.

> Adding Ken for thoughts on threadsafety.  If you are convinced it's
> right, please add a comment to the source code.

Given the above considerations, I'm quite confident that this is safe,
and will add a comment.  Of course Ken's thoughts are always welcome :)

>> +  return scm_double_cell (scm_tc7_symbol | SCM_I_F_SYMBOL_LAZY_GENSYM,
>> +                          SCM_UNPACK (prefix_stringbuf), (scm_t_bits) 0,
>> +                          SCM_UNPACK (scm_cons (SCM_BOOL_F, SCM_EOL)));
>
> Would be nice to avoid the plist cons if possible, but that's another
> issue.

Yes, that definitely irked me as well.  If we could get rid of the "hash
value" in the third slot, we could put the function slot there instead
and avoid the cons.

This 'equal?' hash value (the same as the hash of the name string) is
very unfortunate, because it means that a lazy gensym must be forced
whenever it's added to an 'equal?' hash table.  This shouldn't be
necessary.  Since 'equal?' compares symbols the same way that 'eq?'
does, the 'equal?' hash value should be the same as the 'eq?' hash
value.

I guess this is part of the hack to implement the weird symbol table
keyed on the symbols themselves.  But as you say, this is a separate
issue that deserves its own thread.

>> +          /* Attempt to intern the symbol */
>> +          handle = scm_hash_fn_create_handle_x (symbols, sym, SCM_UNDEFINED,
>> +                                                symbol_lookup_hash_fn,
>> +                                                symbol_lookup_assoc_fn,
>> +                                                NULL);
>> +        } while (SCM_UNLIKELY (!scm_is_eq (sym, SCM_CAR (handle))));
>
> Note that this is racy: this is a weak key hash table, so it's not safe
> to access the car of the handle outside the alloc lock.

It's not an issue here, because the only thing I'm doing with the 'car'
is checking that it's 'eq?' to the lazy gensym that's being forced.  If
it _is_ 'eq?' to our gensym, then there's no possibility that it will be
nulled out, because we hold a reference to it.  If it's _not_ 'eq?' to
our gensym, then we don't care whether it's null or not; in either case
we have failed to intern this name and we try again with the next
counter value.

However, note that 'intern_symbol', 'lookup_interned_symbol', and
'lookup_interned_latin1_symbol' all access the 'car' of a handle of the
symbol table outside of the alloc lock, and those might indeed be
problems.  Those issues are not from this patch though.  The relevant
code was last changed by you in 2011.

>> +      /* We must not clear the lazy gensym flag until our symbol has
>> +         been interned.  The lock does not save us here, because another
>> +         thread could retrieve our gensym's name or hash outside of any
>> +         lock. */
>> +      SCM_SET_CELL_WORD_0 (sym, (SCM_CELL_WORD_0 (sym)
>> +                                 & ~SCM_I_F_SYMBOL_LAZY_GENSYM));
>> +    }
>> +  scm_i_pthread_mutex_unlock (&symbols_lock);
>> +}
>
> Doing all this work within a mutex is prone to deadlock, if allocation
> causes a finalizer to run that forces another lazy symbol.

Ugh.  Well, we already have a problem then, because 'intern_symbol' also
does allocation while holding this lock, via 'symbol_lookup_assoc_fn',
which calls 'scm_symbol_to_string', which must allocate the string
object (symbols hold only stringbufs).  Therefore, with Guile 2.0.5, if
anyone calls 'scm_from_*_symbol' within a finalizer, there is already
the possibility for deadlock.

Have I mentioned lately how much I hate locks? :/

> This is not an issue on `master'.

Excellent!

> If we can get around this potential problem, then we should indeed apply
> this to stable-2.0 as well.

The good news is that it should be possible to fix this (pre-existing)
class of problems for 'symbols_lock' in stable-2.0 by changing
'symbol_lookup_assoc_fn' to avoid allocation.

It should also be possible to avoid allocation within the lock in
'scm_i_force_lazy_gensym'.

I'll work on it.

>>  #define scm_is_symbol(x)            (!SCM_IMP (x) \
>>                                       && (SCM_TYP7 (x) == scm_tc7_symbol))
>> -#define scm_i_symbol_hash(x)        ((unsigned long) SCM_CELL_WORD_2 (x))
>>  #define scm_i_symbol_is_interned(x) \
>>    (!(SCM_CELL_WORD_0 (x) & SCM_I_F_SYMBOL_UNINTERNED))
>> +#define scm_i_symbol_is_lazy_gensym(x) \
>> +  (SCM_CELL_WORD_0 (x) & SCM_I_F_SYMBOL_LAZY_GENSYM)
>>  
>>  #define SCM_I_F_SYMBOL_UNINTERNED   0x100
>> +#define SCM_I_F_SYMBOL_LAZY_GENSYM  0x200
>
> Can we make this change in stable-2.0?  It is an ABI change of sorts.

The only potential problem is if someone is already using
'scm_i_symbol_hash' in external code.  They shouldn't be doing that
because it has the 'scm_i_' prefix and is undocumented.

However, even if they are using it, in the worst case they'll get an
incorrect hash value for lazy gensyms.  The hash value they will get for
a lazy gensym will be '0', unless the gensym is currently being forced
in another thread.

> If you are convinced that we can, please surround the scm_i_* with
> #ifdef BUILDING_LIBGUILE.

Okay.

> OK, I think that's it.  I'm very much looking forward to this going in:
> on master, meta/guile examples/web/hello.scm spends 13% of its
> instructions in gensym and resulting GC foo.

Sounds good.  Thanks for the review! :)

    Mark



reply via email to

[Prev in Thread] Current Thread [Next in Thread]