Re: [GNUnet-developers] threads & deadlocks
From: Christian Grothoff
Subject: Re: [GNUnet-developers] threads & deadlocks
Date: Thu, 23 Dec 2004 06:24:10 +0100
User-agent: KMail/1.5.4
On Wednesday, 22 December 2004 23:10, Jussi Eloranta wrote:
> Hello,
>
> I have been experiencing deadlocks with gnunet-gtk on Mac OS X.
> This happens during searches where one gets many replies.
>
> After some heuristic debugging, I found that the following change in
> util/semaphore.c resolved the problem (look at the comment line):
>
> int semaphore_up_(Semaphore * s,
>                   const char * filename,
>                   const int linenumber) {
>   int value_after_op;
>   pthread_cond_t * cond;
>
>   GNUNET_ASSERT_FL(s != NULL, filename, linenumber);
>   cond = s->cond;
> #if DEBUG_SEMUPDOWN
>   LOG(LOG_DEBUG,
>       "semaphore_up %p enter at %s:%d\n",
>       s,
>       filename,
>       linenumber);
> #endif
>   MUTEX_LOCK(&(s->mutex));
>   (s->v)++;
>   value_after_op = s->v;
>   /* The lines below were originally in opposite order */
>   GNUNET_ASSERT(0 == pthread_cond_signal(cond));
>   MUTEX_UNLOCK(&(s->mutex));
>
> #if DEBUG_SEMUPDOWN
>   LOG(LOG_DEBUG,
>       "semaphore_up %p exit at %s:%d\n",
>       s,
>       filename,
>       linenumber);
> #endif
>   return value_after_op;
> }
>
> After this change it works (look at the comment line in the code). I
> don't immediately see why this would change anything, but in practice
> it does... Or maybe it enforces that signal and wait calls will "match"?
Yes, this is a bug and you fixed it. With the old ordering (unlock first,
then signal), a very particular interleaving in the scheduler could cause
trouble: the pthread_cond_signal can have no effect because the other thread
is not yet waiting, and that thread then starts to wait for a signal that has
already been sent. I wonder why this one was never caught before. Good catch!
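
For illustration, here is a minimal sketch of the pattern the fix follows: the
counter is updated and the condition variable is signalled while the mutex is
still held, and the waiting side re-checks the counter in a loop under the same
mutex. This is not the code from util/semaphore.c; DemoSemaphore, check,
demo_sem_init, demo_sem_up, demo_sem_down and demo_waiter are hypothetical
names made up for this example, and only standard pthread calls are used.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a counting semaphore built on pthreads.
   The layout is an assumption for this sketch, not GNUnet's struct. */
typedef struct {
  pthread_mutex_t mutex;
  pthread_cond_t cond;
  int v;
} DemoSemaphore;

/* Abort on a non-zero pthread return code; the call itself is always
   evaluated, even when assertions are compiled out. */
static void check(int rc) {
  assert(rc == 0);
  (void) rc;
}

static void demo_sem_init(DemoSemaphore *s, int initial) {
  check(pthread_mutex_init(&s->mutex, NULL));
  check(pthread_cond_init(&s->cond, NULL));
  s->v = initial;
}

/* Increment the counter and signal while still holding the mutex,
   i.e. the ordering used in the fixed semaphore_up_. */
static int demo_sem_up(DemoSemaphore *s) {
  int value_after_op;

  check(pthread_mutex_lock(&s->mutex));
  s->v++;
  value_after_op = s->v;
  check(pthread_cond_signal(&s->cond));
  check(pthread_mutex_unlock(&s->mutex));
  return value_after_op;
}

/* Block until the counter is positive, then decrement it. The loop
   re-checks the counter under the mutex, so a signal sent before this
   thread starts waiting is not lost: the counter is already > 0. */
static int demo_sem_down(DemoSemaphore *s) {
  int value_after_op;

  check(pthread_mutex_lock(&s->mutex));
  while (s->v <= 0)
    check(pthread_cond_wait(&s->cond, &s->mutex));
  s->v--;
  value_after_op = s->v;
  check(pthread_mutex_unlock(&s->mutex));
  return value_after_op;
}

/* Thread entry point for trying the sketch: waits once on the semaphore. */
static void *demo_waiter(void *arg) {
  demo_sem_down((DemoSemaphore *) arg);
  return NULL;
}
```

To try it, initialise a DemoSemaphore with 0, start a thread on demo_waiter
with pthread_create, call demo_sem_up from the main thread and join the
waiter; the join should return regardless of which thread runs first.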
Thanks!
Christian