[Chicken-users] Re: How are exceptions propagated? - details on the race

From: F. Wittenberger
Subject: [Chicken-users] Re: How are exceptions propagated? - details on the race
Date: Wed, 20 Aug 2008 23:59:25 +0200

On Wednesday, 20.08.2008, at 08:29 +0200, felix winkelmann wrote:
> On Tue, Aug 12, 2008 at 4:39 PM, Jörg F. Wittenberger
> <address@hidden> wrote:
> > On Thursday, 07.08.2008, at 23:05 +0200, Jörg F. Wittenberger wrote:
> >> Hi all,
> >>
> >> this is once again a slightly complicated test case.  Again I understand
> >> all calls for a simpler version.  Just I have a hard time to find one.
> >
> > I've been able to track this one down to chicken not handling bad
> > filedescriptors in ##sys#unblock-threads-for-i/o .

Since Elf expressed some doubt about the existence of the race - which I
can understand, since race conditions are usually hard to reproduce
reliably, so there's a good chance that my test case did not exhibit
the problem on his machine - I guess it would help the review if I
comment on some details.

It's actually not that hard to understand the problem - that is, if we
start from the presumption that the runtime system ought to be robust
against some misuse.  After all, we have file-close at our disposal, and
even without it, it would be all too easy to get a bad fd, at least when using


So once there's a thread waiting on an fd that has become bad in the
meantime, what's going on in the scheduler?
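
To make the starting point concrete, here is a minimal C sketch of what
select(2) reports when one of the descriptors in the set has already
been closed.  The helper name errno_from_select_on_closed_fd is made up
for this illustration and is not part of Chicken; the expectation that
select() fails with EBADF comes from the Linux man page quoted further
down.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only: create a pipe, close both
 * ends, then hand the stale read descriptor to select() with a zero
 * timeout.  Returns the errno set by select(), 0 if select() did not
 * fail, or -1 if the setup itself failed. */
int errno_from_select_on_closed_fd(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    int stale = fds[0];
    close(fds[0]);
    close(fds[1]);

    if (stale >= FD_SETSIZE)        /* would not fit into an fd_set */
        return -1;

    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(stale, &readfds);        /* the fd is bad by now */

    struct timeval no_wait = {0, 0};
    int n = select(stale + 1, &readfds, NULL, NULL, &no_wait);
    return (n == -1) ? errno : 0;
}
```

On a Linux box I would expect this to return EBADF, which is the
situation the scheduler code below has to cope with.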

(define (##sys#unblock-threads-for-i/o)
  (dbg "fd-list: " ##sys#fd-list)
  (let* ([to? (pair? ##sys#timeout-list)]
         [rq? (pair? ##sys#ready-queue-head)]
         [n (##sys#fdset-select-timeout ; we use FD_SETSIZE, but really should use max fd
             (or rq? to?)
             (if (and to? (not rq?))    ; no thread was unblocked by timeout, so wait
                 (let* ([tmo1 (caar ##sys#timeout-list)]
                        [now (##sys#fudge 16)])
                   (fxmax 0 (- tmo1 now)) )
                 0) ) ] )               ; otherwise immediate timeout.
    (dbg n " fds ready")

If there's a bad fd, we shall see "-1 fds ready", the return code from

    (cond [(eq? -1 n)
             (set! ##sys#fd-list
                   (let loop ((l ##sys#fd-list))
                     (cond
                      ((null? l) l)
                      ((##sys#handle-bad-fd! (car l))
                       (##sys#fdset-clear (caar l))
                       ;; This is supposed to be a rare case, catch
                       ;; them one by one, not all at once
                       ;; (commented out here).
                       ;; (loop (cdr l))
                       (cdr l))
                      (else (cons (car l) (loop (cdr l)))))))

If this above case is not there, we switch to the primordial thread.

            (else (##sys#force-primordial))) ]

Let's postpone the question of whether the "else" case is handled
gracefully with the change.

(define (##sys#force-primordial)
  (dbg "primordial thread forced due to interrupt")
  ;(display "switching to primordial thread\n" debug-port)
  (##sys#thread-unblock! ##sys#primordial-thread) )

That's actually all it takes.


It all depends on the state of the primordial thread; there is no special
provision in ##sys#force-primordial.  In my case it was waiting in a

(define thread-join!
  (lambda (thread . timeout)
    (##sys#check-structure thread 'thread 'thread-join!)
    (let* ((limit (and (pair? timeout)
                       (##sys#compute-time-limit (##sys#slot timeout 0))))
           (rest (and (pair? timeout) (##sys#slot timeout 1)))
           (tosupplied (and rest (pair? rest)))
           (toval (and tosupplied (##sys#slot rest 0))) )
       (lambda (return)
         (let ([ct ##sys#current-thread])
           (when limit (##sys#thread-block-for-timeout! ct limit))
            ct 1
            (lambda ()

So it's going to continue here:

              (case (##sys#slot thread 3)
                [(dead) (apply return (##sys#slot thread 2))]
                    'condition '(uncaught-exception)
                    (list '(uncaught-exception . reason) (##sys#slot thread 7)) ) ) ) ]

and since the thread is neither dead nor terminated...

                  (if tosupplied
                       (##sys#make-structure 'condition 
'())) ) ) ] ) ) )

the above case applies.  In fact I was lucky: if it had been waiting on
a mutex for a precious resource, it would have entered the critical
section.  Wherever it was, the primordial is just unblocked.


Now let's come back to the question of whether the "else" case is handled
correctly.  Probably not.  I only have a Linux box here right now, but
man 2 select gives:

       EBADF  An invalid file descriptor was given in one of the sets.
              (Perhaps a file descriptor that was already closed, or one
              on which an error has occurred.)

       EINTR  A signal was caught.

       EINVAL nfds is negative or the value contained within timeout is
              invalid.

       ENOMEM unable to allocate memory for internal tables.

I believe none of them should simply activate the primordial.
EBADF is handled now.

For EINTR I have yet to understand how the signals are propagated, but
I'm afraid we need some code here too.

EINVAL would be a grave programming error in the scheduler.  Maybe it's
better to give a message and die here.  Similar for ENOMEM, though this
is not Chicken's fault.
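
The distinction between the four error codes can be sketched in C as
follows.  The enum and the function name are hypothetical and only serve
the illustration.  Note that treating EINTR as "let signal handling run,
then select again" is my assumption; as said above, I have not yet
understood how signals are propagated.

```c
#include <errno.h>

/* Hypothetical classification of a failed select(); none of these
 * names exist in Chicken. */
enum select_failure_action {
    SELECT_CHECK_FDS,    /* EBADF: find the defunct fd and notify its threads */
    SELECT_INTERRUPTED,  /* EINTR: assumption - handle the signal, then retry */
    SELECT_FATAL         /* EINVAL, ENOMEM, anything else: report and die */
};

/* Map the errno left behind by select() to the action suggested above. */
enum select_failure_action action_for_select_errno(int err)
{
    switch (err) {
    case EBADF:
        return SELECT_CHECK_FDS;
    case EINTR:
        return SELECT_INTERRUPTED;
    case EINVAL:   /* scheduler bug */
    case ENOMEM:   /* not the scheduler's fault, but no way to continue */
    default:
        return SELECT_FATAL;
    }
}
```

The point is only that none of the branches ends in unconditionally
waking the primordial thread.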


The same consideration should be applied to ##sys#schedule, where the
variable "eintr" controls ##sys#force-primordial.  On the other hand,
signals are handled somehow, so probably I have overlooked something.


Now to the really interesting question: what should be done, once a
defunct fd is found?  Since ##sys#fd-list contains fd's and threads
only, the simple solution is (here a better version than in my last

(define (##sys#handle-bad-fd! e)
  (dbg "check bad" e)
  (let ((bad ((foreign-lambda*
               bool ((integer fd))
               "struct stat buf;"
               "int i = ( (fstat(fd, &buf) == -1 && errno == EBADF) ? 1 : 0);"
              (car e))))
    (if bad
         (lambda (thread)
             '(exn i/o) ;; better? '(exn i/o net)
             (list '(exn . message) "bad file descriptor"
                   '(exn . arguments) (car e)
                   '(exn . location) thread) )))
         (cdr e)))

thread-signal! them a condition.  In fact it might be better if we could
also close the corresponding ports.  But that easily gets messy: the
fd-list would then have to hold both the fd and the port.  Lots of
changes ahead.  I'd abstain.
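
For reference, the fstat(2) test used inside the foreign-lambda* above
looks like this as a standalone C function.  The name fd_is_bad is
chosen here for illustration only; the test itself is the same
expression as in the patch.

```c
#include <errno.h>
#include <sys/stat.h>

/* Returns 1 if fd does not refer to an open file descriptor, 0
 * otherwise.  Only EBADF counts as "bad"; any other fstat() failure is
 * deliberately not reported as a bad descriptor. */
int fd_is_bad(int fd)
{
    struct stat buf;
    return (fstat(fd, &buf) == -1 && errno == EBADF) ? 1 : 0;
}
```

Called on a descriptor right after close(2) it should return 1, and on
a descriptor that is still open it should return 0.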

> > The attached patch uses fstat(2) to check the fd-list.
> >
> > Unfortunately I have no idea how well this is going to be supported
> > under windows.
> Not very well, but perhaps it can at least be supported under
> UNIXish environments.

Is there no good way on Windows to tell a bad fd from a good one?
Anything will do.  If worst comes to worst, we could repeat the
select(2) with just the fd in question.

best regards

