guile-devel

threading issues in 1.8?


From: Ken Raeburn
Subject: threading issues in 1.8?
Date: Wed, 1 Mar 2006 00:17:57 -0500

Hi. I've been starting to look at the 1.8 code, and I'm concerned that there may still be thread safety issues.

1) See my old mail about scm_leave_guile -- in fact, it's now in a comment in threads.c. :-) We're still calling setjmp, returning, and then calling the user's function. I think setjmp and the user's function need to be called from the same function, or at least the function calling setjmp should not return before the user's function is called. (And possibly any args and local variables should be "volatile".)
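To make point 1 concrete, here is a minimal sketch of the shape being suggested: setjmp and the user's function are called from the same frame, and the locals that must survive are "volatile". The names call_with_saved_registers, user_fn, and double_it are hypothetical, invented for illustration; this is not Guile's API and not the code in threads.c.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical callback type, for illustration only. */
typedef void *(*user_fn)(void *);

/* setjmp and the user's function share one stack frame, so the
   jmp_buf stays live for the whole duration of the call. */
void *call_with_saved_registers(user_fn fn, void *data)
{
    jmp_buf regs;                  /* lives in this frame until we return */
    user_fn volatile f = fn;       /* volatile: not cached only in registers */
    void *volatile arg = data;
    void *volatile result = NULL;

    assert(fn != NULL);
    if (setjmp(regs) == 0) {
        /* Direct return from setjmp: call the user's function
           before this frame is torn down. */
        result = f(arg);
    }
    return result;
}

/* Tiny example callback: doubles the int it is handed. */
static void *double_it(void *p)
{
    int *n = p;
    *n *= 2;
    return n;
}
```

The point of the shape is only that the frame holding the jmp_buf never returns before the user's function has run; whether that is sufficient for the collector is a separate question.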

2) Access to newly created objects: I've been trying to figure out... if one thread allocates a new object (cons cell, say, or vector, or smob), and does a set-car! or something to make it accessible to another thread running on another processor, and the system implements a weak memory order model (so memory writes done without synchronization primitives can be seen as happening out of order), what's to cause the contents of the new pair to get written before the second thread can examine them?

Sure, we've more or less adopted the rule that without synchronization, one processor may see the "before" or "after" results of another thread's modifications, or two threads modifying an object may cause it to end up with some unpredictable but valid value. But does that work in the case of a free cell being turned into a cons cell or smob, or random heap storage being turned into a vector?
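One way to picture the ordering being asked about is the sketch below. It uses C11 atomics, which postdate this message, and it is an assumption about what a fix could look like, not a description of what Guile does. The names struct pair, shared_slot, publish_pair, and try_read_pair are hypothetical stand-ins for a cons cell and the location another thread can reach.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cons cell. */
struct pair {
    void *car;
    void *cdr;
};

/* The location through which another thread can reach the new object.
   Static storage, so it starts out as a null pointer. */
static _Atomic(struct pair *) shared_slot;

/* Writer: fill in the object first, then publish the pointer with
   release ordering so the field writes cannot be seen after it. */
int publish_pair(void *car, void *cdr)
{
    struct pair *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->car = car;
    p->cdr = cdr;
    atomic_store_explicit(&shared_slot, p, memory_order_release);
    return 0;
}

/* Reader: the acquire load pairs with the release store above; if it
   returns non-NULL, the car and cdr written before publication are
   visible to this thread. */
struct pair *try_read_pair(void)
{
    return atomic_load_explicit(&shared_slot, memory_order_acquire);
}
```

Without the release/acquire pairing (say, with a plain pointer store), nothing in the language or a weakly ordered machine requires the fields to become visible before the pointer does, which is exactly the free-cell-turned-cons-cell case.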

I've been doing a little reading. Not enough to have any answers yet, but enough to see that Java ran into similar issues. Bill Pugh's paper, "Implementing OO Languages under a Weak Memory Order" (http://www.cs.umd.edu/~pugh/java/memoryModel/weak.pdf) discusses much the same issue as was bothering me in Guile -- object initialization in multiprocessor systems with weak memory ordering, particularly on the Alpha. (From other stuff I've read, Alpha seems to provide the weakest coherency model of modern processors, so if we want a robust solution, it pretty much needs to target the Alpha.) His paper is largely about Java, but it seems to me that Guile needs a more precise memory model as well, if it's to be multiprocessor-multithreaded and robust. Unfortunately, I don't follow the Java world closely, but it looks like the revision of the Java memory model and thread spec is worth reading up on. I'm still looking....

Ken

P.S. For those not familiar with the Alpha, a tiny bit of background. Digital made some architectural design decisions intended to allow for aggressive performance work. One of them was the weak memory coherency model. The processor has "memory barrier" instructions which can force delayed writes to be performed, or cause changes to main memory by other processors to be seen; naturally, these are somewhat expensive, so you don't want to do them all the time. But without the memory barrier, this sort of situation is *explicitly* permitted in the architecture reference manual:

  initial state: X=1, Y=1
  processor I instruction stream: write X=2, write Y=2
  processor J instruction stream: read Y => 2, read X => 1

Even if one processor (not both) uses a memory barrier instruction in between the two operations, the reordered results are allowed. Without a barrier, processor I can perform the second write first, and without a barrier, processor J can perform the second read first.
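The X/Y example can be written down as a sketch with a barrier on both sides. This uses C11 fences, which postdate this message, as a portable stand-in for the Alpha memory barrier instruction; the function and type names are hypothetical.

```c
#include <stdatomic.h>

/* Initial state from the example: X=1, Y=1. */
static atomic_int X = 1;
static atomic_int Y = 1;

/* Processor I: write X, barrier, write Y. */
void processor_i(void)
{
    atomic_store_explicit(&X, 2, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);  /* keeps the X write ahead of the Y write */
    atomic_store_explicit(&Y, 2, memory_order_relaxed);
}

struct observed {
    int y;
    int x;
};

/* Processor J: read Y, barrier, read X. */
struct observed processor_j(void)
{
    struct observed o;
    o.y = atomic_load_explicit(&Y, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);  /* keeps the X read behind the Y read */
    o.x = atomic_load_explicit(&X, memory_order_relaxed);
    return o;
}
```

With both fences in place, a reader that observes Y == 2 must also observe X == 2. Remove either fence and the "read Y => 2, read X => 1" outcome is permitted again, matching the one-barrier-is-not-enough remark.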

Another interesting change in early Alpha processors was the absence of byte load and store instructions. (Later processors, since about the EV56, have added them in as an extension.) If you wanted to modify a byte, you'd read the containing 32- or 64-bit word, modify it, and write it back; likewise for 16-bit shorts.
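In C terms, the read/modify/write sequence for a single byte looks roughly like the sketch below. The function names are hypothetical, and the byte numbering (index 0 is the least significant byte) is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Return `word` with byte number `index` replaced by `value`.
   Assumption: index 0 is the least significant byte. */
uint64_t replace_byte(uint64_t word, unsigned index, uint8_t value)
{
    assert(index < 8);
    unsigned shift = index * 8u;
    uint64_t mask = (uint64_t)0xff << shift;
    return (word & ~mask) | ((uint64_t)value << shift);
}

/* Emulate a byte store on a machine with only word loads and stores. */
void write_byte_via_word(uint64_t *containing_word, unsigned index,
                         uint8_t value)
{
    uint64_t w = *containing_word;        /* read the whole word */
    w = replace_byte(w, index, value);    /* modify one byte */
    *containing_word = w;                 /* write all eight bytes back */
}
```

The write-back stores all eight bytes, so if another thread changed a neighboring byte between the read and the write, that change is silently overwritten. Two threads updating adjacent bytes can therefore interfere even though they never touch the same byte.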

There are instructions to provide coordinated changes to specific locations. The "load locked" instructions load a word from storage and "lock" the address; the "store conditional" instruction stores a value if the lock is still set, and sets a flag indicating whether it worked. The lock flag can be reset by various conditions, including other reads or writes from the same processor (I think?), interrupts (including system clock), branches, and writes by other processors to the same location (or within a 2**N block containing it, where the block size is architecture-dependent, but must be at least 16 bytes and no more than one page). If the other processor is using these instructions too, both processors can try to modify a location, but only one (at most) will succeed.
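The load-locked/store-conditional pattern is normally used in a retry loop. The sketch below shows the same loop shape in portable C11 (which postdates this message) using a weak compare-and-exchange, which is allowed to fail spuriously much as a store conditional can fail when the lock flag has been reset. It is an analogy, not Alpha code, and the function name atomic_add_retry is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Add `delta` to *loc atomically and return the new value. */
long atomic_add_retry(atomic_long *loc, long delta)
{
    assert(loc != NULL);

    /* Plays the role of "load locked": fetch the current value. */
    long old = atomic_load_explicit(loc, memory_order_relaxed);

    /* Plays the role of "store conditional": store only if nobody else
       changed the location in the meantime. On failure, `old` is
       refreshed with the current value and we go around again. */
    while (!atomic_compare_exchange_weak_explicit(loc, &old, old + delta,
                                                  memory_order_seq_cst,
                                                  memory_order_relaxed)) {
        /* retry with the updated `old` */
    }
    return old + delta;
}
```

If two processors run this loop on the same location at once, at most one store succeeds per round and the other simply retries, so no update is lost.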



