qemu-s390x
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 06/57] accel/tcg: Honor atomicity of loads


From: Richard Henderson
Subject: Re: [PATCH v4 06/57] accel/tcg: Honor atomicity of loads
Date: Fri, 5 May 2023 21:19:16 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.10.0

On 5/4/23 18:17, Peter Maydell wrote:
+    case MO_ATOM_SUBALIGN:
+        tmp = p & -p;
+        if (tmp != 0 && tmp < atmax) {
+            atmax = tmp;
+        }
+        break;

I don't understand the bit manipulation going on here.
AIUI what we're trying to do is say "if e.g. p is only
2-aligned then we only get 2-alignment". But, suppose
p == 0x1002. Then (p & -p) is 0x2. But that's MO_32,
not MO_16. Am I missing something ?

You're right, this is missing a ctz32().


(Also, it would be nice to have a comment mentioning
what (p & -p) does, so readers don't have to try to
search for a not very-searchable expression to find out.)

+    case MO_ATOM_WITHIN16:
+        tmp = p & 15;
+        if (tmp + (1 << size) <= 16) {
+            atmax = size;

OK, so this is "whole operation is within 16 bytes,
whole operation must be atomic"...

+        } else if (atmax == size) {
+            return MO_8;

...but I don't understand the interaction of WITHIN16
and also specifying an ATMAX value that's not ATMAX_SIZE.

I'm trying to describe e.g. LDP, which if not within16 has two 8-byte elements, one or both of which must be atomic. We will have set MO_ATOM_WITHIN16 | MO_ATMAX_8.

If atmax == size, there is only one element, and since it is not within16, there is no atomicity.

+        } else if (tmp + (1 << atmax) != 16) {

Why is this doing an exact inequality check?
What if you're asking for a load of 8 bytes at
MO_ATMAX_2 from a pointer that's at an offset of
10 bytes from a 16-byte boundary? Then tmp is 10,
tmp + (1 << atmax) is 12, but we could still do the
loads at atomicity 2. This doesn't seem to me to be
any different from the case it does catch where
the first ATMAX_2-sized unit happens to be the only
thing in this 16-byte block.

If the LDP is aligned mod 8, but not aligned mod 16, then both 8-byte operations must be (separately) atomic, and we return MO_64.

+            /*
+             * Paired load/store, where the pairs aren't aligned.
+             * One of the two must still be handled atomically.
+             */
+            atmax = -atmax;

... whereas returning -MO_64 tells the caller that we must handle an unaligned atomic operations.

+    /*
+     * If the page is not writable, then assume the value is immutable
+     * and requires no locking.  This ignores the case of MAP_SHARED with
+     * another process, because the fallback start_exclusive solution
+     * provides no protection across processes.
+     */
+    if (!page_check_range(h2g(pv), 8, PAGE_WRITE)) {
+        uint64_t *p = __builtin_assume_aligned(pv, 8);
+        return *p;
+    }

This will also do a non-atomic read for the case where
the guest has mapped the same memory twice at different
addresses, once read-only and once writeable, I think.
In theory in that situation we could use start_exclusive.
But maybe that's a weird corner case we can ignore?

We don't handle multiple mappings at all well. There is an outstanding bug report about read+write vs read+execute mappings for a jit -- our write-protect scheme for flushing TBs does not work for that case.

Since we can't detect the multiple mappings at this point, I'm tempted to 
ignore it.
But you're correct that we could drop this check and let start_exclusive handle 
it.



r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]