From: Peter Maydell
Subject: Re: [PATCH v4 06/57] accel/tcg: Honor atomicity of loads
Date: Thu, 4 May 2023 18:17:29 +0100
On Wed, 3 May 2023 at 08:08, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Create ldst_atomicity.c.inc.
>
> Not required for user-only code loads, because we've ensured that
> the page is read-only before beginning to translate code.
>
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> +/**
> + * required_atomicity:
> + *
> + * Return the lg2 bytes of atomicity required by @memop for @p.
> + * If the operation must be split into two operations to be
> + * examined separately for atomicity, return -lg2.
> + */
> +static int required_atomicity(CPUArchState *env, uintptr_t p, MemOp memop)
> +{
> + int atmax = memop & MO_ATMAX_MASK;
> + int size = memop & MO_SIZE;
> + unsigned tmp;
> +
> + if (atmax == MO_ATMAX_SIZE) {
> + atmax = size;
> + } else {
> + atmax >>= MO_ATMAX_SHIFT;
> + }
> +
> + switch (memop & MO_ATOM_MASK) {
> + case MO_ATOM_IFALIGN:
> + tmp = (1 << atmax) - 1;
> + if (p & tmp) {
> + return MO_8;
> + }
> + break;
> + case MO_ATOM_NONE:
> + return MO_8;
> + case MO_ATOM_SUBALIGN:
> + tmp = p & -p;
> + if (tmp != 0 && tmp < atmax) {
> + atmax = tmp;
> + }
> + break;
I don't understand the bit manipulation going on here.
AIUI what we're trying to do is say "if e.g. p is only
2-aligned then we only get 2-byte atomicity". But suppose
p == 0x1002. Then (p & -p) is 0x2. But that's MO_32,
not MO_16. Am I missing something?

(Also, it would be nice to have a comment mentioning
what (p & -p) does, so readers don't have to search
for a not-very-searchable expression to find out.)
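
For concreteness, here's a standalone sketch (not QEMU
code; everything in it is made up for illustration) of
what (p & -p) computes, and why its result is in bytes
rather than in the lg2 units that atmax uses:

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
      uintptr_t p = 0x1002;

      /* (p & -p) isolates the lowest set bit of p, i.e. the
         largest power of two dividing p: p's alignment in bytes. */
      uintptr_t align = p & -p;                  /* 0x2 */

      /* atmax is a lg2 value (MO_8 = 0, MO_16 = 1, MO_32 = 2, ...),
         so comparing align against it mixes units: 2 read as a
         MemOp size is MO_32, not MO_16. */
      printf("p & -p = %#lx (alignment in bytes)\n",
             (unsigned long)align);
      printf("ctz(p) = %d (the corresponding lg2/MO_* value)\n",
             __builtin_ctzl((unsigned long)p));
      return 0;
  }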
> + case MO_ATOM_WITHIN16:
> + tmp = p & 15;
> + if (tmp + (1 << size) <= 16) {
> + atmax = size;
OK, so this is "whole operation is within 16 bytes,
whole operation must be atomic"...
> + } else if (atmax == size) {
> + return MO_8;
...but I don't understand the interaction of WITHIN16
and also specifying an ATMAX value that's not ATMAX_SIZE.
> + } else if (tmp + (1 << atmax) != 16) {
Why does this check only whether the first unit ends
exactly on the 16-byte boundary? What if you're asking
for a load of 8 bytes at MO_ATMAX_2 from a pointer
that's at an offset of 10 bytes from a 16-byte boundary?
Then tmp is 10, tmp + (1 << atmax) is 12, but we could
still do the loads at atomicity 2. This doesn't seem to
me to be any different from the case it does catch,
where the first ATMAX_2-sized unit happens to be the
only thing in this 16-byte block.
The doc comment in patch 1 could probably be beefed
up to better explain the interaction of WITHIN16
and ATMAX_*.
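
To make the numbers concrete, here's a standalone sketch
(hypothetical values, not QEMU code) of the offset-10
case I describe above:

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
      int size  = 3;             /* MO_64: lg2 of an 8-byte load */
      int atmax = 1;             /* MO_ATMAX_2: lg2 of 2-byte units */
      uintptr_t p = 0x100a;      /* 10 bytes past a 16-byte boundary */
      unsigned tmp = p & 15;     /* 10 */

      /* The whole access does not fit within 16 bytes... */
      printf("tmp + (1 << size)  = %u (> 16)\n", tmp + (1u << size));

      /* ...and the first unit does not end exactly on the boundary,
         so the patch falls through to atmax = -atmax, even though
         every 2-byte unit (offsets 10, 12, 14, 16) is 2-aligned
         and could be loaded with atomicity 2. */
      printf("tmp + (1 << atmax) = %u (!= 16)\n", tmp + (1u << atmax));
      return 0;
  }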
> + /*
> + * Paired load/store, where the pairs aren't aligned.
> + * One of the two must still be handled atomically.
> + */
> + atmax = -atmax;
> + }
> + break;
> + default:
> + g_assert_not_reached();
> + }
> +
> + /*
> + * Here we have the architectural atomicity of the operation.
> + * However, when executing in a serial context, we need no extra
> + * host atomicity in order to avoid racing. This reduction
> + * avoids looping with cpu_loop_exit_atomic.
> + */
> + if (cpu_in_serial_context(env_cpu(env))) {
> + return MO_8;
> + }
> + return atmax;
> +}
> +
> +/**
> + * load_atomic8_or_exit:
> + * @env: cpu context
> + * @ra: host unwind address
> + * @pv: host address
> + *
> + * Atomically load 8 aligned bytes from @pv.
> + * If this is not possible, longjmp out to restart serially.
> + */
> +static uint64_t load_atomic8_or_exit(CPUArchState *env, uintptr_t ra,
> +                                     void *pv)
> +{
> + if (HAVE_al8) {
> + return load_atomic8(pv);
> + }
> +
> +#ifdef CONFIG_USER_ONLY
> + /*
> + * If the page is not writable, then assume the value is immutable
> + * and requires no locking. This ignores the case of MAP_SHARED with
> + * another process, because the fallback start_exclusive solution
> + * provides no protection across processes.
> + */
> + if (!page_check_range(h2g(pv), 8, PAGE_WRITE)) {
> + uint64_t *p = __builtin_assume_aligned(pv, 8);
> + return *p;
> + }
This will also do a non-atomic read for the case where
the guest has mapped the same memory twice at different
addresses, once read-only and once writeable, I think.
In theory in that situation we could use start_exclusive.
But maybe that's a weird corner case we can ignore?
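
For concreteness, here's a standalone sketch (plain
Linux user-space code, not QEMU code) of the aliased
mapping I mean: the same file mapped twice, once
read-only and once writable, so a PAGE_WRITE check on
the read-only view doesn't imply the data is immutable:

  #include <stdio.h>
  #include <stdint.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/mman.h>

  int main(void)
  {
      int fd = open("/tmp/alias-demo", O_RDWR | O_CREAT | O_TRUNC, 0600);
      if (fd < 0 || ftruncate(fd, 4096) < 0) {
          return 1;
      }

      /* Two views of the same file: one writable, one read-only. */
      uint64_t *rw = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
      uint64_t *ro = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
      if (rw == MAP_FAILED || ro == MAP_FAILED) {
          return 1;
      }

      rw[0] = 0x1122334455667788ULL;
      /* The read-only view observes stores made through the
         writable alias, so "page is not writable" does not
         mean the underlying value cannot change. */
      printf("read-only view sees %#llx\n", (unsigned long long)ro[0]);
      return 0;
  }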
> +#endif
> +
> + /* Ultimate fallback: re-execute in serial context. */
> + cpu_loop_exit_atomic(env_cpu(env), ra);
> +}
> +
thanks
-- PMM