qemu-riscv
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-riscv] atomic failures on qemu-system-riscv64


From: Joel Sing
Subject: [Qemu-riscv] atomic failures on qemu-system-riscv64
Date: Fri, 31 May 2019 03:20:04 +1000
User-agent: Mutt/1.10.1 (2018-07-13)

While working on a Go (www.golang.org) port for riscv, I've run
into issues with atomics (namely LR/SC) on qemu-system-riscv64.
There are several reproducers for this problem including (using
gcc builtin atomics):

  https://gist.github.com/4a6f656c/8433032a3f70893a278259f8108aad90

And a version using inline assembly:

  https://gist.github.com/4a6f656c/d883091f5ca811822720213be343a75a

Depending on the qemu configuration the number of threads may
need increasing (to force context switching) and/or run in a
loop. Go's sync/atomic tests also fail regularly.

Having dug into the qemu code, what I believe is happening is
along the lines of the following:

1. Thread 1 runs and executes an LR - this assigns an address
   to load_res and a value to load_val (say 1).

2. A context switch occurs and thread 2 is now run - it runs
   an LR and SC on the same address modifying the stored value.
   Another LR is executed loading load_val with the current
   value (say 3).

3. A context switch occurs and thread 1 is now run again, it
   continues on its LR/SC sequence and now runs the SC instruction.
   This is based on the assumption that it had a reservation
   and the SC will fail if the memory has changed. The underlying
   implementation of SC is a cmpxchg with the value in load_val
   - this no longer has the original value and hence successfully
   compares (as does the tcg_gen_setcond_tl() between the returned
   value and load_val) thus the SC succeeds when it should not.

The diff below clears load_res when the mode changes - with this
applied the reproducers work correctly, as do Go's atomic tests.
I'm not sure this "fix" is 100% correct, but it certainly verifies
where the problem lies. It does also fall inline with the RISCV
spec:

"The SC must fail if there is an observable memory access from
another hart to the address, or if there is an intervening context
switch on this hart, or if in the meantime the hart executed a
privileged exception-return instruction."

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index b17f169681..9875b8e5d3 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -113,6 +113,8 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong 
newpriv)
     }
     /* tlb_flush is unnecessary as mode is contained in mmu_idx */
     env->priv = newpriv;
+
+    env->load_res = 0;
 }
 
 /* get_physical_address - get the physical address for this virtual address



reply via email to

[Prev in Thread] Current Thread [Next in Thread]