qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] cpus.c: Fix race condition in cpu_stop_current(


From: KONRAD Frederic
Subject: Re: [Qemu-devel] [PATCH] cpus.c: Fix race condition in cpu_stop_current()
Date: Mon, 10 Dec 2018 15:30:22 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.2

Hi Peter,

Thanks for that patch!

I'm seeing the same kind of issue when I run 8 qemu in parallel but it doesn't
seem to be fixed by this patch. Is it supposed to fix the issue when we are
doing a reset_request through a MMIO device?

It happens (rarely) with this kind of guest code:

exit:
  write to the register to reset the device
loop:
  branch loop

The code after the reset is executed.. can't we exit the loop directly with
cpu_loop_exit after cpu_exit?

Thanks,
Fred

Le 12/7/18 à 4:59 PM, Peter Maydell a écrit :
We use cpu_stop_current() to ensure the current CPU has stopped
from places like qemu_system_reset_request(). Unfortunately its
current implementation has a race. It calls qemu_cpu_stop(),
which sets cpu->stopped to true even though the CPU hasn't
actually stopped yet. The main thread will look at the flags
set by qemu_system_reset_request() and call pause_all_vcpus().
pause_all_vcpus() waits for every cpu to have cpu->stopped true,
so it can continue (and we will start the system reset operation)
before the vcpu thread has got back to its top level loop.

Instead, just set cpu->stop and call cpu_exit(). This will
cause the vcpu to exit back to the top level loop, and there
(as part of the wait_io_event code) it will call qemu_cpu_stop().

This fixes bugs where the reset request appeared to be ignored
or the CPU misbehaved because the reset operation started
to change vcpu state while the vcpu thread was still using it.

Signed-off-by: Peter Maydell <address@hidden>
---
We discussed this a little while back:
https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg00154.html
and Jaap reported a bug which I suspect of being the same thing:
https://lists.gnu.org/archive/html/qemu-discuss/2018-10/msg00014.html

Annoyingly I have lost the test case that demonstrated this
race, but I analysed it at the time and this should definitely
fix it. I have opted not to try to address any of the other
possible cleanup here (eg vm_stop() has a potential similar
race if called from a vcpu thread I suspect), since it gets
pretty tangled.

Jaap: could you test whether this patch fixes the issue you
were seeing, please?
---
  cpus.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 0ddeeefc14f..b09b7027126 100644
--- a/cpus.c
+++ b/cpus.c
@@ -2100,7 +2100,8 @@ void qemu_init_vcpu(CPUState *cpu)
  void cpu_stop_current(void)
  {
      if (current_cpu) {
-        qemu_cpu_stop(current_cpu, true);
+        current_cpu->stop = true;
+        cpu_exit(current_cpu);
      }
  }



reply via email to

[Prev in Thread] Current Thread [Next in Thread]