qemu-devel

Re: Lockup with --accel tcg,thread=single


From: Alex Bennée
Subject: Re: Lockup with --accel tcg,thread=single
Date: Mon, 30 Sep 2019 20:20:58 +0100
User-agent: mu4e 1.3.4; emacs 27.0.50

Paolo Bonzini <address@hidden> writes:

> On 30/09/19 17:37, Peter Maydell wrote:
>> Not sure currently what the best fix is here.
>
> Since thread=single uses just one queued work list, could it be as
> simple as

Does it? I thought that was the case, but:

  static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
  {
      qemu_mutex_lock(&cpu->work_mutex);
      if (cpu->queued_work_first == NULL) {
          cpu->queued_work_first = wi;
      } else {
          cpu->queued_work_last->next = wi;
      }
      cpu->queued_work_last = wi;
      wi->next = NULL;
      wi->done = false;
      qemu_mutex_unlock(&cpu->work_mutex);

      qemu_cpu_kick(cpu);
  }

That does seem to imply each vCPU's CPUState is where the queue lives.
That's not to say there shouldn't be a single work queue for
thread=single, though.
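
A single queue for rr mode could mirror the same shape, just hung off
globals instead of each CPUState. Hand-waving sketch only (all the
rr_* names are made up, nothing like this is in the tree):

  /* hypothetical: one shared work list for the single rr thread;
   * mutex init and the matching consumer side are omitted */
  static struct qemu_work_item *rr_queued_work_first;
  static struct qemu_work_item *rr_queued_work_last;
  static QemuMutex rr_work_mutex;

  static void rr_queue_work(struct qemu_work_item *wi)
  {
      qemu_mutex_lock(&rr_work_mutex);
      if (rr_queued_work_first == NULL) {
          rr_queued_work_first = wi;
      } else {
          rr_queued_work_last->next = wi;
      }
      rr_queued_work_last = wi;
      wi->next = NULL;
      wi->done = false;
      qemu_mutex_unlock(&rr_work_mutex);

      /* any kick will do, the one thread services all vCPUs */
      qemu_cpu_kick(first_cpu);
  }

Then the rr loop would only have one list to test.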

>
> diff --git a/cpus.c b/cpus.c
> index d2c61ff..314f9aa 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1539,7 +1539,7 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
>              cpu = first_cpu;
>          }
>
> -        while (cpu && !cpu->queued_work_first && !cpu->exit_request) {
> +        while (cpu && !first_cpu->queued_work_first && !cpu->exit_request) {
>
>              atomic_mb_set(&tcg_current_rr_cpu, cpu);
>              current_cpu = cpu;
>
> or something like that?
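
Or, staying closer to the per-CPU lists, scan all of them rather than
just first_cpu's. Untested sketch, and any_cpu_has_queued_work() is a
made-up helper:

  static bool any_cpu_has_queued_work(void)
  {
      CPUState *cs;

      /* the rr thread services every vCPU, so work queued on
       * any of them should break us out of the execution loop */
      CPU_FOREACH(cs) {
          if (cs->queued_work_first) {
              return true;
          }
      }
      return false;
  }

with the loop condition becoming:

      while (cpu && !any_cpu_has_queued_work() && !cpu->exit_request) {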
>
>> Side note -- this use of run_on_cpu() means that we now drop
>> the iothread lock within memory_region_snapshot_and_clear_dirty(),
>> which we didn't before. This means that a vCPU thread can now
>> get in and execute an access to the device registers of
>> hw/display/vga.c, updating its state in a way I suspect that the
>> device model code is not expecting... So maybe the right answer
>> here should be to come up with a fix for the race that 9458a9a1
>> addresses that doesn't require us to drop the iothread lock in
>> memory_region_snapshot_and_clear_dirty() ? Alternatively we need
>> to audit the callers and flag in the documentation that this
>> function has the unexpected side effect of briefly dropping the
>> iothread lock.
>
> Yes, this is intended.  There shouldn't be side effects other than
> possibly a one-frame flash for all callers.
>
> Paolo
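
If we do end up keeping the lock drop, flagging it at the declaration
as Peter suggests seems like cheap insurance. Something like (wording
is mine, nothing agreed):

  /*
   * Note: may briefly drop the iothread lock (BQL) while waiting for
   * the vCPU work item to run, so device state the caller has already
   * read can change underneath it.
   */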


--
Alex Bennée


