qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] tcg: drop qemu_cpu_is_self() in tlb_flush_by_mmuidx[_async_w


From: Igor Mammedov
Subject: Re: [PATCH] tcg: drop qemu_cpu_is_self() in tlb_flush_by_mmuidx[_async_work]
Date: Fri, 24 Jan 2025 15:21:48 +0100

On Thu, 23 Jan 2025 12:23:43 +0100
Igor Mammedov <imammedo@redhat.com> wrote:

> On Thu, 23 Jan 2025 10:52:15 +0000
> Alex Bennée <alex.bennee@linaro.org> wrote:
> 
> > Igor Mammedov <imammedo@redhat.com> writes:
> >   
> > > QEMU will crash with following debug enabled
> > >   # define DEBUG_TLB_GATE 1
> > >   # define DEBUG_TLB_LOG_GATE 1
> > > due to [1] introduced assert and as it 
> > > happenstlb_flush_by_mmuidx[_async_work]
> > > functions are called not only from vcpu thread but also from reset handler
> > > that is called from main thread at cpu realize time when vcpu is already
> > > created
> > >   x86_cpu_new -> ... ->
> > >       x86_cpu_realizefn -> cpu_reset -> ... ->
> > >           tcg_cpu_reset_hold
> > >
> > > drop assert to fix crash.    
> > 
> > Hmm the assert is there for a good reason because we do not want to be
> > flushing another CPUs state. However the assert itself:
> > 
> >   g_assert(!(cpu)->created || qemu_cpu_is_self(cpu));
> > 
> > was trying to account for pre-initialised vCPUs. What has changed?
> > 
> > cpu_thread_signal_created(cpu) is called just before we start running
> > the main loop in mttcg_cpu_thread_fn. So any other thread messing with
> > the CPUs TLB can potentially mess things up.  
> 
> it reproduces on current master, so yes it likely has changed over time.
> I've just stumbled on it when attempting to get rid of cpu->created usage.
> 
> 
> > > 1)
> > > Fixes: f0aff0f124028 ("cputlb: add assert_cpu_is_self checks")

bisection points to (so above fixes was a wrong one):
  30933c4fb4f3d tcg/cputlb: remove other-cpu capability from TLB flushing
which has replaced a check with assert:

-    if (cpu->created && !qemu_cpu_is_self(cpu)) {
-        async_run_on_cpu(cpu, tlb_flush_by_mmuidx_async_work,
-                         RUN_ON_CPU_HOST_INT(idxmap));
-    } else {
-        tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_INT(idxmap));
-    }
+    assert_cpu_is_self(cpu);

should we revert that instead?

perhaps also drop 'cpu->created' check in  assert_cpu_is_self as it
obviously doesn't work.


> > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > ---
> > >  accel/tcg/cputlb.c | 4 ----
> > >  1 file changed, 4 deletions(-)
> > >
> > > diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> > > index b26c0e088f..2da803103c 100644
> > > --- a/accel/tcg/cputlb.c
> > > +++ b/accel/tcg/cputlb.c
> > > @@ -381,8 +381,6 @@ static void tlb_flush_by_mmuidx_async_work(CPUState 
> > > *cpu, run_on_cpu_data data)
> > >      uint16_t all_dirty, work, to_clean;
> > >      int64_t now = get_clock_realtime();
> > >  
> > > -    assert_cpu_is_self(cpu);
> > > -
> > >      tlb_debug("mmu_idx:0x%04" PRIx16 "\n", asked);
> > >  
> > >      qemu_spin_lock(&cpu->neg.tlb.c.lock);
> > > @@ -419,8 +417,6 @@ void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t 
> > > idxmap)
> > >  {
> > >      tlb_debug("mmu_idx: 0x%" PRIx16 "\n", idxmap);
> > >  
> > > -    assert_cpu_is_self(cpu);
> > > -
> > >      tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_INT(idxmap));
> > >  }    
> >   
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]