[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Question on memory commit during MR finalize()
From: |
Peter Xu |
Subject: |
Re: Question on memory commit during MR finalize() |
Date: |
Mon, 20 Apr 2020 19:31:15 -0400 |
On Mon, Apr 20, 2020 at 11:44:11PM +0200, Paolo Bonzini wrote:
> On 20/04/20 23:00, Peter Xu wrote:
> >
> > I'm still uncertain how the dirty ring branch can easily trigger this,
> > however
> > the backtrace looks really odd to me in that we're going to do memory commit
> > and even sending KVM ioctls during finalize(), especially in the RCU
> > thread...
> > I never expected that.
>
> Short answer: it is really hard to not trigger finalize() from an RCU
> callback, and it's the reason why the RCU thread takes the big QEMU lock.
>
> However, instead of memory_region_transaction_commit,
> memory_region_finalize probably should do something like
>
> --memory_region_transaction_depth;
> assert (memory_region_transaction_depth ||
> (!memory_region_update_pending &&
> !ioeventfd_update_pending));
Ah I see; this makes sense.
And finally I found the problem, which is indeed the bug in my own tree - I
forgot to remove the previous changes to flush the dirty ring during mem
removal (basically that's run_on_cpu() called during a memory commit, that will
wrongly release the BQL without being noticed).
Besides above assert, I'm thinking maybe we can also assert on something like:
!(memory_region_transaction_depth || memory_region_update_pending ||
ioeventfd_update_pending)
When releasing BQL (unlock, or qemu_cond_wait() on BQL, which should cover
run_on_cpu()), so that we can identify misuse of BQL easier like this.
Let me know if you like these sanity checks. I can write up a small series if
you think that's a good idea.
Thanks,
--
Peter Xu