[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-stable] [PATCH] qed: fix bdrv_qed_drain
From: |
Fam Zheng |
Subject: |
Re: [Qemu-stable] [PATCH] qed: fix bdrv_qed_drain |
Date: |
Tue, 23 Feb 2016 13:57:04 +0800 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
On Wed, 02/17 12:28, Paolo Bonzini wrote:
>
>
> On 17/02/2016 03:57, Fam Zheng wrote:
> > On Tue, 02/16 16:53, Paolo Bonzini wrote:
> >> The current implementation of bdrv_qed_drain can cause a double
> >> completion of a request.
> >>
> >> The problem is that bdrv_qed_drain calls qed_plug_allocating_write_reqs
> >> unconditionally, but this is not correct if an allocating write
> >> is queued. In this case, qed_unplug_allocating_write_reqs will
> >> restart the allocating write and possibly cause it to complete.
> >> The aiocb however is still in use for the L2/L1 table writes,
> >> and will then be completed again as soon as the table writes
> >> are stable.
> >>
> >> The fix is to only call qed_plug_allocating_write_reqs and
> >> bdrv_aio_flush (which is the same as the timer callback---the patch
> >> makes this explicit) only if the timer was scheduled in the first
> >> place. This fixes qemu-iotests test 011.
> >>
> >> Cc: address@hidden
> >> Cc: address@hidden
> >> Cc: Stefan Hajnoczi <address@hidden>
> >> Signed-off-by: Paolo Bonzini <address@hidden>
> >> ---
> >> block/qed.c | 13 +++++++------
> >> 1 file changed, 7 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/block/qed.c b/block/qed.c
> >> index 404be1e..ebba220 100644
> >> --- a/block/qed.c
> >> +++ b/block/qed.c
> >> @@ -380,12 +380,13 @@ static void bdrv_qed_drain(BlockDriverState *bs)
> >> {
> >> BDRVQEDState *s = bs->opaque;
> >>
> >> - /* Cancel timer and start doing I/O that were meant to happen as if it
> >> - * fired, that way we get bdrv_drain() taking care of the ongoing
> >> requests
> >> - * correctly. */
> >> - qed_cancel_need_check_timer(s);
> >> - qed_plug_allocating_write_reqs(s);
> >> - bdrv_aio_flush(s->bs, qed_clear_need_check, s);
> >> + /* Fire the timer immediately in order to start doing I/O as soon as
> >> the
> >> + * header is flushed.
> >> + */
> >> + if (s->need_check_timer && timer_pending(s->need_check_timer)) {
> >
> > We can assert(s->need_check_timer);
>
> I've seen it NULL, but didn't check why. This was also a source of
> segmentation faults.
>
> >> + qed_cancel_need_check_timer(s);
> >> + qed_need_check_timer_cb(s);
> >> + }
> >
> > What if an allocating write is queued (the else branch case)? Its completion
> > will be in bdrv_drain and it could arm the need_check_timer which is wrong.
> >
> > We need to drain the allocating_write_reqs queue before checking the timer.
>
> You're right, but how? That's what bdrv_drain(bs) does, it's a
> chicken-and-egg problem.
Maybe use an aio_poll loop before the if?
Fam