Re: [Qemu-stable] [PATCH v2] virtio-blk: Fix double completion for werro
From: Stefan Hajnoczi
Subject: Re: [Qemu-stable] [PATCH v2] virtio-blk: Fix double completion for werror=stop
Date: Tue, 17 Nov 2015 17:30:47 +0800
User-agent: Mutt/1.5.23 (2015-06-09)

On Tue, Nov 17, 2015 at 03:28:29PM +0800, Fam Zheng wrote:
> On Tue, 11/17 14:58, Stefan Hajnoczi wrote:
> > On Mon, Nov 16, 2015 at 02:10:36PM +0800, Fam Zheng wrote:
> > > When a request R is absorbed by request M, it is appended to the
> > > "mr_next" queue led by M, and is completed together with the completion
> > > of M, in virtio_blk_rw_complete.
> > >
> > > With error policy equals stop, if M has an I/O error, now R also gets
> > > prepended to the per device DMA restart queue, which will be retried
> > > when VM resumes. It leads to a double completion (in symptoms of memory
> > > corruption or use after free).
> > >
> > > Adding R to the queue is superfluous, only M needs to be in the queue.
> > >
> > > Fix this by marking request R as "merged" and skipping it in
> > > virtio_blk_handle_rw_error.
>
> The commit message is outdated. "merged" requests are actually skipped in
> virtio_blk_handle_request.
>
> > >
> > > Cc: address@hidden
> > > Signed-off-by: Fam Zheng <address@hidden>
> > >
> > > ---
> > >
> > > v2: Don't lose the request in migration. [Paolo]
> > > ---
> > > hw/block/virtio-blk.c | 7 +++++++
> > > include/hw/virtio/virtio-blk.h | 1 +
> > > 2 files changed, 8 insertions(+)
> > >
> > > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > > index e70fccf..5cdb06f 100644
> > > --- a/hw/block/virtio-blk.c
> > > +++ b/hw/block/virtio-blk.c
> > > @@ -36,6 +36,7 @@ VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
> > > req->in_len = 0;
> > > req->next = NULL;
> > > req->mr_next = NULL;
> > > + req->merged = false;
> > > return req;
> > > }
> > >
> > > @@ -344,6 +345,7 @@ static inline void submit_requests(BlockBackend *blk,
> > > MultiReqBuffer *mrb,
> > > for (i = start + 1; i < start + num_reqs; i++) {
> > > qemu_iovec_concat(qiov, &mrb->reqs[i]->qiov, 0,
> > > mrb->reqs[i]->qiov.size);
> > > + mrb->reqs[i]->merged = true;
> > > mrb->reqs[i - 1]->mr_next = mrb->reqs[i];
> > > nb_sectors += mrb->reqs[i]->qiov.size / BDRV_SECTOR_SIZE;
> > > }
> > > @@ -511,6 +513,11 @@ void virtio_blk_handle_request(VirtIOBlockReq *req,
> > > MultiReqBuffer *mrb)
> > > - sizeof(struct virtio_blk_inhdr);
> > > iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
> > >
> > > + if (req->merged) {
> > > + /* Enough for restarting a (migrated) merged request, no need to
> > > + * actually submit I/O. */
> > > + return;
> > > + }
>
> This is not enough.
>
> There is a risk that the coalesced requests being restarted here are also
> merged, which will leak the requests originally merged into it.
>
> > > type = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
> > >
> > > /* VIRTIO_BLK_T_OUT defines the command direction.
> > > VIRTIO_BLK_T_BARRIER
> > > diff --git a/include/hw/virtio/virtio-blk.h
> > > b/include/hw/virtio/virtio-blk.h
> > > index 6bf5905..db4adf4 100644
> > > --- a/include/hw/virtio/virtio-blk.h
> > > +++ b/include/hw/virtio/virtio-blk.h
> > > @@ -70,6 +70,7 @@ typedef struct VirtIOBlockReq {
> > > size_t in_len;
> > > struct VirtIOBlockReq *next;
> > > struct VirtIOBlockReq *mr_next;
> > > + bool merged;
> > > BlockAcctCookie acct;
> > > } VirtIOBlockReq;
> >
> > I'm not sure if this patch truly fixes the bug:
> >
> > virtio_blk_rw_complete() doesn't do req->mr_next = NULL. There is a
> > potential double-free if resubmitting failed requests doesn't overwrite
> > the mr_next field.
> >
> > This can be fixed by adding req->mr_next = NULL to the loop in
> > virtio_blk_rw_complete().
> >
> > Is that enough to solve the bug? I don't think adding a new field is
> > necessary.
> >
> > If not, please explain the double-free.
>
> The first free is the expected one when the coalesced request is completed.
> The second free is because virtio_blk_rw_complete was also called on requests
> who have "merged == true", which is a mistake in virtio_blk_dma_restart_bh.
>
> I don't think adding req->mr_next in virtio_blk_rw_complete in the I/O error
> path is right or helpful.

Question about your patch: How is the merged qiov retained across
restart? The merged qiov is freed the first time
virtio_blk_rw_complete() is called with an error, and you skip
processing merged requests in virtio_blk_handle_request().

I think the result of your patch is that only the head request is
resubmitted to the host OS. Non-head requests are not submitted to the
host OS. The requests all complete at the virtio-blk level but only the
head request actually gets data transferred.
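
To make the concern concrete, here is a toy model of the restart path
with the "merged" check in place. It is not QEMU code: Req,
handle_request() and restart_all() are made-up stand-ins, and the model
assumes that the merged qiov is gone by the time of the restart, so each
request can only submit its own buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for VirtIOBlockReq; only the fields needed here. */
typedef struct Req {
    size_t len;          /* bytes this request wants transferred */
    bool merged;         /* set when the request was absorbed into another */
    struct Req *next;    /* link in the restart queue */
} Req;

/* Model of the patched handler: merged requests return before any I/O. */
static size_t handle_request(const Req *req)
{
    if (req->merged) {
        return 0;        /* skipped, nothing reaches the host */
    }
    return req->len;     /* only this request's own buffer is submitted */
}

/* Model of the restart path: handle every queued request again. */
static size_t restart_all(const Req *queue)
{
    size_t submitted = 0;
    for (const Req *r = queue; r != NULL; r = r->next) {
        submitted += handle_request(r);
    }
    return submitted;
}

/* One head request plus two requests that were merged into it. */
static size_t demo_bytes_submitted_on_restart(void)
{
    Req r2 = { 512, true, NULL };
    Req r1 = { 512, true, &r2 };
    Req head = { 512, false, &r1 };
    return restart_all(&head);
}
```

In this model the guest asked for three 512-byte transfers, but the
restart submits only the head's 512 bytes.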

Like I said, I think the fix is much simpler. Forget about adding a
merged field and just dissolve the relationship between merged requests
by setting req->mr_next = NULL in virtio_blk_rw_complete()'s loop. The
code was designed to work this way because virtio_blk_dma_restart_bh()
parses and submits requests again from scratch at the virtio-blk level.
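
As a rough sketch of that idea, using simplified stand-ins rather than
the real structures: Req, restart_queue, rw_complete_error() and
restart_bh() below are hypothetical names, and the model assumes every
request in a failed merge chain is queued for restart individually.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for VirtIOBlockReq; only the fields needed here. */
typedef struct Req {
    size_t len;            /* bytes this request wants transferred */
    struct Req *next;      /* link in the restart queue */
    struct Req *mr_next;   /* next request merged into the same I/O */
} Req;

static Req *restart_queue; /* stand-in for the per-device restart queue */

/* Error completion under werror=stop: walk the merge chain, dissolve it,
 * and queue every request for restart on its own. */
static void rw_complete_error(Req *head)
{
    Req *next = head;
    while (next != NULL) {
        Req *req = next;
        next = req->mr_next;
        req->mr_next = NULL;        /* dissolve the merge relationship */
        req->next = restart_queue;
        restart_queue = req;
    }
}

/* Restart: every queued request is handled again from scratch, so each
 * one submits its own data and none of them drags a stale chain along. */
static size_t restart_bh(void)
{
    size_t submitted = 0;
    Req *req = restart_queue;
    restart_queue = NULL;
    while (req != NULL) {
        Req *next = req->next;
        req->next = NULL;
        assert(req->mr_next == NULL);
        submitted += req->len;
        req = next;
    }
    return submitted;
}

/* One head request with two merged requests, failed and then restarted. */
static size_t demo_dissolve_and_restart(void)
{
    Req r2 = { 512, NULL, NULL };
    Req r1 = { 512, NULL, &r2 };
    Req head = { 512, NULL, &r1 };
    rw_complete_error(&head);
    return restart_bh();
}
```

Here all three requests are resubmitted (1536 bytes in total) and each
is completed exactly once, without needing an extra field.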

Stefan