[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC] block-backend: prevent dangling BDS pointer in blk_drain()
From: |
Kevin Wolf |
Subject: |
Re: [RFC] block-backend: prevent dangling BDS pointer in blk_drain() |
Date: |
Fri, 10 Dec 2021 15:00:38 +0100 |
Am 09.12.2021 um 15:23 hat Stefan Hajnoczi geschrieben:
> The BlockBackend root child can change during bdrv_drained_begin() when
> aio_poll() is invoked. In fact the BlockDriverState can reach refcnt 0
> and blk_drain() is left with a dangling BDS pointer.
>
> One example is scsi_device_purge_requests(), which calls blk_drain() to
> wait for in-flight requests to cancel. If the backup blockjob is active,
> then the BlockBackend root child is a temporary filter BDS owned by the
> blockjob. The blockjob can complete during bdrv_drained_begin() and the
> last reference to the BDS is released when the temporary filter node is
> removed. This results in a use-after-free when blk_drain() calls
> bdrv_drained_end(bs) on the dangling pointer.
>
> The general problem is that a function and its callers must not assume
> that bs is still valid across aio_poll(). Explicitly hold a reference to
> bs in blk_drain() to avoid the dangling pointer.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> I found that BDS nodes are sometimes deleted with bs->quiesce_counter >
> 0 (at least when running "make check"), so it is currently not possible
> to put the bdrv_ref/unref() calls in bdrv_do_drained_begin() and
> bdrv_do_drained_end() because they will be unbalanced. That would have
> been a more general solution than only fixing blk_drain().
They are not supposed to end up unbalanced because detaching a child
calls bdrv_unapply_subtree_drain(). In fact, I think test-bdrv-drain
tests a few scenarios like this.
Do have more details about the case that failed for you?
Kevin