qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC] block-backend: prevent dangling BDS pointer in blk_drain()


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [RFC] block-backend: prevent dangling BDS pointer in blk_drain()
Date: Thu, 9 Dec 2021 19:09:18 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0

09.12.2021 18:45, Hanna Reitz wrote:
On 09.12.21 15:23, Stefan Hajnoczi wrote:
The BlockBackend root child can change during bdrv_drained_begin() when
aio_poll() is invoked. In fact the BlockDriverState can reach refcnt 0
and blk_drain() is left with a dangling BDS pointer.

One example is scsi_device_purge_requests(), which calls blk_drain() to
wait for in-flight requests to cancel. If the backup blockjob is active,
then the BlockBackend root child is a temporary filter BDS owned by the
blockjob. The blockjob can complete during bdrv_drained_begin() and the
last reference to the BDS is released when the temporary filter node is
removed. This results in a use-after-free when blk_drain() calls
bdrv_drained_end(bs) on the dangling pointer.

The general problem is that a function and its callers must not assume
that bs is still valid across aio_poll(). Explicitly hold a reference to
bs in blk_drain() to avoid the dangling pointer.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
I found that BDS nodes are sometimes deleted with bs->quiesce_counter >
0 (at least when running "make check"), so it is currently not possible
to put the bdrv_ref/unref() calls in bdrv_do_drained_begin() and
bdrv_do_drained_end() because they will be unbalanced. That would have
been a more general solution than only fixing blk_drain().

Deleting nodes that have a `quiesce_counter > 0` doesn’t seem wrong to me – 
deleting only depends on strong references, and so I’d expect that anything that 
increases the quiesce_counter also has a strong reference to the node if the 
former wants the latter to stay around.

I suppose we could make it so that both the quiesce_counter and the refcnt need 
to be 0 before a BDS is deleted (and then deletion can happen both from 
bdrv_unref() and drained_end), but I don’t know whether that’s really 
necessary.  I’d rather leave it to the caller to ensure they keep a strong 
reference throughout the drain.

Agree. Better to keep the ref-count behavior obvious.


The question is, how often do we have a situation like this, where we take a weak 
reference for draining, because we assume there’s a strong reference backing us up 
(namely the one through blk->root), but that strong reference then can go away 
due to draining...

Any suggestions for a better fix?

The fix makes sense to me.

One alternative that comes to my mind is to instead re-fetch `bs = 
blk_bs(blk);` after the AIO_WAIT_WHILE() loop.  But that might be wrong, 
because if the node attached to the BB changed (i.e. isn’t `bs`, and isn’t 
`NULL`), then we’d end the drain on the wrong node.

So I think your fix is the right one.


I agree.

Interesting how many code paths that care to take a strong reference are 
actually prepared to the fact that block-graph may change, and this bs may be 
in some other place, with changed permissions, children and parents :/

Is graph modifying during drain is safe? Hmm, we probably always do graph 
modification in drained section for purpose) As I understand, all the logic 
about quiesce_counter is to support exactly this. And the only logic that seems 
correct is to finish drain on same node where it was started.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

--
Best regards,
Vladimir



reply via email to

[Prev in Thread] Current Thread [Next in Thread]