qemu-stable
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] block-backend: protect setting block root to NULL with block


From: Daniel Wielandt
Subject: Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock
Date: Thu, 9 Jan 2025 04:52:19 -0600

Suggestions hold pattern.. programmers obviously understand control logic. Systems built with operational flaws. Will freeze up when u start repairing it while it's running.


On Thu, Jan 9, 2025, 4:48 AM Kevin Wolf <kwolf@redhat.com> wrote:
Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
> Setting blk->root is a graph change operation and thus needs to be
> protected by the block graph write lock in blk_remove_bs(). The
> assignment to blk->root in blk_insert_bs() is already protected by
> the block graph write lock.

Hm, if that's the case, then we should also enforce this in the
declaration of BlockBackend:

    BdrvChild * GRAPH_RDLOCK_PTR root;

However, this results in more compiler failures that we need to fix. You
caught the only remaining writer, but the lock is only fully effective
if all readers take it, too.

> In particular, the graph read lock in blk_co_do_flush() could
> previously not ensure that blk_bs(blk) would always return the same
> value during the locked section, which could lead to a segfault [0] in
> combination with migration [1].
>
> From the user-provided backtraces in the forum thread [1], it seems
> like blk_co_do_flush() managed to get past the
> blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
> non-NULL value during the check, but then, when calling
> bdrv_co_flush(), blk_bs(blk) returned NULL.
>
> [0]:
>
> > 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
> > 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
> > 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at block/block-gen.c:901
>
> [1]: https://forum.proxmox.com/threads/158072
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  block/block-backend.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/block-backend.c b/block/block-backend.c
> index c93a7525ad..9678615318 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
>       */
>      blk_drain(blk);
>      root = blk->root;
> -    blk->root = NULL;

>      bdrv_graph_wrlock();
> +    blk->root = NULL;
>      bdrv_root_unref_child(root);
>      bdrv_graph_wrunlock();
>  }

I think the 'root = blk->root' needs to be inside the locked section,
too. Otherwise blk->root could change during bdrv_graph_wrlock() (which
has a nested event loop) and root would be stale. I assume clang would
complain about this with the added GRAPH_RDLOCK_PTR.

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]