qemu-stable
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] block-backend: protect setting block root to NULL with block


From: Daniel Wielandt
Subject: Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock
Date: Thu, 9 Jan 2025 04:55:47 -0600

I feel like your too focused on new problems when you  gave up repairing to old os.. be ause at one time it was real slick.. now.theres parts that are problems that can be eliminated easy and provide  a less bloated function with simple code deletion.. fix the original is down its it's primary parts
.then u can add your fancy automation and setting get real com0lucated


On Thu, Jan 9, 2025, 4:52 AM Daniel Wielandt <wielandtdan@gmail.com> wrote:

Suggestions hold pattern.. programmers obviously understand control logic. Systems built with operational flaws. Will freeze up when u start repairing it while it's running.


On Thu, Jan 9, 2025, 4:48 AM Kevin Wolf <kwolf@redhat.com> wrote:
Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
> Setting blk->root is a graph change operation and thus needs to be
> protected by the block graph write lock in blk_remove_bs(). The
> assignment to blk->root in blk_insert_bs() is already protected by
> the block graph write lock.

Hm, if that's the case, then we should also enforce this in the
declaration of BlockBackend:

    BdrvChild * GRAPH_RDLOCK_PTR root;

However, this results in more compiler failures that we need to fix. You
caught the only remaining writer, but the lock is only fully effective
if all readers take it, too.

> In particular, the graph read lock in blk_co_do_flush() could
> previously not ensure that blk_bs(blk) would always return the same
> value during the locked section, which could lead to a segfault [0] in
> combination with migration [1].
>
> From the user-provided backtraces in the forum thread [1], it seems
> like blk_co_do_flush() managed to get past the
> blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
> non-NULL value during the check, but then, when calling
> bdrv_co_flush(), blk_bs(blk) returned NULL.
>
> [0]:
>
> > 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
> > 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
> > 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at block/block-gen.c:901
>
> [1]: https://forum.proxmox.com/threads/158072
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  block/block-backend.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/block-backend.c b/block/block-backend.c
> index c93a7525ad..9678615318 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
>       */
>      blk_drain(blk);
>      root = blk->root;
> -    blk->root = NULL;

>      bdrv_graph_wrlock();
> +    blk->root = NULL;
>      bdrv_root_unref_child(root);
>      bdrv_graph_wrunlock();
>  }

I think the 'root = blk->root' needs to be inside the locked section,
too. Otherwise blk->root could change during bdrv_graph_wrlock() (which
has a nested event loop) and root would be stale. I assume clang would
complain about this with the added GRAPH_RDLOCK_PTR.

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]