[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] block-backend: protect setting block root to NULL with block
From: |
Fiona Ebner |
Subject: |
Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock |
Date: |
Fri, 10 Jan 2025 17:37:33 +0100 |
User-agent: |
Mozilla Thunderbird |
Am 09.01.25 um 11:47 schrieb Kevin Wolf:
> Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
>> Setting blk->root is a graph change operation and thus needs to be
>> protected by the block graph write lock in blk_remove_bs(). The
>> assignment to blk->root in blk_insert_bs() is already protected by
>> the block graph write lock.
>
> Hm, if that's the case, then we should also enforce this in the
> declaration of BlockBackend:
>
> BdrvChild * GRAPH_RDLOCK_PTR root;
>
> However, this results in more compiler failures that we need to fix. You
> caught the only remaining writer, but the lock is only fully effective
> if all readers take it, too.
I started giving this a try, but quickly ran into some issues/questions:
1. For global state code, is it preferred to use
GRAPH_RDLOCK_GUARD_MAINLOOP() to cover the whole function or better to
use bdrv_graph_rd(un)lock_main_loop() to keep the locked section as
small as necessary? I feel like the former can be more readable, e.g. in
blk_insert_bs(), blk_new_open(), where blk->root is used in conditionals.
2. In particular, protecting blk->root means that blk_bs() needs to have
the read lock. In fact, blk_bs() is reading blk->root twice in a row, so
it seems like it could suffer from a potential NULL pointer dereference
(or I guess after compiler optimization a potential use-after-free)?
Since blk_bs() is IO_CODE() and not a coroutine, I tried to mark it
GRAPH_RDLOCK and move on to the callers.
However, one caller is blk_nb_sectors() which itself is called by
blk_get_geometry(). Both of these are manually-written coroutine wrappers:
> commit 81f730d4d0e8af9c0211c3fedf406df0046341a9
> Author: Paolo Bonzini <pbonzini@redhat.com>
> Date: Fri Apr 7 17:33:03 2023 +0200
>
> block, block-backend: write some hot coroutine wrappers by hand
>
> The introduction of the graph lock is causing blk_get_geometry, a hot
> function
> used in the I/O path, to create a coroutine. However, the only part that
> really
> needs to run in coroutine context is the call to
> bdrv_co_refresh_total_sectors,
> which in turn only happens in the rare case of host CD-ROM devices.
>
> So, write by hand the three wrappers on the path from blk_co_get_geometry
> to
> bdrv_co_refresh_total_sectors, so that the coroutine wrapper is only
> created
> if bdrv_nb_sectors actually calls bdrv_refresh_total_sectors.
Both the blk_bs() and blk_nb_sectors() functions are IO_CODE(), but not
coroutines, and callers of blk_get_geometry are already in the device
code. I'm not sure how to proceed here, happy to hear suggestions :)
---snip---
>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index c93a7525ad..9678615318 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
>> */
>> blk_drain(blk);
>> root = blk->root;
>> - blk->root = NULL;
>>
>> bdrv_graph_wrlock();
>> + blk->root = NULL;
>> bdrv_root_unref_child(root);
>> bdrv_graph_wrunlock();
>> }
>
> I think the 'root = blk->root' needs to be inside the locked section,
> too. Otherwise blk->root could change during bdrv_graph_wrlock() (which
> has a nested event loop) and root would be stale. I assume clang would
> complain about this with the added GRAPH_RDLOCK_PTR.
Oh I see, good catch!
Best Regards,
Fiona
Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock,
Fiona Ebner <=