[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 5/6] test-bdrv-drain.c: remove test_detach_by_parent_cb()
From: |
Emanuele Giuseppe Esposito |
Subject: |
Re: [PATCH 5/6] test-bdrv-drain.c: remove test_detach_by_parent_cb() |
Date: |
Mon, 14 Feb 2022 12:11:55 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0 |
On 11/02/2022 16:38, Kevin Wolf wrote:
> Am 08.02.2022 um 16:36 hat Emanuele Giuseppe Esposito geschrieben:
>> This test uses a callback of an I/O function (blk_aio_preadv)
>> to modify the graph, using bdrv_attach_child.
>> This is simply not allowed anymore. I/O cannot change the graph.
>
> The callback of an I/O function isn't I/O, though. It is code _after_
> the I/O has completed. If this doesn't work any more, it feels like this
> is a bug.
>
> I think it becomes a bit more obvious when you translate the AIO into
> the equivalent coroutine function:
>
> void coroutine_fn foo(...)
> {
> GLOBAL_STATE_CODE();
>
> blk_co_preadv(...);
> detach_by_parent_aio_cb(...);
> }
>
> This is obviously correct code that must work. Calling an I/O function
> from a GS function is allowed, and of course the GS function may
> continue to do GS stuff after the I/O function returned.
>
> (Actually, I'm not sure if it really works when blk is not in the main
> AioContext, but your API split patches claim that it does, so let's
> assume for the moment that this is already true :-))
Uhmm why does my split claims that it is? all blk_aio_* are IO_CODE.
Also, as you said, if blk is not in the main loop, the callback is not
GS either.
>
>> Before "block/io.c: make bdrv_do_drained_begin_quiesce static
>> and introduce bdrv_drained_begin_no_poll", the test would simply
>> be at risk of failure, because if bdrv_replace_child_noperm()
>> (called to modify the graph) would call a drain,
>> then one callback of .drained_begin() is bdrv_do_drained_begin_quiesce,
>> that specifically asserts that we are not in a coroutine.
>>
>> Now that we fixed the behavior, the drain will invoke a bh in the
>> main loop, so we don't have such problem. However, this test is still
>> illegal and fails because we forbid graph changes from I/O paths.
>>
>> Once we add the required subtree_drains to protect
>> bdrv_replace_child_noperm(), the key problem in this test is in:
>
> Probably a question for a different patch, but why is a subtree drain
> required instead of just a normal node drain? It feels like a bigger
> hammer than what is needed for replacing a single child.
>
>> acb = blk_aio_preadv(blk, 0, &qiov, 0, detach_by_parent_aio_cb, NULL);
>> /* Drain and check the expected result */
>> bdrv_subtree_drained_begin(parent_b);
>>
>> because the detach_by_parent_aio_cb calls detach_indirect_bh(), that
>> modifies the graph and is invoked during bdrv_subtree_drained_begin().
>> The call stack is the following:
>> 1. blk_aio_preadv() creates a coroutine, increments in_flight counter
>> and enters the coroutine running blk_aio_read_entry()
>> 2. blk_aio_read_entry() performs the read and then schedules a bh to
>> complete (blk_aio_complete)
>> 3. at this point, subtree_drained_begin() kicks in and waits for all
>> in_flight requests, polling
>> 4. polling allows the bh to be scheduled, so blk_aio_complete runs
>> 5. blk_aio_complete *first* invokes the callback
>> (detach_by_parent_aio_cb) and then decrements the in_flight counter
>> 6. Here we have the problem: detach_by_parent_aio_cb modifies the graph,
>> so both bdrv_unref_child() and bdrv_attach_child() will have
>> subtree_drains inside. And this causes a deadlock, because the
>> nested drain will wait for in_flight counter to go to zero, which
>> is only happening once the drain itself finishes.
>
> So the problem has nothing to do with detach_by_parent_aio_cb() being
> an I/O function in the sense of locking rules (which it isn't), but with
> calling a drain operation while having the in_flight counter increased.
>
> In other words, an AIO callback like detach_by_parent_aio_cb() must
> never call drain - and it doesn't before your changes to
> bdrv_replace_child_noperm() break it. How confident are we that this
> only breaks tests and not real code?
I am not sure. From a quick look, the AIO callback is really used pretty
much everywhere. Maybe we should really find a way to asseert
GLOBAL_STATE_CODE and friends?
>
> Part of the problem is probably that drain is serving two slightly
> different purposes inside the block layer (just make sure that nothing
> touches the node any more) and callers outside of the block layer (make
> sure that everything has completed and no more callbacks will be
> called). This bug sits exactly in the difference between those two
> concepts - we're waiting for more than we would have to wait for, and it
> causes a deadlock in this combination.
>
> I guess it could be fixed if BlockBackend accounted for requests that
> are already completed, but their callback hasn't yet. blk_drain() would
> then have to wait for those requests, but blk_root_drained_poll()
> wouldn't because these requests don't affect the root node any more.
>
>> Different story is test_detach_by_driver_cb(): in this case,
>> detach_by_parent_aio_cb() does not call detach_indirect_bh(),
>> but it is instead called as a bh running in the main loop by
>> detach_by_driver_cb_drained_begin(), the callback for
>> .drained_begin().
>>
>> This test was added in 231281ab42 and part of the series
>> "Drain fixes and cleanups, part 3"
>> https://lists.nongnu.org/archive/html/qemu-block/2018-05/msg01132.html
>> but as explained above I believe that it is not valid anymore, and
>> can be discarded.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>
> I feel throwing tests away just because they don't pass any more is a
> bit too simple. :-)
Well if the test is doing something it is not supposed to do, why not :)
And anyways this was obviously discussed in IRC with Paolo and somebody
else, can't remember who.
Emanuele
>
> Kevin
>
- [PATCH 6/6] jobs: ensure sleep in job_sleep_ns is fully performed, (continued)