[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-block] [PATCH v6 16/42] block: Flush all children in generic c
From: |
Max Reitz |
Subject: |
Re: [Qemu-block] [PATCH v6 16/42] block: Flush all children in generic code |
Date: |
Mon, 9 Sep 2019 16:15:59 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 |
On 09.09.19 12:01, Kevin Wolf wrote:
> Am 09.09.2019 um 10:31 hat Max Reitz geschrieben:
>> On 05.09.19 18:24, Kevin Wolf wrote:
>>> Am 12.08.2019 um 14:58 hat Max Reitz geschrieben:
>>>> On 10.08.19 17:36, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 09.08.2019 19:13, Max Reitz wrote:
>>>>>> If the driver does not support .bdrv_co_flush() so bdrv_co_flush()
>>>>>> itself has to flush the children of the given node, it should not flush
>>>>>> just bs->file->bs, but in fact all children.
>>>>>>
>>>>>> In any case, the BLKDBG_EVENT() should be emitted on the primary child,
>>>>>> because that is where a blkdebug node would be if there is any.
>>>>>>
>>>>>> Suggested-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>>>>>> Signed-off-by: Max Reitz <address@hidden>
>>>>>> ---
>>>>>> block/io.c | 23 +++++++++++++++++------
>>>>>> 1 file changed, 17 insertions(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/block/io.c b/block/io.c
>>>>>> index c5a8e3e6a3..bcc770d336 100644
>>>>>> --- a/block/io.c
>>>>>> +++ b/block/io.c
>>>>>> @@ -2572,6 +2572,8 @@ static void coroutine_fn bdrv_flush_co_entry(void
>>>>>> *opaque)
>>>>>>
>>>>>> int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>>>>> {
>>>>>> + BdrvChild *primary_child = bdrv_primary_child(bs);
>>>>>> + BdrvChild *child;
>>>>>> int current_gen;
>>>>>> int ret = 0;
>>>>>>
>>>>>> @@ -2601,7 +2603,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState
>>>>>> *bs)
>>>>>> }
>>>>>>
>>>>>> /* Write back cached data to the OS even with cache=unsafe */
>>>>>> - BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS);
>>>>>> + BLKDBG_EVENT(primary_child, BLKDBG_FLUSH_TO_OS);
>>>>>> if (bs->drv->bdrv_co_flush_to_os) {
>>>>>> ret = bs->drv->bdrv_co_flush_to_os(bs);
>>>>>> if (ret < 0) {
>>>>>> @@ -2611,15 +2613,15 @@ int coroutine_fn bdrv_co_flush(BlockDriverState
>>>>>> *bs)
>>>>>>
>>>>>> /* But don't actually force it to the disk with cache=unsafe */
>>>>>> if (bs->open_flags & BDRV_O_NO_FLUSH) {
>>>>>> - goto flush_parent;
>>>>>> + goto flush_children;
>>>>>> }
>>>>>>
>>>>>> /* Check if we really need to flush anything */
>>>>>> if (bs->flushed_gen == current_gen) {
>>>>>> - goto flush_parent;
>>>>>> + goto flush_children;
>>>>>> }
>>>>>>
>>>>>> - BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_DISK);
>>>>>> + BLKDBG_EVENT(primary_child, BLKDBG_FLUSH_TO_DISK);
>>>>>> if (!bs->drv) {
>>>>>> /* bs->drv->bdrv_co_flush() might have ejected the BDS
>>>>>> * (even in case of apparent success) */
>>>>>> @@ -2663,8 +2665,17 @@ int coroutine_fn bdrv_co_flush(BlockDriverState
>>>>>> *bs)
>>>>>> /* Now flush the underlying protocol. It will also have
>>>>>> BDRV_O_NO_FLUSH
>>>>>> * in the case of cache=unsafe, so there are no useless flushes.
>>>>>> */
>>>>>> -flush_parent:
>>>>>> - ret = bs->file ? bdrv_co_flush(bs->file->bs) : 0;
>>>>>> +flush_children:
>>>>>> + ret = 0; > + QLIST_FOREACH(child, &bs->children, next) {
>>>>>> + int this_child_ret;
>>>>>> +
>>>>>> + this_child_ret = bdrv_co_flush(child->bs);
>>>>>> + if (!ret) {
>>>>>> + ret = this_child_ret;
>>>>>> + }
>>>>>> + }
>>>>>
>>>>> Hmm, you said that we want to flush only children with write-access from
>>>>> parent..
>>>>
>>>> Good that you remember it, I must have overlooked it (when reading the
>>>> replies to the previous version). :-)
>>>>
>>>>> Shouldn't we check it? Or we assume that it's always safe to call
>>>>> bdrv_co_flush on
>>>>> a node?
>>>>
>>>> I think it’s always safe. But checking it seems like a nice touch, yes.
>>>
>>> I'm not sure why we would unconditionally flush all children anyway. The
>>> only drivers I can think of that really need to flush more than one
>>> child are blkverify and quorum, and both of them already implement this.
>>> blkverify implements .bdrv_co_flush, so it's not affected by the change
>>> anyway, but quorum children will be flushed twice now.
>>>
>>> But more than this, I'm worried about the overhead of needlessly
>>> recursing through the whole backing chain and calling flush on every
>>> node there. Maybe bs->write_gen saves us so that at least this doesn't
>>> result in an fdatasync() call for each, but still... Without a use case,
>>> I'd rather not do this.
>>>
>>> Oh, well, after having written all of this, I see that qcow2 with an
>>> external data file is buggy... This could be fixed in the qcow2 driver,
>>> but maybe restricting the recursion to read-only is actually good enough
>>> then. Can you mention this case in the commit message and maybe build a
>>> test for it?
>>
>> And I should thus probably drop vmdk’s .bdrv_co_flush_to_disk()
>> implementation.
>>
>> I will indeed try to write a test, but to be completely honest, I feel
>> like this series is long enough.
>
> I guess I could already merge patch 1 to give you space for another test
> patch. ;-)
Don’t forget the mirror-top patch. And AFAIR, there was some comment
from Vladimir that also required an additional patch. So it would need
to be three!
(Or I just start squashing from the back?)
Max
signature.asc
Description: OpenPGP digital signature