[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] bitmaps branch rebase
From: |
John Snow |
Subject: |
Re: [Qemu-devel] bitmaps branch rebase |
Date: |
Wed, 14 Aug 2019 15:03:31 -0400 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 |
On 8/14/19 9:26 AM, Vladimir Sementsov-Ogievskiy wrote:
> 08.08.2019 0:45, John Snow wrote:
>> FYI: I rebased jsnow/bitmaps on top of kwolf/block-next, itself based on
>> top of v4.1.0-rc4.
>>
>> I'll post this along with the eventual pull request, but here's the
>> diffstat against the published patches:
>>
>> 011/33:[0003] [FC] 'block/backup: upgrade copy_bitmap to BdrvDirtyBitmap'
>> 016/33:[----] [-C] 'iotests: Add virtio-scsi device helper'
>> 017/33:[0002] [FC] 'iotests: add test 257 for bitmap-mode backups'
>> 030/33:[0001] [FC] 'block/backup: teach TOP to never copy unallocated
>> regions'
>> 032/33:[0018] [FC] 'iotests/257: test traditional sync modes'
>>
>> 11: A new hbitmap call was added upstream, changed to
>> bdrv_dirty_bitmap_next_zero.
>> 16: Context-only (self.has_quit is new context in 040)
>> 17: Removed 'auto' to follow upstream trends in iotest fashion
>> 30: Remove ret = -ECANCELED as agreed on-list;
>> Context changes for dirty_end patches
>> 32: Fix capitalization in test, as mentioned on list.
>>
>> I think the changes are actually fairly minimal and translate fairly
>> directly; let's review the rebase on-list in response to the PULL mails
>> when I send them.
>>
>
>
> There is a bug in "block/backup: teach TOP to never copy unallocated regions":
>
>
> > @@ -256,6 +287,15 @@ static int coroutine_fn backup_do_cow(BackupBlockJob
> *job,
> > continue; /* already copied */
> > }
> >
> > + if (job->initializing_bitmap) {
> > + ret = backup_bitmap_reset_unallocated(job, start,
> &skip_bytes);
> > + if (ret == 0) {
> > + trace_backup_do_cow_skip_range(job, start, skip_bytes);
> > + start += skip_bytes;
> > + continue;
> > + }
>
> assume ret == 1, so we see skip_bytes of allocated bytes
>
> > + }
> > +
> > dirty_end = bdrv_dirty_bitmap_next_zero(job->copy_bitmap, start,
> > (end - start));
> > if (dirty_end < 0) {
> >
>
> but then, we may copy more than skip_bytes, i.e. touch following possibly
> unallocated area.
>
Yes, Max pointed this out to me. He fixed it in his rebase. I will
probably send his fix as a patch, but then squash it in.
> ===
>
> Also, if want to fix it anyway, I think it's better to make additional while
> loop before this one
> and reset all unallocated from start to end, otherwise we may call
> block_status for every cluster
> on each loop iteration, even if the first call returns skip_bytes >= (end -
> start).
>
Are you worried about the case where backup_bitmap_reset_unallocated
returns as soon as it knows at least one cluster is unallocated, and
thus might re-query regions in the unaligned "tails"?
(That is, if it finds 0 - 92K unallocated, so it confirms [0-64K] as
unallocated, but then needs to spend time re-querying for [64-128K].)
If you'd like to optimize this, I'll invite you to, as a patch.
In practice I wonder if you're often going to run into the case where
block_status wants to return to you information segmented to less than
the cluster size such that you'd be spending a considerable portion of
time re-querying.
I suppose if the job size was e.g. 128K but the native qcow2 size was
64K and you had very perfectly fragmented allocations that you might see
the worst case for re-querying regions.