Re: [External] Re: [PATCH v3 6/7] migration/multifd: Add zero pages and


From: Markus Armbruster
Subject: Re: [External] Re: [PATCH v3 6/7] migration/multifd: Add zero pages and zero bytes counter to migration status interface.
Date: Fri, 01 Mar 2024 06:53:33 +0100
User-agent: Gnus/5.13 (Gnus v5.13)

Hao Xiang <hao.xiang@bytedance.com> writes:

> On Wed, Feb 28, 2024 at 10:01 PM Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Hao Xiang <hao.xiang@bytedance.com> writes:
>>
>> > On Wed, Feb 28, 2024 at 1:52 AM Markus Armbruster <armbru@redhat.com> wrote:
>> >>
>> >> Hao Xiang <hao.xiang@bytedance.com> writes:
>> >>
>> >> > This change extends the MigrationStatus interface to track zero pages
>> >> > and zero bytes counter.
>> >> >
>> >> > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
>> >>
>> >> [...]
>> >>
>> >> > diff --git a/qapi/migration.json b/qapi/migration.json
>> >> > index a0a85a0312..171734c07e 100644
>> >> > --- a/qapi/migration.json
>> >> > +++ b/qapi/migration.json
>> >> > @@ -63,6 +63,10 @@
>> >> >  #     between 0 and @dirty-sync-count * @multifd-channels.  (since
>> >> >  #     7.1)
>> >> >  #
>> >> > +# @zero-pages: number of zero pages (since 9.0)
>> >> > +#
>> >> > +# @zero-bytes: number of zero bytes sent (since 9.0)
>> >> > +#
>> >>
>> >> Awfully terse.  How are these two related?
>> >
>> > Sorry I forgot to address the same feedback from the last version.
>>
>> Happens :)
>>
>> > zero-pages is the number of pages detected as all "zero", whose
>> > payload therefore isn't sent over the network. zero-bytes is basically
>> > zero-pages * page_size: the number of bytes migrated (but not
>> > actually sent through the network) because they are all "zero". These
>> > two are related to the existing interface below. normal and
>> > normal-bytes are the same representation for pages that are not all
>> > "zero" and are actually sent through the network.
>> >
>> > # @normal: number of normal pages (since 1.2)
>> > #
>> > # @normal-bytes: number of normal bytes sent (since 1.2)
>>
>> We also have
>>
>>   # @duplicate: number of duplicate (zero) pages (since 1.2)
>>   #
>>   # @skipped: number of skipped zero pages. Always zero, only provided for
>>   #     compatibility (since 1.5)
>>
>> Page skipping was introduced in 1.5, and withdrawn in 1.5.3 and 1.6.
>> @skipped was formally deprecated in 8.1.  It'll soon be gone, no need to
>> worry about it now.
>>
>> That leaves three values related to pages sent: @normal (and
>> @normal-bytes), @duplicate (but no @duplicate-bytes), and @zero-pages
>> (and @zero-bytes).
>>
>> I unwittingly created a naming inconsistency between @normal,
>> @duplicate, and @zero-pages when I asked you to rename @zero to
>> @zero-pages.
>>
>> The meaning of the three values is not obvious, and the doc comments
>> don't explain them.  Can you, or anybody familiar with migration,
>> explain them to me?
>>
>> MigrationStats returns some values as bytes, some as pages, and some as
>> both.  I hate that.  Can we standardize on bytes?
>
> I added zero/zero-bytes because I thought they were not there. But it
> turns out "duplicate" is for that purpose. "zero/zero-bytes" is really
> additional information to "normal/normal-bytes". Peter suggested that
> if we add "zero/zero-bytes" we can slowly retire "duplicate" at a
> later point.

"zero" is a better name than "duplicate".  Identical non-zero pages are
possible, and they are duplicates, too.

If you add @zero with the intent to replace @duplicate, you should
immediately deprecate @duplicate.  If you need assistance with that,
just ask.
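
To illustrate why a deprecation period helps clients, here is a minimal
Python sketch of a management application that prefers the new member
and falls back to @duplicate.  Assumptions: the new member name
("zero-pages" here) is not settled in this thread, the helper
zero_page_count() is hypothetical, and the two dictionaries are
hand-written samples, not real query-migrate output.

```python
def zero_page_count(ram_stats, new_name="zero-pages", old_name="duplicate"):
    """Return the zero page count from a MigrationStats-like dict.

    Prefers the new member and falls back to the old one, so the same
    client works before and after the old member is removed.
    Both member names are parameters because the final name is an
    assumption here.
    """
    if new_name in ram_stats:
        return ram_stats[new_name]
    return ram_stats[old_name]


# Hand-written sample replies (hypothetical values).
old_reply = {"normal": 1000, "duplicate": 250, "page-size": 4096}
new_reply = {"normal": 1000, "duplicate": 250, "zero-pages": 250,
             "page-size": 4096}
future_reply = {"normal": 1000, "zero-pages": 250, "page-size": 4096}

counts = [zero_page_count(r) for r in (old_reply, new_reply, future_reply)]
```

All three sample replies yield the same count, which is the point of
keeping both members around while @duplicate is deprecated.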

> I don't know the historical reason why pages/bytes are used the way they
> are today. The way I understand migration, the granularity of RAM
> migration is "page". There are only two types of pages: 1) normal 2)
> zero. Zero pages' payload is not sent through the network because we
> already know what it looks like. Only the page offset is sent. Normal
> pages are pages that are not zero. The entire page is sent through the
> network to the target host.

This is not at all clear from the documentation of MigrationStats.  I
think the documentation needs improvement there.
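
For readers following along, the model described in the quoted text can
be sketched in a few lines of Python.  This is not QEMU's code: the
function classify_pages(), the fixed 4096-byte page size, and the sample
buffer are all assumptions made for illustration, and the buffer length
is assumed to be a multiple of the page size.

```python
PAGE_SIZE = 4096  # assumed for the sketch; the real value is @page-size


def classify_pages(memory, page_size=PAGE_SIZE):
    """Count pages that are entirely zero versus all other pages.

    In the model described above, only the offset of a zero page would
    be sent, while a normal page would be sent in full.
    """
    stats = {"normal": 0, "zero": 0}
    zero_page = bytes(page_size)  # a page consisting only of 0x00 bytes
    for offset in range(0, len(memory), page_size):
        page = memory[offset:offset + page_size]
        if page == zero_page:
            stats["zero"] += 1
        else:
            stats["normal"] += 1
    return stats


# Four pages: zero, non-zero, zero, zero.
memory = bytes(PAGE_SIZE) + b"\x01" * PAGE_SIZE + bytes(2 * PAGE_SIZE)
stats = classify_pages(memory)
```

With this sample buffer, one page counts as normal and three as zero.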

>                             if a user knows the zero/normal count,
> they can already calculate the zero-bytes/normal-bytes (zero/normal *
> page size)

Yes, because member @page-size tells them the multiplier.

>            but it's just convenient to see both. During development, I
> check on these counters a lot and they are useful.

QMP is for machines.  Machines don't need or want the same quantity in
two units.  Providing them both bytes and pages is a design mistake.
Whether it's worth correcting now is of course debatable.

Regardless, the fact @normal-bytes = @normal * @page-size needs to be
documented.  We have

    # @page-size: The number of bytes per page for the various page-based
    #     statistics (since 2.10)

The fact that I inquired how zero-pages and zero-bytes are related might
indicate that this isn't quite clear enough.
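
Spelled out as a minimal Python sketch, the relationship a client would
rely on looks like this.  Assumptions: derived_bytes() is a hypothetical
helper, "zero-pages" is the member name proposed in this patch, and the
sample dictionary is hand-written rather than taken from a real
migration.

```python
def derived_bytes(ram_stats):
    """Derive byte counts from page counts using @page-size."""
    page_size = ram_stats["page-size"]
    return {
        "normal-bytes": ram_stats["normal"] * page_size,
        "zero-bytes": ram_stats["zero-pages"] * page_size,
    }


# Hand-written sample (hypothetical values).
sample = {
    "normal": 1000,
    "normal-bytes": 4096000,
    "zero-pages": 250,
    "page-size": 4096,
}

derived = derived_bytes(sample)
# True when the reported byte count equals the page count times @page-size.
consistent = derived["normal-bytes"] == sample["normal-bytes"]
```

If the byte members are always exactly the page members times
@page-size, a client needs only one of the two plus @page-size.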

[...]



