[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 03/16] migration: Move setup_time to mig_stats
From: |
David Edmondson |
Subject: |
Re: [PATCH v2 03/16] migration: Move setup_time to mig_stats |
Date: |
Tue, 16 May 2023 12:07:32 +0100 |
Juan Quintela <quintela@redhat.com> writes:
> David Edmondson <david.edmondson@oracle.com> wrote:
>> Juan Quintela <quintela@redhat.com> writes:
>>
>>> It is a time that needs to be cleaned each time cancel migration.
>>> Once there create migration_time_since() to calculate how time since a
>>> time in the past.
>>>
>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>>
>>> ---
>>>
>>> Rename to migration_time_since (cédric)
>>> ---
>>> migration/migration-stats.h | 13 +++++++++++++
>>> migration/migration.h | 1 -
>>> migration/migration-stats.c | 7 +++++++
>>> migration/migration.c | 9 ++++-----
>>> 4 files changed, 24 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/migration/migration-stats.h b/migration/migration-stats.h
>>> index e782f1b0df..21402af9e4 100644
>>> --- a/migration/migration-stats.h
>>> +++ b/migration/migration-stats.h
>>> @@ -75,6 +75,10 @@ typedef struct {
>>> * Number of bytes sent during precopy stage.
>>> */
>>> Stat64 precopy_bytes;
>>> + /*
>>> + * How long has the setup stage took.
>>> + */
>>> + Stat64 setup_time;
>>
>> Is this really a Stat64? It doesn't appear to need the atomic update
>> feature.
>
> What this whole Migration Atomic Counters series try to do is that
> everything becomes atomic and then we can use everything everywhere.
>
> Before this series we had (I am simplifying here):
>
> - transferred, precopy_bytes, postcopy_bytes, downtime_bytes -> atomic,
> you can use it anywhere
>
> - qemu_file transferred -> you can only use it from the main migration
> thread
>
> - qemu_file rate_limit -> you can only use it from the main migration
> thread
>
> And we had to update the three counters in every place that we did a
> write wehad to update all of them.
>
> You can see the contorsions that we go to to update the rate_limit and
> the qemu_file transferred fields.
>
> After the series (you need to get what it is already on the tree, this
> series, QEMUFileHooks cleanup, and another serie on my tree waiting for
> this to be commited), you got three counters:
>
> - qemu_file: atomic, everytime we do a qemu_file write we update it
> - multifd_bytes: atomic, everytime that we do a write in a multifd
> channel, we update it.
> - rdma_bytes: atomic, everytime we do a write through RDMA we update it.
>
> And that is it.
>
> Both rate_limit and transferred are derived from these three counters:
>
> - at any point in time migration_transferred_bytes() returns the amount
> of bytes written since the start of the migration:
> qemu_file_bytes + multifd_bytes + rdma_bytes.
>
> - transferred on this period:
> at_start_of_period = migration_transferred_bytes().
> trasferred_in_this_period = migration_transferred_bytes() -
> at_start_of_period;
>
> - Similar for precopy_bytes, postcopy_bytes and downtime_bytes. When we
> move from one stage to the next, we store what is the value of the
> previous stage.
>
> The counters that we use to calculate the rate limit are updated around
> 10 times per second (can be a bit bigger at the end of periods,
> iterations, ...) So performance is not extra critical.
>
> But as we have way less atomic operations (really one per real write),
> we don't really care a lot if we do some atomic operations when a normal
> operation will do.
>
> I.e. I think we have two options:
>
> - have the remaining counters that are only used in the main migration
> thread not be atomic. Document them and remember to do the correct
> thing everytime we use it. If we need to use it in another thread,
> just change it to atomic.
>
> - Make all counters atomic. No need to document anything. And you can
> call any operation/counter/... in migration-stats.c from anywhere.
>
> I think that the second option is better. But I can hear reasons from
> people that think that the 1st one is better.
For the counters, no argument - making them all atomic seems like the
right way forward.
start_time isn't a counter, and isn't manipulated at multiple points in
the code by different actors.
I don't hate it being a Stat64, it just seems odd when the other 'time'
related variables are not.
> Comments?
>
> Later, Juan.
--
You can't hide from the flipside.
- Re: [PATCH v2 01/16] migration: Don't use INT64_MAX for unlimited rate, (continued)
[PATCH v2 03/16] migration: Move setup_time to mig_stats, Juan Quintela, 2023/05/15
Re: [PATCH v2 03/16] migration: Move setup_time to mig_stats, Leonardo Brás, 2023/05/24
[PATCH v2 05/16] migration: Move rate_limit_max and rate_limit_used to migration_stats, Juan Quintela, 2023/05/15
[PATCH v2 04/16] qemu-file: Account for rate_limit usage on qemu_fflush(), Juan Quintela, 2023/05/15