From: Peter Xu
Subject: Re: [PATCH RFC 2/2] migration: abort on destination if switchover limit exceeded
Date: Tue, 25 Jun 2024 15:03:56 -0400

On Tue, Jun 25, 2024 at 05:31:19PM +0100, Joao Martins wrote:
> The device-state multifd scaling is a take on improving switchover phase,
> and we will keep improving it whenever we find things... but the

That'll be helpful, thanks. Just a quick note that "reducing downtime" is
a separate issue from "making downtime_limit accurate".

> switchover itself can't be 'precomputed' into a downtime number equation
> ahead of time to encompass all possible latencies/costs. Part of the
> reason that at least we couldn't think of a way besides this proposal
> here, which at the core it's meant to bounds check switchover. Even
> without taking into account VFs/HW[0], it is simply not considered how
> long it might take and giving some sort of downtime buffer coupled with
> enforcement that can be enforced helps not violating migration SLAs.

I agree such enforcement alone can be useful in general as a way to fall
back. That said, I think it would definitely be nice to attach more
information on the downtime analysis when reposting this series, if there
is any.

For example, regardless of whether QEMU can do proper predictions at all,
there can be data / results showing which major parts are missing from
the current calculations, i.e. an expectation of when the fallback can
trigger, and some justification of why they can't be predicted.

IMHO the enforcement won't make much sense if it keeps triggering; in that
case people will simply not use it, as it stops migrations from happening.
Ultimately the work will still be needed to make downtime_limit accurate.
The fallback should only be a last fence to guard the promise, and it
should only cover the "corner cases".
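
To make the distinction above concrete, here is a minimal sketch (not QEMU code) that separates the two pieces being discussed: a prediction that decides when to start switchover, and an enforcement check that only acts as a fallback once switchover is running. All names (`estimated_downtime_ms`, `can_switch_over`, `enforce_switchover_limit`, `SwitchoverLimitExceeded`) are hypothetical, and the estimate assumes downtime is simply the remaining data divided by the measured bandwidth, which is exactly the kind of calculation that may miss other switchover costs.

```python
class SwitchoverLimitExceeded(Exception):
    """Raised when the measured switchover time exceeds the enforced limit."""


def estimated_downtime_ms(pending_bytes: int, bandwidth_bytes_per_ms: float) -> float:
    # Assumption: downtime is modelled only as remaining data / bandwidth.
    # Device-state save/load and other switchover costs are NOT included.
    if bandwidth_bytes_per_ms <= 0:
        raise ValueError("bandwidth must be positive")
    return pending_bytes / bandwidth_bytes_per_ms


def can_switch_over(pending_bytes: int, bandwidth_bytes_per_ms: float,
                    downtime_limit_ms: float) -> bool:
    # Prediction: start switchover only if the estimate fits the limit.
    estimate = estimated_downtime_ms(pending_bytes, bandwidth_bytes_per_ms)
    return estimate <= downtime_limit_ms


def enforce_switchover_limit(start_ms: float, now_ms: float,
                             switchover_limit_ms: float) -> float:
    # Enforcement: a last-resort bounds check on the time actually spent.
    elapsed_ms = now_ms - start_ms
    if elapsed_ms > switchover_limit_ms:
        raise SwitchoverLimitExceeded(
            f"switchover took {elapsed_ms} ms, limit is {switchover_limit_ms} ms"
        )
    return elapsed_ms
```

In this sketch, if `can_switch_over()` returns true but `enforce_switchover_limit()` still raises regularly, the gap between the two is the part of the downtime that the prediction does not account for, which is the data it would be useful to see.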
--
Peter Xu