qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH V4 10/14] migration: stop vm for cpr


From: Steven Sistare
Subject: Re: [PATCH V4 10/14] migration: stop vm for cpr
Date: Wed, 13 Mar 2024 10:18:14 -0400
User-agent: Mozilla Thunderbird

On 2/29/2024 8:28 PM, Peter Xu wrote:
On Thu, Feb 29, 2024 at 10:21:14AM -0500, Steven Sistare wrote:
On 2/25/2024 9:08 PM, Peter Xu wrote:
On Thu, Feb 22, 2024 at 09:28:36AM -0800, Steve Sistare wrote:
When migration for cpr is initiated, stop the vm and set state
RUN_STATE_FINISH_MIGRATE before ram is saved.  This eliminates the
possibility of ram and device state being out of sync, and guarantees
that a guest in the suspended state remains suspended, because qmp_cont
rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

cpr-reboot mode keeps changing behavior.

Could we declare it "experimental" until it's solid?  Maybe a patch to
document this?

Normally IMHO we shouldn't merge a feature if it's not complete, however
cpr-reboot is so special that the mode itself is already merged in 8.2
before I started to merge patches, and it keeps changing things.  I don't
know what else we can do here besides declaring it experimental and not
declare it a stable feature.

Hi Peter, the planned/committed functionality for cpr-reboot changed only once, 
in:
     migration: stop vm for cpr

Suspension to support vfio is an enhancement which adds to the basic 
functionality,
it does not change it.  This was planned all along, but submitted as a separate

If VFIO used to migrate without suspension and now it won't, it's a
behavior change?

VFIO could not cpr-reboot migrate without suspension. The existing vfio migration blockers applied to all modes:
  Error: 0000:3a:10.0: VFIO migration is not supported in kernel

Now, with suspension, it will.  An addition, not a change.

series to manage complexity, as I outlined in my qemu community presentation,
which I emailed you at the time.

Other "changes" that arose during review were just clarifications and 
explanations.

So, I don't think cpr-reboot deserves to be condemned to experimental limbo.

IMHO it's not about a feature being condemned, it's about a kindful
heads-up to the users that one needs to take risk on using it until it
becomes stable, it also makes developers easier because of no limitation on
behavior change.  If all the changes are landing, then there's no need for
such a patch.

If so, please propose the planned complete document patch. I had a feeling
that cpr is still not fully understood by even many developers on the list.
It'll be great that such document will contain all the details one needs to
know on the feature, etc. meaning of the name cpr-reboot (what is "cpr"),
when to use it, how to use it, how it differs from "normal" mode
(etc. lifted limitations on some devices that used to block migration?),
what is enforced (vm stop, suspension, etc.) and what is optionally offered
(VFIO, shared-mem, etc.), the expected behaviors, etc.

When send it, please copy relevant people (Alex & Cedric for VFIO, while
Markus could also be a good candidate considering the QMP involvement).

Submitted.

- Steve



reply via email to

[Prev in Thread] Current Thread [Next in Thread]