On 3/15/24 12:01, Peter Xu wrote:
On Fri, Mar 15, 2024 at 11:17:45AM +0100, Cédric Le Goater wrote:
migrate_set_state is also unintuitive because it ignores invalid state
transitions and we've been using that property to deal with special
states such as POSTCOPY_PAUSED and FAILED:
- After the migration goes into POSTCOPY_PAUSED, the resumed migration's
migrate_init() will try to set the state NONE->SETUP, which is not
valid.
- After save_setup fails, the migration goes into FAILED, but wait_unplug
will try to transition SETUP->ACTIVE, which is also not valid.
I am not sure I understand what the plan is. Both solutions are problematic
regarding the state transitions.
Should we consider that waiting for failover devices to unplug is an internal
step of the SETUP phase not transitioning to ACTIVE ?
If to unblock this series, IIUC the simplest solution is to do what Fabiano
suggested, that we move qemu_savevm_wait_unplug() to be before the check of
setup() ret.
The simplest is IMHO moving qemu_savevm_wait_unplug() before
qemu_savevm_state_setup() and leave patch 10 is unchanged. See
below the extra patch. It looks much cleaner than what we have
today.
In that case, the state change in qemu_savevm_wait_unplug()
should be benign and we should see a super small window it became ACTIVE
but then it should be FAILED (and IIUC the patch itself will need to use
ACTIVE as "old_state", not SETUP anymore).
OK. I will give it a try to compare.