[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v1] migration: refactor migration_completion
From: |
Peter Xu |
Subject: |
Re: [PATCH v1] migration: refactor migration_completion |
Date: |
Wed, 26 Jul 2023 13:10:06 -0400 |
On Fri, Jul 21, 2023 at 11:14:55AM +0000, Wang, Wei W wrote:
> On Friday, July 21, 2023 4:38 AM, Peter Xu wrote:
> > Looks good to me, after addressing Isaku's comments.
> >
> > The current_active_state is very unfortunate, along with most of the calls
> > to
> > migrate_set_state() - I bet most of the code will definitely go wrong if
> > that
> > cmpxchg didn't succeed inside of migrate_set_state(), IOW in most cases we
> > simply always want:
>
> Can you share examples where it could be wrong?
> (If it has bugs, we need to fix)
Nop. What I meant is most of the cases we want to set the state without
caring much about the old state, so at least we can have a helper like
below and simply call migrate_set_state(s, STATE) where we don't care old
state.
>
> >
> > migrate_set_state(&s->state, s->state, XXX);
> >
> > Not sure whether one pre-requisite patch is good to have so we can rename
> > migrate_set_state() to something like __migrate_set_state(), then:
> >
> > migrate_set_state(s, XXX) {
> > __migrate_set_state(&s->state, s->state, XXX);
> > }
> >
> > I don't even know whether there's any call site that will need
> > __migrate_set_state() for real..
> >
>
> Seems this would break the use of "MIGRATION_STATUS_CANCELLING".
> For example,
> - In migration_maybe_pause:
> migrate_set_state(&s->state, MIGRATION_STATUS_PRE_SWITCHOVER,
> new_state);
> If the current s->state isn't MIGRATION_STATUS_PRE_SWITCHOVER (could
> be MIGRATION_STATUS_CANCELLING), then s->state won’t be updated to
> new_state.
> - Then, in migration_completion, the following update to s->state won't
> succeed:
> migrate_set_state(&s->state, current_active_state,
> MIGRATION_STATUS_COMPLETED);
>
> - Finally, when reaching migration_iteration_finish(), s->state is
> MIGRATION_STATUS_CANCELLING, instead of MIGRATION_STATUS_COMPLETED.
The whole state changes are just flaky to me in general, even with the help
of old_state cmpxchg.
E.g., I'm wondering whether below race can happen, assuming we're starting
with ACTIVE state and just about to complete migration:
main thread migration thread
----------- ----------------
migration_maybe_pause(current_active_state==ACTIVE)
if (s->state !=
MIGRATION_STATUS_CANCELLING)
--> true, keep setting state
qemu_mutex_unlock_iothread();
qemu_mutex_lock_iothread();
migrate_fd_cancel()
if (old_state == MIGRATION_STATUS_PRE_SWITCHOVER)
--> false, not posting to pause_sem
set state to MIGRATION_STATUS_CANCELLING
migrate_set_state(&s->state,
*current_active_state,
MIGRATION_STATUS_PRE_SWITCHOVER);
--> false, cmpxchg fail
qemu_sem_wait(&s->pause_sem);
--> hang death?
Thanks,
--
Peter Xu