qemu-devel

From: Peter Xu
Subject: Re: [RFC PATCH] Fix race in live migration failure path
Date: Thu, 23 Jan 2025 15:40:25 -0500

On Thu, Jan 23, 2025 at 06:05:28PM +0000, Shivam Kumar wrote:
> 
> 
> > On 23 Jan 2025, at 9:57 PM, Peter Xu <peterx@redhat.com> wrote:
> > 
> > On Thu, Jan 23, 2025 at 09:53:16AM +0000, Shivam Kumar wrote:
> >> 
> >> 
> >>> On 22 Jan 2025, at 10:10 PM, Peter Xu <peterx@redhat.com> wrote:
> >>> 
> >>> Hi, Shivam,
> >>> 
> >>> On Wed, Jan 22, 2025 at 10:54:17AM +0000, Shivam Kumar wrote:
> >>>> There is one place where we set the migration status to FAILED but we
> >>>> don't set s->error, i.e. the save_snapshot workflow (see
> >>>> qemu_savevm_state).  But I am not sure whether setting s->error in that
> >>>> case is possible (or required), as it seems like a different workflow
> >>>> to me.
> >>> 
> >>> Yes, it's very different indeed.  That said, I may not yet fully see how
> >>> switching to s->error (which implies FAILING) would affect this one.
> >>> I mean, in qemu_savevm_state() it tries to fetch qemufile errors with:
> >>> 
> >>>   ret = qemu_file_get_error(f);
> >>> 
> >>> IIUC we could also try to fetch s->error just like the migration thread
> >>> does, and if it is set, that means it's failing?
> >> Yes, I can just set s->error in qemu_savevm_state.
> >> @@ -1686,7 +1686,6 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
> >> -        status = MIGRATION_STATUS_FAILED;
> >> +        s->error = *errp;
> >> But my question was: is migrate_fd_cleanup called (where we will set the
> >> status to FAILED if s->error is set) in the snapshot workflow?
> > 
> > I see what you meant.  Indeed, it's not called.  We may need to handle it
> > the same way migrate_fd_cleanup() does.
> > 
> > So far the snapshot code reuses migration code only partially, so it's not
> > crystal clear where the line is.  E.g., it still moves the migration state
> > machine, but it never shows the "active" phase at all (even though for a
> > major chunk of its duration it is actively migrating data to the snapshot
> > files).  Here the state machine only goes from SETUP to either FAILED or
> > COMPLETED.
> > 
> > From that POV it looks like we should duplicate such s->error detection
> > logic here when deciding whether to switch to FAILED, if we treat s->error
> > as the internal "single failure point" for migration.
> I hope setting the state to FAILED at the end of save_snapshot will work:
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1722,7 +1722,7 @@ cleanup:
>      qemu_savevm_state_cleanup();
> 
>      if (ret != 0) {
> -        qatomic_set(&ms->failing, 1);
> +        migrate_set_error(ms, *errp);
>      } else {
>          migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP,
>                            MIGRATION_STATUS_COMPLETED);
> @@ -3054,6 +3054,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
>      RunState saved_state = runstate_get();
>      uint64_t vm_state_size;
>      g_autoptr(GDateTime) now = g_date_time_new_now_local();
> +    MigrationState *ms;
> 
>      GLOBAL_STATE_CODE();
> 
> @@ -3149,8 +3150,13 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
> 
>   the_end:
>      bdrv_drain_all_end();
> -
> +    ms = migrate_get_current();
>      vm_resume(saved_state);
> +    if (migrate_has_error(ms)) {
> +        migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP,
> +                          MIGRATION_STATUS_FAILED);
> +    }
> +

I actually am not sure whether we need to postpone the FAILED update until
here for save_snapshot.

After all, the whole of qemu_savevm_state() runs with the BQL held, so there
is no way for it to race with another "migrate" (the QMP command "migrate"
needs the BQL too) at this point.

OTOH, it also feels weird to update the state in qemu_savevm_state_cleanup()...

Perhaps we can keep the way qemu_savevm_state() updates the FAILED state, and
then we only need to update FAILED for precopy in migrate_fd_cleanup()?
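
To illustrate what I mean -- only a sketch, not a tested patch, and the
checks around the state transition are assumptions on my side:

    /*
     * Sketch only: during precopy cleanup, turn a set s->error into the
     * FAILED state, instead of setting FAILED at the failure sites.
     * Which states to exclude here is an assumption for illustration.
     */
    static void migrate_fd_cleanup(MigrationState *s)
    {
        /* ... existing cleanup ... */

        if (migrate_has_error(s) &&
            s->state != MIGRATION_STATUS_COMPLETED &&
            s->state != MIGRATION_STATUS_CANCELLED) {
            migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
        }
    }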

We can still check for s->error in qemu_savevm_state(), though: after
switching to a heavier use of s->error, save_snapshot() may hit a failure
that only sets s->error (even while the qemufile is still working), and in
that case we would want to fail the snapshot as well.
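
Roughly, something like the below inside qemu_savevm_state() is what I have
in mind -- again only a sketch; the exact placement, the local ms, and the
-EIO value are illustrative, not a real patch:

    MigrationState *ms = migrate_get_current();

    /* The existing check for qemufile errors ... */
    ret = qemu_file_get_error(f);

    /*
     * ... plus also treating a set s->error as a failure, so that
     * save_snapshot() can fail even while the qemufile itself is still
     * working.  The -EIO is just a placeholder error code.
     */
    if (ret == 0 && migrate_has_error(ms)) {
        ret = -EIO;
    }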

Thanks,

-- 
Peter Xu



