[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 1/2] migration: rework migrate_set_error() to migrate_repo
From: |
Peter Xu |
Subject: |
Re: [PATCH v2 1/2] migration: rework migrate_set_error() to migrate_report_err() |
Date: |
Wed, 24 Apr 2024 17:52:09 -0400 |
On Wed, Apr 24, 2024 at 10:50:14PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 24.04.24 22:40, Peter Xu wrote:
> > On Wed, Apr 24, 2024 at 08:42:44PM +0300, Vladimir Sementsov-Ogievskiy
> > wrote:
> > > 1. Most of callers want to free the error after call. Let's help them.
> > >
> > > 2. Some callers log the error, some not. We also have places where we
> > > log the stored error. Let's instead simply log every migration
> > > error.
> > >
> > > 3. Some callers have to retrieve current migration state only to pass
> > > it to migrate_set_error(). In the new helper let's get the state
> > > automatically.
> > >
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> > > ---
> > > migration/migration.c | 48 ++++++++++++----------------------------
> > > migration/migration.h | 2 +-
> > > migration/multifd.c | 18 ++++++---------
> > > migration/postcopy-ram.c | 3 +--
> > > migration/savevm.c | 16 +++++---------
> > > 5 files changed, 28 insertions(+), 59 deletions(-)
> > >
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 696762bc64..806b7b080b 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -285,7 +285,7 @@ void migration_bh_schedule(QEMUBHFunc *cb, void
> > > *opaque)
> > > void migration_cancel(const Error *error)
> > > {
> > > if (error) {
> > > - migrate_set_error(current_migration, error);
> > > + migrate_report_err(error_copy(error));
> > > }
> > > if (migrate_dirty_limit()) {
> > > qmp_cancel_vcpu_dirty_limit(false, -1, NULL);
> > > @@ -779,13 +779,6 @@ process_incoming_migration_co(void *opaque)
> > > }
> > > if (ret < 0) {
> > > - MigrationState *s = migrate_get_current();
> > > -
> > > - if (migrate_has_error(s)) {
> > > - WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
> > > - error_report_err(s->error);
> > > - }
> > > - }
> > > error_report("load of migration failed: %s", strerror(-ret));
> > > goto fail;
> > > }
> > > @@ -1402,10 +1395,6 @@ static void migrate_fd_cleanup(MigrationState *s)
> > > MIGRATION_STATUS_CANCELLED);
> > > }
> > > - if (s->error) {
> > > - /* It is used on info migrate. We can't free it */
> > > - error_report_err(error_copy(s->error));
> > > - }
> > > type = migration_has_failed(s) ? MIG_EVENT_PRECOPY_FAILED :
> > > MIG_EVENT_PRECOPY_DONE;
> > > migration_call_notifiers(s, type, NULL);
> > > @@ -1418,12 +1407,14 @@ static void migrate_fd_cleanup_bh(void *opaque)
> > > migrate_fd_cleanup(opaque);
> > > }
> > > -void migrate_set_error(MigrationState *s, const Error *error)
> > > +void migrate_report_err(Error *error)
> > > {
> > > + MigrationState *s = migrate_get_current();
> >
> > Avoid passing in *s looks ok.
> >
> > > QEMU_LOCK_GUARD(&s->error_mutex);
> > > if (!s->error) {
> > > s->error = error_copy(error);
> >
> > I think I get your point, but then looks like this error_copy() should be
> > removed but forgotten?
> >
> > I remember I had an attempt to do similarly (but only transfer the
> > ownership):
> >
> > https://lore.kernel.org/qemu-devel/20230829214235.69309-3-peterx@redhat.com/
> >
> > However I gave up later and I forgot why. One reason could be that I hit a
> > use-after-free, then found that well indeed leaking an Error is much better
> > than debugging a UAF.
>
> Hmm, yes I saw a leaked Error somewhere, and this patch should fix it. But
> may be I missed introducing this use-after-free again)
Ah do you mean you fixed a bug? If so please use a standalone patch for
that and we'll need to copy stable.
I did notice that in this series on patch looks like does more than one
thing. It'll be helpful too for reviewers when patch can be split
properly.
>
> >
> > So maybe we simply keep it like before? If you like such change, let's
> > just be extremely caucious.
> >
> > > }
> > > + error_report_err(error);
> >
> > The ideal case to me is we don't trigger an error_report() all over the
> > place. We're slightly going backward from that regard, IMHO..
> >
> > Ideally I think we shouldn't dump anything to stderr, but user should
> > always query from qmp.
> >
>
> Sounds reasonable to me. I'm just unsure, if keep it like before, how
> should I correctly update logging to stderr in
> process_incoming_migration_co(). Could we just drop error reporting to
> stderr? Or should it be kept as is for the case when exit_on_error is
> true?
I'm not sure I get the question, but I'll comment in patch 2 very soon, so
we can move the discussion there.
Thanks,
--
Peter Xu
[PATCH v2 2/2] qapi: introduce exit-on-error paramter for migrate-incoming, Vladimir Sementsov-Ogievskiy, 2024/04/24