qemu-stable
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-stable] [Qemu-block] [Qemu-devel] [PATCH] block/backup: instal


From: Stefan Hajnoczi
Subject: Re: [Qemu-stable] [Qemu-block] [Qemu-devel] [PATCH] block/backup: install notifier during creation
Date: Tue, 10 Sep 2019 10:19:42 +0200
User-agent: Mutt/1.12.1 (2019-06-15)

On Wed, Aug 21, 2019 at 04:01:52PM -0400, John Snow wrote:
> 
> 
> On 8/21/19 10:41 AM, Vladimir Sementsov-Ogievskiy wrote:
> > 09.08.2019 23:13, John Snow wrote:
> >> Backup jobs may yield prior to installing their handler, because of the
> >> job_co_entry shim which guarantees that a job won't begin work until
> >> we are ready to start an entire transaction.
> >>
> >> Unfortunately, this makes proving correctness about transactional
> >> points-in-time for backup hard to reason about. Make it explicitly clear
> >> by moving the handler registration to creation time, and changing the
> >> write notifier to a no-op until the job is started.
> >>
> >> Reported-by: Vladimir Sementsov-Ogievskiy <address@hidden>
> >> Signed-off-by: John Snow <address@hidden>
> >> ---
> >>   block/backup.c     | 32 +++++++++++++++++++++++---------
> >>   include/qemu/job.h |  5 +++++
> >>   job.c              |  2 +-
> >>   3 files changed, 29 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/block/backup.c b/block/backup.c
> >> index 07d751aea4..4df5b95415 100644
> >> --- a/block/backup.c
> >> +++ b/block/backup.c
> >> @@ -344,6 +344,13 @@ static int coroutine_fn backup_before_write_notify(
> >>       assert(QEMU_IS_ALIGNED(req->offset, BDRV_SECTOR_SIZE));
> >>       assert(QEMU_IS_ALIGNED(req->bytes, BDRV_SECTOR_SIZE));
> >>   
> >> +    /* The handler is installed at creation time; the actual point-in-time
> >> +     * starts at job_start(). Transactions guarantee those two points are
> >> +     * the same point in time. */
> >> +    if (!job_started(&job->common.job)) {
> >> +        return 0;
> >> +    }
> > 
> > Hmm, sorry if it is a stupid question, I'm not good in multiprocessing and 
> > in
> > Qemu iothreads..
> > 
> > job_started just reads job->co. If bs runs in iothread, and therefore 
> > write-notifier
> > is in iothread, when job_start is called from main thread.. Is it 
> > guaranteed that
> > write-notifier will see job->co variable change early enough to not miss 
> > guest write?
> > Should not job->co be volatile for example or something like this?
> > 
> > If not think about this patch looks good for me.
> > 
> 
> You know, it's a really good question.
> So good, in fact, that I have no idea.
> 
> ¯\_(ツ)_/¯
> 
> I'm fairly certain that IO will not come in until the .clean phase of a
> qmp_transaction, because bdrv_drained_begin(bs) is called during
> .prepare, and we activate the handler (by starting the job) in .commit.
> We do not end the drained section until .clean.
> 
> I'm not fully clear on what threading guarantees we have otherwise,
> though; is it possible that "Thread A" would somehow lift the bdrv_drain
> on an IO thread ("Thread B") and, after that, "Thread B" would somehow
> still be able to see an outdated version of job->co that was set by
> "Thread A"?
> 
> I doubt it; but I can't prove it.

In the qmp_backup() case (not qmp_transaction()) there is:

  void qmp_drive_backup(DriveBackup *arg, Error **errp)
  {

      BlockJob *job;
      job = do_drive_backup(arg, NULL, errp);
      if (job) {
          job_start(&job->job);
      }
  }

job_start() is called without any thread synchronization, which is
usually fine because the coroutine doesn't run until job_start() calls
aio_co_enter().

Now that the before write notifier has been installed early, there is
indeed a race between job_start() and the write notifier accessing
job->co from an IOThread.

The write before notifier might see job->co != NULL before job_start()
has finished.  This could lead to issues if job_*() APIs are invoked by
the write notifier and access an in-between job state.

A safer approach is to set a BackupBlockJob variable at the beginning of
backup_run() and check it from the before write notifier.

That said, I don't understand the benefit of this patch and IMO it makes
the code harder to understand because now we need to think about the
created but not started state too.

Stefan

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]