
[Qemu-commits] [qemu/qemu] c06033: block: Fix qemu crash when using scsi-block


From: GitHub
Subject: [Qemu-commits] [qemu/qemu] c06033: block: Fix qemu crash when using scsi-block
Date: Fri, 09 Mar 2018 10:28:23 -0800

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: c060332c762acbe7899b01edda972da7f9ec9057
      https://github.com/qemu/qemu/commit/c060332c762acbe7899b01edda972da7f9ec9057
  Author: Deepa Srinivasan <address@hidden>
  Date:   2018-03-08 (Thu, 08 Mar 2018)

  Changed paths:
    M block/block-backend.c

  Log Message:
  -----------
  block: Fix qemu crash when using scsi-block

Starting qemu with the following arguments causes qemu to segfault:
... -device lsi,id=lsi0 \
    -drive file=iscsi:<...>,format=raw,if=none,node-name=iscsi1 \
    -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1

This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More
details about the bug follow.

blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().

When blk_aio_ioctl() is executed from within a coroutine context (e.g.
iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
the current coroutine's wakeup queue. blk_aio_ioctl() then returns.

When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
....
    BlkRwCo *rwco = &acb->rwco;

    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
                       rwco->qiov->iov[0].iov_base);  <--- qiov is
                                                           invalid here
...

When blk_aio_ioctl() is called from a non-coroutine context,
blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
qemu_coroutine_yield(), blk_aio_ioctl() returns. When the coroutine
execution is complete, control returns to blk_aio_ioctl_entry() after the call
to blk_co_ioctl(). Nothing dereferences the invalid pointers after this point,
but the function is still holding on to them.

The fix is to change blk_aio_prwv() to accept a void pointer for the IO buffer
rather than a QEMUIOVector. blk_aio_prwv() passes this through in BlkRwCo, and
the coroutine function casts it to QEMUIOVector or uses the void pointer
directly.
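
The lifetime problem and the shape of the fix can be modelled outside QEMU.
The sketch below is not the code in block/block-backend.c: Request,
submit_ioctl(), ioctl_entry() and run_deferred() are hypothetical stand-ins
for BlkRwCo, blk_aio_ioctl(), blk_aio_ioctl_entry() and the deferred coroutine
wakeup. It only illustrates storing the caller's buffer pointer by value in a
heap-allocated request, so that nothing on the submitter's stack is referenced
when the deferred entry finally runs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for BlkRwCo: heap-allocated, so it outlives the
 * function that submits the request. */
typedef struct Request {
    void *iobuf;                          /* caller's buffer, stored by value */
    int ret;
    void (*entry)(struct Request *req);
} Request;

/* One-slot "wakeup queue": a request whose entry function runs later. */
static Request *pending;

/* Deferred entry point: uses the void pointer directly as the buffer. */
static void ioctl_entry(Request *req)
{
    char *buf = req->iobuf;
    strcpy(buf, "done");
    req->ret = 0;
}

/* Submit a request. No local variable of this function is reachable from
 * the request once it returns. */
static Request *submit_ioctl(void *buf)
{
    Request *req = malloc(sizeof(*req));
    assert(req != NULL);
    req->iobuf = buf;
    req->ret = -1;
    req->entry = ioctl_entry;
    pending = req;                        /* queued; entry has not run yet */
    return req;
}

/* Run the queued entry, after submit_ioctl() has already returned. */
static void run_deferred(void)
{
    Request *req = pending;
    pending = NULL;
    if (req) {
        req->entry(req);
    }
}
```

The pattern the log message describes as buggy would instead build a wrapper
object as a local variable inside submit_ioctl() and store its address in the
request, leaving run_deferred() with a pointer into a dead stack frame.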

Signed-off-by: Deepa Srinivasan <address@hidden>
Signed-off-by: Konrad Rzeszutek Wilk <address@hidden>
Reviewed-by: Mark Kanda <address@hidden>
Reviewed-by: Paolo Bonzini <address@hidden>
Signed-off-by: Stefan Hajnoczi <address@hidden>


  Commit: 7c9e2748297d01222f5d5cbcdac7ec8126f1b139
      https://github.com/qemu/qemu/commit/7c9e2748297d01222f5d5cbcdac7ec8126f1b139
  Author: Fam Zheng <address@hidden>
  Date:   2018-03-08 (Thu, 08 Mar 2018)

  Changed paths:
    M README

  Log Message:
  -----------
  README: Fix typo 'git-publish'

Reported-by: Alberto Garcia <address@hidden>
Signed-off-by: Fam Zheng <address@hidden>
Reviewed-by: Philippe Mathieu-Daudé <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>


  Commit: 12c1c7d7cefb4dbffeb5712e75a33e4692f0a76b
      https://github.com/qemu/qemu/commit/12c1c7d7cefb4dbffeb5712e75a33e4692f0a76b
  Author: Sergio Lopez <address@hidden>
  Date:   2018-03-08 (Thu, 08 Mar 2018)

  Changed paths:
    M hw/block/dataplane/virtio-blk.c

  Log Message:
  -----------
  virtio-blk: dataplane: Don't batch notifications if EVENT_IDX is present

Commit 5b2ffbe4d99843fd8305c573a100047a8c962327 ("virtio-blk: dataplane:
notify guest as a batch") deferred guest notification to a BH in order to
batch notifications, with the purpose of avoiding flooding the guest with
interrupts.

This optimization came with a cost. The average latency perceived in the
guest increases by a few microseconds, and when multiple IO operations
finish at the same time, the guest is not notified until the completions
for all of those operations have run. In contrast, virtio-scsi issues the
notification at the end of each completion.

On the other hand, nowadays we have the EVENT_IDX feature, which allows
better coordination between QEMU and the guest OS to avoid sending
unnecessary interrupts.

With this change, virtio-blk/dataplane only batches notifications if the
EVENT_IDX feature is not present.
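
The decision described above can be sketched as a small model. This is an
illustration under stated assumptions, not the code in
hw/block/dataplane/virtio-blk.c: the Dataplane struct, complete_request() and
flush_batch() are hypothetical names, and a counter stands in for actually
notifying the guest.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a dataplane device (not a QEMU type). */
typedef struct Dataplane {
    bool event_idx;        /* guest negotiated EVENT_IDX */
    bool batch_pending;    /* a deferred batch notification is scheduled */
    int notifications;     /* notification attempts made so far */
} Dataplane;

/* Called once per completed request. */
static void complete_request(Dataplane *dp)
{
    if (dp->event_idx) {
        /* With EVENT_IDX the guest can suppress unneeded interrupts
         * itself, so attempt the notification right away. */
        dp->notifications++;
    } else {
        /* Without EVENT_IDX, coalesce by deferring to a later flush. */
        dp->batch_pending = true;
    }
}

/* Stand-in for the bottom half that sends one batched notification. */
static void flush_batch(Dataplane *dp)
{
    if (dp->batch_pending) {
        dp->batch_pending = false;
        dp->notifications++;
    }
}
```

In this model three completions with event_idx set produce three immediate
notification attempts, while three completions without it produce a single
notification once flush_batch() runs.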

Some numbers obtained with fio:
 - Test specs:
   * fio-3.4 (ioengine=sync, iodepth=1, direct=1)
   * qemu master
   * virtio-blk with a dedicated iothread (default poll-max-ns)
   * backend: null_blk nr_devices=1 irqmode=2 completion_nsec=280000
   * 8 vCPUs pinned to isolated physical cores
   * Emulator and iothread also pinned to separate isolated cores
   * variance between runs < 1%

 - Not patched:
   * numjobs=1:  lat_avg=327.32  irqs=29998
   * numjobs=4:  lat_avg=337.89  irqs=29073
   * numjobs=8:  lat_avg=342.98  irqs=28643

 - Patched:
   * numjobs=1:  lat_avg=323.92  irqs=30262
   * numjobs=4:  lat_avg=332.65  irqs=29520
   * numjobs=8:  lat_avg=335.54  irqs=29323
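
To read the two result sets side by side, the relative change per numjobs
value can be computed directly from the figures above. The helper below is a
convenience sketch, not part of the patch: the Sample struct, rel_change_pct()
and print_report() are hypothetical names, and the values are copied from the
lists above.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the results above: not patched vs. patched. */
typedef struct Sample {
    int numjobs;
    double lat_before, lat_after;
    double irqs_before, irqs_after;
} Sample;

static const Sample samples[] = {
    { 1, 327.32, 323.92, 29998, 30262 },
    { 4, 337.89, 332.65, 29073, 29520 },
    { 8, 342.98, 335.54, 28643, 29323 },
};

/* Relative change from before to after, in percent. */
static double rel_change_pct(double before, double after)
{
    return (after - before) / before * 100.0;
}

/* Print one line per numjobs value with both relative changes. */
static void print_report(void)
{
    for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        const Sample *s = &samples[i];
        printf("numjobs=%d: lat_avg %+.2f%%, irqs %+.2f%%\n",
               s->numjobs,
               rel_change_pct(s->lat_before, s->lat_after),
               rel_change_pct(s->irqs_before, s->irqs_after));
    }
}
```

With these inputs lat_avg drops by roughly 1.0%, 1.6% and 2.2% for numjobs=1,
4 and 8, while irqs rises by roughly 0.9%, 1.5% and 2.4%. Given the reported
run-to-run variance of under 1%, the numjobs=1 differences sit close to that
bound.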

Signed-off-by: Sergio Lopez <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>


  Commit: b89d92f3cfc0f6e6d05e146e7a5fb8c759978051
      https://github.com/qemu/qemu/commit/b89d92f3cfc0f6e6d05e146e7a5fb8c759978051
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2018-03-08 (Thu, 08 Mar 2018)

  Changed paths:
    M include/block/aio-wait.h
    M util/aio-wait.c

  Log Message:
  -----------
  block: add aio_wait_bh_oneshot()

Sometimes it's necessary for the main loop thread to run a BH in an
IOThread and wait for its completion.  This primitive is useful during
startup/shutdown to synchronize and avoid race conditions.
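
The general shape of such a "run it over there and wait" primitive can be
sketched with plain POSIX threads. This is a model under stated assumptions,
not the implementation in util/aio-wait.c: Worker, worker_start(),
run_oneshot_and_wait(), worker_stop() and set_flag() are hypothetical names,
and a mutex plus condition variable stand in for QEMU's AioContext machinery.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef void OneshotFunc(void *opaque);

/* Hypothetical worker standing in for an IOThread with a one-slot queue. */
typedef struct Worker {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    OneshotFunc *job;          /* pending job, or NULL */
    void *job_opaque;
    bool job_done;
    bool quit;
} Worker;

static void *worker_run(void *arg)
{
    Worker *w = arg;

    pthread_mutex_lock(&w->lock);
    while (!w->quit) {
        if (w->job) {
            OneshotFunc *fn = w->job;
            void *opaque = w->job_opaque;
            w->job = NULL;
            pthread_mutex_unlock(&w->lock);
            fn(opaque);                        /* runs in the worker thread */
            pthread_mutex_lock(&w->lock);
            w->job_done = true;
            pthread_cond_broadcast(&w->cond);  /* wake the waiting caller */
        } else {
            pthread_cond_wait(&w->cond, &w->lock);
        }
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

static void worker_start(Worker *w)
{
    int rc;

    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    w->job = NULL;
    w->job_opaque = NULL;
    w->job_done = false;
    w->quit = false;
    rc = pthread_create(&w->thread, NULL, worker_run, w);
    assert(rc == 0);
    (void)rc;
}

/* Schedule fn in the worker thread and block until it has finished. */
static void run_oneshot_and_wait(Worker *w, OneshotFunc *fn, void *opaque)
{
    pthread_mutex_lock(&w->lock);
    w->job = fn;
    w->job_opaque = opaque;
    w->job_done = false;
    pthread_cond_broadcast(&w->cond);
    while (!w->job_done) {
        pthread_cond_wait(&w->cond, &w->lock);
    }
    pthread_mutex_unlock(&w->lock);
}

static void worker_stop(Worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->quit = true;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->thread, NULL);
    pthread_cond_destroy(&w->cond);
    pthread_mutex_destroy(&w->lock);
}

/* Example job: sets the int that opaque points to. */
static void set_flag(void *opaque)
{
    *(int *)opaque = 1;
}
```

A caller would start the worker, call run_oneshot_and_wait(&w, set_flag,
&flag), and can rely on flag being set by the time the call returns. The
sketch supports one waiter at a time, which matches the "main loop thread
waits for one BH" use case described above.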

Signed-off-by: Stefan Hajnoczi <address@hidden>
Reviewed-by: Fam Zheng <address@hidden>
Acked-by: Paolo Bonzini <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>


  Commit: 1010cadf62332017648abee0d7a3dc7f2eef9632
      https://github.com/qemu/qemu/commit/1010cadf62332017648abee0d7a3dc7f2eef9632
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2018-03-08 (Thu, 08 Mar 2018)

  Changed paths:
    M hw/block/dataplane/virtio-blk.c

  Log Message:
  -----------
  virtio-blk: fix race between .ioeventfd_stop() and vq handler

If the main loop thread invokes .ioeventfd_stop() just as the vq handler
function begins in the IOThread, then the handler may lose the race for
the AioContext lock.  By the time the vq handler is able to acquire the
AioContext lock, the ioeventfd has already been removed and the handler
isn't supposed to run anymore!

Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal
from within the IOThread.  This way no races with the vq handler are
possible.
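
Why moving the removal into the IOThread closes the race can be seen in a
single-threaded model of an event loop. This is a simplified illustration,
not QEMU code: Loop, handle_vq() and loop_iterate() are hypothetical names,
and the choice to run the queued removal before dispatching kicks is an
assumption of the model. The point is only that removal becomes one step of
the same loop that dispatches the handler, so the two cannot interleave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Loop Loop;
typedef void Handler(Loop *loop);

/* Hypothetical model of an IOThread's event loop. */
struct Loop {
    Handler *vq_handler;   /* NULL once the ioeventfd handler is removed */
    int pending_kicks;     /* guest kicks not yet dispatched */
    bool stop_requested;   /* removal job queued by another thread */
    int handled;           /* number of kicks the handler processed */
};

static void handle_vq(Loop *loop)
{
    loop->handled++;
}

/* One loop iteration: run the queued removal job, then dispatch kicks
 * only while a handler is still installed. */
static void loop_iterate(Loop *loop)
{
    if (loop->stop_requested) {
        loop->stop_requested = false;
        loop->vq_handler = NULL;          /* removal runs inside the loop */
    }
    while (loop->pending_kicks > 0 && loop->vq_handler) {
        loop->pending_kicks--;
        loop->vq_handler(loop);
    }
}
```

Kicks dispatched before the removal job are handled normally, and a kick that
arrives together with the removal request is never passed to the handler.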

Signed-off-by: Stefan Hajnoczi <address@hidden>
Reviewed-by: Fam Zheng <address@hidden>
Acked-by: Paolo Bonzini <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>


  Commit: 184b9623461e73ef79f5cebe68bdbcf2e4751e22
      https://github.com/qemu/qemu/commit/184b9623461e73ef79f5cebe68bdbcf2e4751e22
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2018-03-08 (Thu, 08 Mar 2018)

  Changed paths:
    M hw/scsi/virtio-scsi-dataplane.c

  Log Message:
  -----------
  virtio-scsi: fix race between .ioeventfd_stop() and vq handler

If the main loop thread invokes .ioeventfd_stop() just as the vq handler
function begins in the IOThread, then the handler may lose the race for
the AioContext lock.  By the time the vq handler is able to acquire the
AioContext lock, the ioeventfd has already been removed and the handler
isn't supposed to run anymore!

Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal
from within the IOThread.  This way no races with the vq handler are
possible.

Signed-off-by: Stefan Hajnoczi <address@hidden>
Reviewed-by: Fam Zheng <address@hidden>
Acked-by: Paolo Bonzini <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>


  Commit: 4486e89c219c0d1b9bd8dfa0b1dd5b0d51ff2268
      https://github.com/qemu/qemu/commit/4486e89c219c0d1b9bd8dfa0b1dd5b0d51ff2268
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2018-03-08 (Thu, 08 Mar 2018)

  Changed paths:
    M cpus.c
    M include/sysemu/iothread.h
    M include/sysemu/sysemu.h
    M iothread.c
    M vl.c

  Log Message:
  -----------
  vl: introduce vm_shutdown()

Commit 00d09fdbbae5f7864ce754913efc84c12fdf9f1a ("vl: pause vcpus before
stopping iothreads") and commit dce8921b2baaf95974af8176406881872067adfa
("iothread: Stop threads before main() quits") tried to work around the
fact that emulation was still active during termination by stopping
iothreads.  They suffer from race conditions:
1. virtio_scsi_handle_cmd_vq() racing with iothread_stop_all() hits the
   virtio_scsi_ctx_check() assertion failure because the BDS AioContext
   has been modified by iothread_stop_all().
2. Guest vq kick racing with main loop termination leaves a readable
   ioeventfd that is handled by the next aio_poll() when external
   clients are enabled again, resulting in unwanted emulation activity.

This patch obsoletes those commits by fully disabling emulation activity
when vcpus are stopped.

Use the new vm_shutdown() function instead of pause_all_vcpus() so that
vm change state handlers are invoked too.  Virtio devices will now stop
their ioeventfds, preventing further emulation activity after vm_stop().

Note that vm_stop(RUN_STATE_SHUTDOWN) cannot be used because it emits a
QMP STOP event that may affect existing clients.

It is no longer necessary to call replay_disable_events() directly since
vm_shutdown() does so already.

Drop iothread_stop_all() since it is no longer used.
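
The difference between the two stop paths described above can be modelled in
a few lines. This is a sketch under stated assumptions, not the code in
cpus.c: Machine, Device, stop_common(), model_vm_stop() and
model_vm_shutdown() are hypothetical names, and a boolean stands in for the
QMP STOP event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HANDLERS 4

typedef void StateHandler(void *opaque, bool running);

/* Hypothetical model of the machine state (not a QEMU type). */
typedef struct Machine {
    bool vcpus_running;
    bool stop_event_sent;                 /* models the QMP STOP event */
    StateHandler *handlers[MAX_HANDLERS];
    void *opaques[MAX_HANDLERS];
    size_t nhandlers;
} Machine;

/* Hypothetical device that stops its ioeventfd when the VM stops. */
typedef struct Device {
    bool ioeventfd_active;
} Device;

static void device_state_changed(void *opaque, bool running)
{
    Device *d = opaque;
    d->ioeventfd_active = running;
}

static void add_state_handler(Machine *m, StateHandler *fn, void *opaque)
{
    assert(m->nhandlers < MAX_HANDLERS);
    m->handlers[m->nhandlers] = fn;
    m->opaques[m->nhandlers] = opaque;
    m->nhandlers++;
}

/* Shared stop path: pause vcpus, then let devices quiesce. */
static void stop_common(Machine *m, bool send_stop_event)
{
    m->vcpus_running = false;
    for (size_t i = 0; i < m->nhandlers; i++) {
        m->handlers[i](m->opaques[i], false);
    }
    if (send_stop_event) {
        m->stop_event_sent = true;
    }
}

/* Regular stop: clients see a STOP event. */
static void model_vm_stop(Machine *m)
{
    stop_common(m, true);
}

/* Shutdown: same quiescing, but no STOP event is emitted. */
static void model_vm_shutdown(Machine *m)
{
    stop_common(m, false);
}
```

In this model, model_vm_shutdown() leaves the device with its ioeventfd
inactive, just as model_vm_stop() does, but stop_event_sent stays false, which
is the property the log message gives for not reusing
vm_stop(RUN_STATE_SHUTDOWN).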

Cc: Fam Zheng <address@hidden>
Cc: Kevin Wolf <address@hidden>
Signed-off-by: Stefan Hajnoczi <address@hidden>
Reviewed-by: Fam Zheng <address@hidden>
Acked-by: Paolo Bonzini <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>


  Commit: e4ae62b802cec437f877f2cadc4ef059cc0eca76
      https://github.com/qemu/qemu/commit/e4ae62b802cec437f877f2cadc4ef059cc0eca76
  Author: Peter Maydell <address@hidden>
  Date:   2018-03-09 (Fri, 09 Mar 2018)

  Changed paths:
    M README
    M block/block-backend.c
    M cpus.c
    M hw/block/dataplane/virtio-blk.c
    M hw/scsi/virtio-scsi-dataplane.c
    M include/block/aio-wait.h
    M include/sysemu/iothread.h
    M include/sysemu/sysemu.h
    M iothread.c
    M util/aio-wait.c
    M vl.c

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

# gpg: Signature made Fri 09 Mar 2018 13:19:02 GMT
# gpg:                using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <address@hidden>"
# gpg:                 aka "Stefan Hajnoczi <address@hidden>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  vl: introduce vm_shutdown()
  virtio-scsi: fix race between .ioeventfd_stop() and vq handler
  virtio-blk: fix race between .ioeventfd_stop() and vq handler
  block: add aio_wait_bh_oneshot()
  virtio-blk: dataplane: Don't batch notifications if EVENT_IDX is present
  README: Fix typo 'git-publish'
  block: Fix qemu crash when using scsi-block

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/b39b61e41002...e4ae62b802ce
