[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 1/2] vhost-user: fix VirtQ notifier cleanup
From: |
Xueming(Steven) Li |
Subject: |
Re: [PATCH 1/2] vhost-user: fix VirtQ notifier cleanup |
Date: |
Tue, 14 Sep 2021 08:42:16 +0000 |
User-agent: |
Evolution 3.40.4 |
On Mon, 2021-09-13 at 00:20 +0800, Xueming Li wrote:
> When vhost-user device cleanup and unmmap notifier address, VM cpu
> thread that writing the notifier failed with accessing invalid address.
>
> To avoid this concurrent issue, wait memory flatview update by draining
> rcu callbacks, then unmap notifiers.
>
> Fixes: 44866521bd6e ("vhost-user: support registering external host
> notifiers")
> Cc: tiwei.bie@intel.com
> Cc: qemu-stable@nongnu.org
> Cc: Yuwei Zhang <zhangyuwei.9149@bytedance.com>
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> ---
> hw/virtio/vhost-user.c | 19 ++++++++++++-------
> 1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 2c8556237f..58722ab27c 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -1165,6 +1165,10 @@ static void
> vhost_user_host_notifier_remove(struct vhost_dev *dev,
>
> if (n->addr && n->set) {
> virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr,
> false);
> + /* Wait VM threads accessing old flatview which contains
> notifier. */
> + drain_call_rcu();
> + munmap(n->addr, qemu_real_host_page_size);
> + n->addr = NULL;
> n->set = false;
> }
> }
> @@ -1502,12 +1506,7 @@ static int
> vhost_user_slave_handle_vring_host_notifier(struct vhost_dev *dev,
>
> n = &user->notifier[queue_idx];
>
> - if (n->addr) {
> - virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr,
> false);
> - object_unparent(OBJECT(&n->mr));
> - munmap(n->addr, page_size);
> - n->addr = NULL;
> - }
> + vhost_user_host_notifier_remove(dev, queue_idx);
>
> if (area->u64 & VHOST_USER_VRING_NOFD_MASK) {
> return 0;
> @@ -2484,11 +2483,17 @@ void vhost_user_cleanup(VhostUserState *user)
> for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> if (user->notifier[i].addr) {
> object_unparent(OBJECT(&user->notifier[i].mr));
> + }
> + }
> + memory_region_transaction_commit();
> + /* Wait VM threads accessing old flatview which contains notifier.
> */
> + drain_call_rcu();
This RCU call works on main thread event on external vDPA application
shtudown. If driver reset device inside VM, event handling comes from
vCPU thread and dead lock here. Is there a limitation to call this API
from vCPU thread? Cc Maxim whom the author of this API.
On the other hand, when driver initiate a device reset, the VQ
accessing should being stopped, it's safe to skip draining RCU here.
vCPU threads set thread local variable "current_cpu", by checking it,
the dead lock disapear.
Here is the detail of dead lock thread:
Id Target Id Frame
* 1 Thread 0x7f5391d47000 (LWP 48691) "qemu-system-x86"
0x00007f538b23cb36 in ppoll () from /lib64/libc.so.6
2 Thread 0x7f537eae6700 (LWP 48692) "qemu-system-x86"
0x00007f538b24252d in syscall () from /lib64/libc.so.6
3 Thread 0x7f537caa8700 (LWP 48717) "qemu-system-x86"
0x00007f538b23e62b in ioctl () from /lib64/libc.so.6
4 Thread 0x7f526fdff700 (LWP 48718) "qemu-system-x86"
0x00007f538b23e62b in ioctl () from /lib64/libc.so.6
5 Thread 0x7f526f5fe700 (LWP 48719) "qemu-system-x86"
0x00007f538b23e62b in ioctl () from /lib64/libc.so.6
6 Thread 0x7f526edfd700 (LWP 48720) "qemu-system-x86"
0x00007f538b23e62b in ioctl () from /lib64/libc.so.6
7 Thread 0x7f526e5fc700 (LWP 48721) "qemu-system-x86"
0x00007f538b23e62b in ioctl () from /lib64/libc.so.6
8 Thread 0x7f526ddfb700 (LWP 48722) "qemu-system-x86"
0x00007f538b23e62b in ioctl () from /lib64/libc.so.6
9 Thread 0x7f526d5fa700 (LWP 48723) "qemu-system-x86"
0x00007f538b24252d in syscall () from /lib64/libc.so.6
10 Thread 0x7f526cdf9700 (LWP 48724) "qemu-system-x86"
0x00007f538b23e62b in ioctl () from /lib64/libc.so.6
34 Thread 0x7f522ca8b700 (LWP 49601) "qemu-system-x86"
0x00007f538c3cbdf2 in do_futex_wait () from /lib64/libpthread.so.0
(gdb) t 2
[Switching to thread 2 (Thread 0x7f537eae6700 (LWP 48692))]
#0 0x00007f538b24252d in syscall () from /lib64/libc.so.6
(gdb) bt
#0 0x00007f538b24252d in syscall () at /lib64/libc.so.6
#1 0x000055b53b9cbd28 in qemu_futex_wait (f=0x55b53c4656f8
<rcu_call_ready_event>, val=4294967295) at qemu/include/qemu/futex.h:29
#2 0x000055b53b9cbeec in qemu_event_wait (ev=0x55b53c4656f8
<rcu_call_ready_event>) at qemu/util/qemu-thread-posix.c:480
#3 0x000055b53b9d6c62 in call_rcu_thread (opaque=0x0) at
qemu/util/rcu.c:258
#4 0x000055b53b9cc09f in qemu_thread_start (args=0x55b53cd5aa70) at
qemu/util/qemu-thread-posix.c:541
#5 0x00007f538c3c314a in start_thread () at /lib64/libpthread.so.0
#6 0x00007f538b247dc3 in clone () at /lib64/libc.so.6
(gdb) t 9 // vCPU thread
[Switching to thread 9 (Thread 0x7f526d5fa700 (LWP 48723))]
#0 0x00007f538b24252d in syscall () from /lib64/libc.so.6
(gdb) bt
#0 0x00007f538b24252d in syscall () at /lib64/libc.so.6
#1 0x000055b53b9cbd28 in qemu_futex_wait (f=0x7f526d5f8d00,
val=4294967295) at qemu/include/qemu/futex.h:29
#2 0x000055b53b9cbeec in qemu_event_wait (ev=0x7f526d5f8d00) at
qemu/util/qemu-thread-posix.c:480
#3 0x000055b53b9d6dff in drain_call_rcu () at qemu/util/rcu.c:343
#4 0x000055b53b7737c3 in vhost_user_host_notifier_remove
(dev=0x55b53d0c5aa0, queue_idx=0) at qemu/hw/virtio/vhost-user.c:1156
#5 0x000055b53b773956 in vhost_user_get_vring_base
(dev=0x55b53d0c5aa0, ring=0x7f526d5f9040) at qemu/hw/virtio/vhost-
user.c:1198
#6 0x000055b53b76df29 in vhost_virtqueue_stop (dev=0x55b53d0c5aa0,
vdev=0x55b53deb60b0, vq=0x55b53d0c5d10, idx=0) at
qemu/hw/virtio/vhost.c:1219
#7 0x000055b53b76fb85 in vhost_dev_stop (hdev=0x55b53d0c5aa0,
vdev=0x55b53deb60b0) at qemu/hw/virtio/vhost.c:1837
#8 0x000055b53b4d7bb1 in vhost_net_stop_one (net=0x55b53d0c5aa0,
dev=0x55b53deb60b0) at qemu/hw/net/vhost_net.c:313
#9 0x000055b53b4d7ecb in vhost_net_stop (dev=0x55b53deb60b0,
ncs=0x55b53e244d50, total_queues=64) at qemu/hw/net/vhost_net.c:396
#10 0x000055b53b71f87d in virtio_net_vhost_status (n=0x55b53deb60b0,
status=0 '\000') at qemu/hw/net/virtio-net.c:295
#11 0x000055b53b71fada in virtio_net_set_status (vdev=0x55b53deb60b0,
status=0 '\000') at qemu/hw/net/virtio-net.c:369
#12 0x000055b53b765844 in virtio_set_status (vdev=0x55b53deb60b0, val=0
'\000') at qemu/hw/virtio/virtio.c:1956
#13 0x000055b53b5c5d58 in virtio_pci_common_write
(opaque=0x55b53deade30, addr=20, val=0, size=1) at
qemu/hw/virtio/virtio-pci.c:1292
#14 0x000055b53b6c85fd in memory_region_write_accessor
(mr=0x55b53deae890, addr=20, value=0x7f526d5f9338, size=1, shift=0,
mask=255, attrs=...)
at qemu/softmmu/memory.c:492
#15 0x000055b53b6c8838 in access_with_adjusted_size (addr=20,
value=0x7f526d5f9338, size=1, access_size_min=1, access_size_max=4,
access_fn=
0x55b53b6c850b <memory_region_write_accessor>, mr=0x55b53deae890,
attrs=...) at qemu/softmmu/memory.c:554
#16 0x000055b53b6cb837 in memory_region_dispatch_write
(mr=0x55b53deae890, addr=20, data=0, op=MO_8, attrs=...) at
qemu/softmmu/memory.c:1504
#17 0x000055b53b6bebee in flatview_write_continue (fv=0x7f5264003af0,
addr=4261412884, attrs=..., ptr=0x7f5391b97028, len=1, addr1=20, l=1,
mr=0x55b53deae890)
at qemu/softmmu/physmem.c:2780
#18 0x000055b53b6bed33 in flatview_write (fv=0x7f5264003af0,
addr=4261412884, attrs=..., buf=0x7f5391b97028, len=1) at
qemu/softmmu/physmem.c:2820
#19 0x000055b53b6bf09f in address_space_write (as=0x55b53c4460a0
<address_space_memory>, addr=4261412884, attrs=..., buf=0x7f5391b97028,
len=1)
at qemu/softmmu/physmem.c:2912
#20 0x000055b53b6bf10c in address_space_rw (as=0x55b53c4460a0
<address_space_memory>, addr=4261412884, attrs=..., buf=0x7f5391b97028,
len=1, is_write=true)
at qemu/softmmu/physmem.c:2922
#21 0x000055b53b7e3bfd in kvm_cpu_exec (cpu=0x55b53d200b30) at
qemu/accel/kvm/kvm-all.c:2893
#22 0x000055b53b7e59c6 in kvm_vcpu_thread_fn (arg=0x55b53d200b30) at
qemu/accel/kvm/kvm-accel-ops.c:49
#23 0x000055b53b9cc09f in qemu_thread_start (args=0x55b53d20da00) at
qemu/util/qemu-thread-posix.c:541
#24 0x00007f538c3c314a in start_thread () at /lib64/libpthread.so.0
#25 0x00007f538b247dc3 in clone () at /lib64/libc.so.6
> + for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> + if (user->notifier[i].addr) {
> munmap(user->notifier[i].addr, qemu_real_host_page_size);
> user->notifier[i].addr = NULL;
> }
> }
> - memory_region_transaction_commit();
> user->chr = NULL;
> }
>