Re: [PATCH v2 4/5] vhost: check vring address before calling unmap

qemu-block

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 4/5] vhost: check vring address before calling unmap

From:	Jason Wang
Subject:	Re: [PATCH v2 4/5] vhost: check vring address before calling unmap
Date:	Thu, 14 May 2020 15:28:59 +0800
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0


On 2020/5/13 下午5:36, Dima Stepanov wrote:

On Wed, May 13, 2020 at 11:00:38AM +0800, Jason Wang wrote:

On 2020/5/12 下午5:08, Dima Stepanov wrote:

On Tue, May 12, 2020 at 11:26:11AM +0800, Jason Wang wrote:

On 2020/5/11 下午5:11, Dima Stepanov wrote:

On Mon, May 11, 2020 at 11:05:58AM +0800, Jason Wang wrote:

On 2020/4/30 下午9:36, Dima Stepanov wrote:

Since disconnect can happen at any time during initialization not all
vring buffers (for instance used vring) can be intialized successfully.
If the buffer was not initialized then vhost_memory_unmap call will lead
to SIGSEGV. Add checks for the vring address value before calling unmap.
Also add assert() in the vhost_memory_unmap() routine.

Signed-off-by: Dima Stepanov<address@hidden>
---
  hw/virtio/vhost.c | 27 +++++++++++++++++++++------
  1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index ddbdc53..3ee50c4 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -314,6 +314,8 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void 
*buffer,
                                 hwaddr len, int is_write,
                                 hwaddr access_len)
  {
+    assert(buffer);
+
      if (!vhost_dev_has_iommu(dev)) {
          cpu_physical_memory_unmap(buffer, len, is_write, access_len);
      }
@@ -1132,12 +1134,25 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
                                                  vhost_vq_index);
      }
-    vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
-                       1, virtio_queue_get_used_size(vdev, idx));
-    vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
-                       0, virtio_queue_get_avail_size(vdev, idx));
-    vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
-                       0, virtio_queue_get_desc_size(vdev, idx));
+    /*
+     * Since the vhost-user disconnect can happen during initialization
+     * check if vring was initialized, before making unmap.
+     */
+    if (vq->used) {
+        vhost_memory_unmap(dev, vq->used,
+                           virtio_queue_get_used_size(vdev, idx),
+                           1, virtio_queue_get_used_size(vdev, idx));
+    }
+    if (vq->avail) {
+        vhost_memory_unmap(dev, vq->avail,
+                           virtio_queue_get_avail_size(vdev, idx),
+                           0, virtio_queue_get_avail_size(vdev, idx));
+    }
+    if (vq->desc) {
+        vhost_memory_unmap(dev, vq->desc,
+                           virtio_queue_get_desc_size(vdev, idx),
+                           0, virtio_queue_get_desc_size(vdev, idx));
+    }

Any reason not checking hdev->started instead? vhost_dev_start() will set it
to true if virtqueues were correctly mapped.

Thanks

Well i see it a little bit different:
  - vhost_dev_start() sets hdev->started to true before starting
    virtqueues
  - vhost_virtqueue_start() maps all the memory
If we hit the vhost disconnect at the start of the
vhost_virtqueue_start(), for instance for this call:
   r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
Then we will call vhost_user_blk_disconnect:
   vhost_user_blk_disconnect()->
     vhost_user_blk_stop()->
       vhost_dev_stop()->
         vhost_virtqueue_stop()
As a result we will come in this routine with the hdev->started still
set to true, but if used/avail/desc fields still uninitialized and set
to 0.

I may miss something, but consider both vhost_dev_start() and
vhost_user_blk_disconnect() were serialized in main loop. Can this really
happen?

Yes, consider the case when we start the vhost-user-blk device:
   vhost_dev_start->
     vhost_virtqueue_start
And we got a disconnect in the middle of vhost_virtqueue_start()
routine, for instance:
   1000     vq->num = state.num = virtio_queue_get_num(vdev, idx);
   1001     r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
   1002     if (r) {
   1003         VHOST_OPS_DEBUG("vhost_set_vring_num failed");
   1004         return -errno;
   1005     }
   --> Here we got a disconnect <--
   1006
   1007     state.num = virtio_queue_get_last_avail_idx(vdev, idx);
   1008     r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
   1009     if (r) {
   1010         VHOST_OPS_DEBUG("vhost_set_vring_base failed");
   1011         return -errno;
   1012     }
As a result call to vhost_set_vring_base will call the disconnect
routine. The backtrace log for SIGSEGV is as follows:
   Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
   [Switching to Thread 0x7ffff2ea9700 (LWP 183150)]
   0x00007ffff4d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
   (gdb) bt
   #0  0x00007ffff4d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
   #1  0x000055555590fd90 in flatview_write_continue (fv=0x7fffec4a2600,
       addr=0, attrs=..., ptr=0x0, len=1028, addr1=0,
       l=1028, mr=0x555556b1b310) at ./exec.c:3142
   #2  0x000055555590fe98 in flatview_write (fv=0x7fffec4a2600, addr=0,
       attrs=..., buf=0x0, len=1028) at ./exec.c:3177
   #3  0x00005555559101ed in address_space_write (as=0x555556893940
       <address_space_memory>, addr=0, attrs=..., buf=0x0,
       len=1028) at ./exec.c:3268
   #4  0x0000555555910caf in address_space_unmap (as=0x555556893940
       <address_space_memory>, buffer=0x0, len=1028,
       is_write=true, access_len=1028) at ./exec.c:3592
   #5  0x0000555555910d82 in cpu_physical_memory_unmap (buffer=0x0,
       len=1028, is_write=true, access_len=1028) at ./exec.c:3613
   #6  0x0000555555a16fa1 in vhost_memory_unmap (dev=0x7ffff22723e8,
       buffer=0x0, len=1028, is_write=1, access_len=1028)
       at ./hw/virtio/vhost.c:318
   #7  0x0000555555a192a2 in vhost_virtqueue_stop (dev=0x7ffff22723e8,
       vdev=0x7ffff22721a0, vq=0x55555770abf0, idx=0) at 
./hw/virtio/vhost.c:1136
   #8  0x0000555555a1acc0 in vhost_dev_stop (hdev=0x7ffff22723e8,
       vdev=0x7ffff22721a0) at ./hw/virtio/vhost.c:1702
   #9  0x00005555559b6532 in vhost_user_blk_stop (vdev=0x7ffff22721a0)
       at ./hw/block/vhost-user-blk.c:196
   #10 0x00005555559b6b73 in vhost_user_blk_disconnect (dev=0x7ffff22721a0)
       at ./hw/block/vhost-user-blk.c:365
   #11 0x00005555559b6c4f in vhost_user_blk_event (opaque=0x7ffff22721a0,
       event=CHR_EVENT_CLOSED) at ./hw/block/vhost-user-blk.c:384
   #12 0x0000555555e65f7e in chr_be_event (s=0x555556b182e0, 
event=CHR_EVENT_CLOSED)
       at chardev/char.c:60
   #13 0x0000555555e6601a in qemu_chr_be_event (s=0x555556b182e0,
       event=CHR_EVENT_CLOSED) at chardev/char.c:80
   #14 0x0000555555e6eef3 in tcp_chr_disconnect_locked (chr=0x555556b182e0)
       at chardev/char-socket.c:488
   #15 0x0000555555e6e23f in tcp_chr_write (chr=0x555556b182e0,
       buf=0x7ffff2ea8220 "\n", len=20) at chardev/char-socket.c:178
   #16 0x0000555555e6616c in qemu_chr_write_buffer (s=0x555556b182e0,
       buf=0x7ffff2ea8220 "\n", len=20, offset=0x7ffff2ea8150, write_all=true)
       at chardev/char.c:120
   #17 0x0000555555e662d9 in qemu_chr_write (s=0x555556b182e0, buf=0x7ffff2ea8220 
"\n",
       len=20, write_all=true) at chardev/char.c:155
   #18 0x0000555555e693cc in qemu_chr_fe_write_all (be=0x7ffff2272360,
       buf=0x7ffff2ea8220 "\n", len=20) at chardev/char-fe.c:53
   #19 0x0000555555a1c489 in vhost_user_write (dev=0x7ffff22723e8,
       msg=0x7ffff2ea8220, fds=0x0, fd_num=0) at ./hw/virtio/vhost-user.c:350
   #20 0x0000555555a1d325 in vhost_set_vring (dev=0x7ffff22723e8,
       request=10, ring=0x7ffff2ea8520) at ./hw/virtio/vhost-user.c:660
   #21 0x0000555555a1d4c6 in vhost_user_set_vring_base (dev=0x7ffff22723e8,
       ring=0x7ffff2ea8520) at ./hw/virtio/vhost-user.c:704
   #22 0x0000555555a18c1b in vhost_virtqueue_start (dev=0x7ffff22723e8,
       vdev=0x7ffff22721a0, vq=0x55555770abf0, idx=0) at 
./hw/virtio/vhost.c:1009
   #23 0x0000555555a1a9f5 in vhost_dev_start (hdev=0x7ffff22723e8, 
vdev=0x7ffff22721a0)
       at ./hw/virtio/vhost.c:1639
   #24 0x00005555559b6367 in vhost_user_blk_start (vdev=0x7ffff22721a0)
       at ./hw/block/vhost-user-blk.c:150
   #25 0x00005555559b6653 in vhost_user_blk_set_status (vdev=0x7ffff22721a0, 
status=15 '\017')
       at ./hw/block/vhost-user-blk.c:233
   #26 0x0000555555a1072d in virtio_set_status (vdev=0x7ffff22721a0, val=15 
'\017')
       at ./hw/virtio/virtio.c:1956
---Type <return> to continue, or q <return> to quit---

So while we inside vhost_user_blk_start() (frame#24) we are calling
vhost_user_blk_disconnect() (frame#10). And for this call the
dev->started field will be set to true.
   (gdb) frame 8
   #8  0x0000555555a1acc0 in vhost_dev_stop (hdev=0x7ffff22723e8, 
vdev=0x7ffff22721a0)
       at ./hw/virtio/vhost.c:1702
   1702            vhost_virtqueue_stop(hdev,
   (gdb) p hdev->started
   $1 = true

It isn't an easy race to reproduce: to hit a disconnect at this window.
We were able to hit it during long testing on one of the reconnect
iteration. After it we were able to reproduce it with 100% by adding a
sleep() call inside qemu to make the race window bigger.

Thanks for the patience.

I miss the fact that the disconnection routine could be triggered from
chardev write.

But the codes turns out to be very tricky and hard to debug since the code
was wrote to deal with the error returned from vhost_ops directly, it
doesn't expect vhost_dev_cleanup() was call silently for each
vhost_user_write(). It would introduce troubles if we want to add new
codes/operations to vhost.

More questions:

Well these are good questions.

- Do we need to have some checking against hdev->started in each vhost user
ops?

I can miss smth but it looks like that no. The vhost_dev_set_log()
routine has some additional logic with the vhost_virtqueue_set_addr()
and vhost_dev_set_features() this why we need those additional checks in
the migration code. For the vhost-user devices initialization or
deinitalization code we don't need those checks. Even more we couldn't
add this check to the vhost user ops, since it ops itself will be reset.
And i agree that even if we have no issues right now, by adding new code
and missing these cases we could break smth. Because review process
becomes harder.



Yes, such codes would be very hard to maintain.


Maybe the good idea is postphone clean up, if we are in the middle of the
initialization. So it could be smth like:
   - Put ourself in the INITIALIZATION state
   - Start these vhost-user "handshake" commands
   - If we got a disconnect error, perform disconnect, but don't clean up
     device (it will be clean up on the roll back). I can be done by
     checking the state in vhost_user_..._disconnect routine or smth like
     it
   - vhost-user command returns error back to the _start() routine
   - Rollback in one place in the start() routine, by calling this
     postphoned clean up for the disconnect

Michael wrote that there was similar fix for vhost-net. Can't say for
now, could it help or not, but need to take a look on it also.



Right.

Thanks

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [PATCH v2 4/5] vhost: check vring address before calling unmap, Raphael Norwitz, 2020/05/03
- Re: [PATCH v2 4/5] vhost: check vring address before calling unmap, Jason Wang, 2020/05/10
  - Re: [PATCH v2 4/5] vhost: check vring address before calling unmap, Dima Stepanov, 2020/05/11
    - Re: [PATCH v2 4/5] vhost: check vring address before calling unmap, Jason Wang, 2020/05/11
    - Re: [PATCH v2 4/5] vhost: check vring address before calling unmap, Dima Stepanov, 2020/05/12
    - Re: [PATCH v2 4/5] vhost: check vring address before calling unmap, Jason Wang, 2020/05/12
    - Re: [PATCH v2 4/5] vhost: check vring address before calling unmap, Dima Stepanov, 2020/05/13
    - Re: [PATCH v2 4/5] vhost: check vring address before calling unmap, Jason Wang <=

Prev by Date: Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Next by Date: Re: [PATCH v2 5/5] vhost: add device started check in migration set log
Previous by thread: Re: [PATCH v2 4/5] vhost: check vring address before calling unmap
Next by thread: Re: [PATCH v22 3/4] qcow2: add zstd cluster compression
Index(es):
- Date
- Thread