qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 5/5] vhost: add device started check in migration set log


From: Raphael Norwitz
Subject: Re: [PATCH v2 5/5] vhost: add device started check in migration set log
Date: Sun, 10 May 2020 20:03:39 -0400

On Thu, May 7, 2020 at 11:35 AM Dima Stepanov <address@hidden> wrote:
>
> What do you think?
>

Apologies - I tripped over the if (dev->started && r < 0) check.
Never-mind my point with race conditions and failing migrations.

Rather than modifying vhost_dev_set_log(), it may be clearer to put a
check after vhost_dev_log_resize()? Something like:

--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -829,11 +829,22 @@ static int vhost_migration_log(MemoryListener
*listener, int enable)
         vhost_log_put(dev, false);
     } else {
         vhost_dev_log_resize(dev, vhost_get_log_size(dev));
+        /*
+         * A device can be stopped because of backend disconnect inside
+         * vhost_dev_log_resize(). In this case we should mark logging
+         * enabled and return without attempting to set the backend
+         * logging state.
+         */
+        if (!dev->started) {
+            goto out_success;
+        }
         r = vhost_dev_set_log(dev, true);
         if (r < 0) {
             return r;
         }
     }
+
+out_success:
     dev->log_enabled = enable;
     return 0;
 }

This seems harmless enough to me, and I see how it fixes your
particular crash, but I would still prefer we worked towards a more
robust solution. In particular I think we could handle this inside
vhost-user-blk if we let the device state persist between connections
(i.e. call vhost_dev_cleanup() inside vhost_user_blk_connect() before
vhost_dev_init() on reconnect). This should also fix some of the
crashes Li Feng has hit, and probably others which haven’t been
reported yet. What do you think?

If that’s unworkable I guess we will need to add these vhost level
checks. In that case I would still prefer we add a “disconnected” flag
in struct vhost_dev struct, and make sure it isn’t cleared by
vhost_dev_cleanup(). That way we don’t conflate stopping a device with
backend disconnect at the vhost level and potentially regress behavior
for other device types.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]