From: Jason Wang
Subject: Re: [Qemu-stable] [PATCH 1/2] add VirtIONet vhost_stopped flag to prevent multiple stops
Date: Tue, 23 Apr 2019 10:58:57 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

On 2019/4/23 4:14 AM, Dan Streetman wrote:
> On Sun, Apr 21, 2019 at 10:50 PM Jason Wang <address@hidden> wrote:
>> On 2019/4/17 2:46 AM, Dan Streetman wrote:
>>> From: Dan Streetman <address@hidden>
>>>
>>> Buglink: https://launchpad.net/bugs/1823458
>>>
>>> There is a race condition when using the vhost-user driver, between a
>>> guest shutdown and the vhost-user interface being closed. This is
>>> explained in more detail at the bug link above; the short explanation
>>> is the vhost-user device can be closed while the main thread is in the
>>> middle of stopping the vhost_net.
>>>
>>> In this case, the main thread handling shutdown will enter
>>> virtio_net_vhost_status() and move into the n->vhost_started (else)
>>> block, and call vhost_net_stop(); while it is running that function,
>>> another thread is notified that the vhost-user device has been closed,
>>> and (indirectly) calls into virtio_net_vhost_status() also.
>>
>> I think we need to figure out why there are multiple vhost_net_stop()
>> calls simultaneously.
>>
>> E.g. vhost-user registers fd handlers like:
>>
>>     qemu_chr_fe_set_handlers(&s->chr, NULL, NULL,
>>                              net_vhost_user_event, NULL, nc0->name, NULL,
>>                              true);
>>
>> which uses the default main context, so it should only be called in the
>> main thread.
>
> net_vhost_user_event() schedules chr_closed_bh() to do its bottom half
> work; does aio_bh_schedule_oneshot() execute its events from the main
> thread?

I think so, if net_vhost_user_event() was called in the main thread (it calls qemu_get_current_aio_context()).

> For reference, the call chain is:
>
> chr_closed_bh()
>   qmp_set_link()
>     nc->info->link_status_changed() -> virtio_net_set_link_status()
>       virtio_net_set_status()
>         virtio_net_vhost_status()

The code was added by Marc in:

commit e7c83a885f865128ae3cf1946f8cb538b63cbfba
Author: Marc-André Lureau <address@hidden>
Date:   Mon Feb 27 14:49:56 2017 +0400

    vhost-user: delay vhost_user_stop

Cc'ing him for more thoughts.

Thanks

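To make the exchange above about net_vhost_user_event() scheduling chr_closed_bh() easier to follow, here is a toy model of a one-shot bottom half. It is only a sketch under stated assumptions: every `toy_` name and the `ToyLoop`/`ToyState` types are hypothetical and this is not QEMU's AioContext code. It illustrates the general idea that a scheduled callback is deferred and later run by whichever thread drives the loop it was queued on, rather than nested inside the event handler.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_BH 16

typedef void (*toy_bh_fn)(void *opaque);

/* Hypothetical one-shot bottom-half queue owned by a single loop. */
typedef struct {
    toy_bh_fn fn[TOY_MAX_BH];
    void *opaque[TOY_MAX_BH];
    size_t head, tail;
} ToyLoop;

typedef struct {
    ToyLoop loop;
    int in_event_handler;   /* 1 while the event callback is on the stack */
    int bh_runs;            /* how many times the bottom half has run */
    int bh_ran_nested;      /* set if the bottom half ran inside the handler */
} ToyState;

/* Queue a callback for later; returns 0 on success, -1 if the queue is full. */
static int toy_bh_schedule_oneshot(ToyLoop *l, toy_bh_fn fn, void *opaque)
{
    assert(l != NULL && fn != NULL);
    if (l->tail - l->head == TOY_MAX_BH) {
        return -1;
    }
    l->fn[l->tail % TOY_MAX_BH] = fn;
    l->opaque[l->tail % TOY_MAX_BH] = opaque;
    l->tail++;
    return 0;
}

/* Run everything queued so far on the calling thread; returns the count run. */
static int toy_loop_run_pending(ToyLoop *l)
{
    int ran = 0;
    while (l->head != l->tail) {
        size_t i = l->head++ % TOY_MAX_BH;
        l->fn[i](l->opaque[i]);
        ran++;
    }
    return ran;
}

/* Stand-in for chr_closed_bh(): the deferred half of the close handling. */
static void toy_closed_bh(void *opaque)
{
    ToyState *s = opaque;
    s->bh_runs++;
    if (s->in_event_handler) {
        s->bh_ran_nested = 1;
    }
}

/* Stand-in for the close event handler: it only schedules the work. */
static void toy_close_event(ToyState *s)
{
    s->in_event_handler = 1;
    toy_bh_schedule_oneshot(&s->loop, toy_closed_bh, s);
    s->in_event_handler = 0;
}
```

Calling `toy_close_event()` on a zero-initialised `ToyState` leaves `bh_runs` at 0; the callback only runs once the owner of the loop calls `toy_loop_run_pending()`.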
>> Thanks
>>
>>> Since the vhost_net status hasn't yet changed, the second thread also
>>> enters the n->vhost_started block, and also calls vhost_net_stop().
>>> This causes problems for the second thread when it tries to stop the
>>> network that's already been stopped.
>>>
>>> This adds a flag to the struct that's atomically set to prevent more
>>> than one thread from calling vhost_net_stop(). The atomic_fetch_inc()
>>> is likely overkill and probably could be done with a simple
>>> check-and-set, but since it's a race condition there would still be a
>>> (very, very) small window without using an atomic to set it.
>>>
>>> Signed-off-by: Dan Streetman <address@hidden>
>>> ---
>>>  hw/net/virtio-net.c            | 3 ++-
>>>  include/hw/virtio/virtio-net.h | 1 +
>>>  2 files changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>> index ffe0872fff..d36f50d5dd 100644
>>> --- a/hw/net/virtio-net.c
>>> +++ b/hw/net/virtio-net.c
>>> @@ -13,6 +13,7 @@
>>>
>>>  #include "qemu/osdep.h"
>>>  #include "qemu/iov.h"
>>> +#include "qemu/atomic.h"
>>>  #include "hw/virtio/virtio.h"
>>>  #include "net/net.h"
>>>  #include "net/checksum.h"
>>> @@ -240,7 +241,7 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>>>                           "falling back on userspace virtio", -r);
>>>              n->vhost_started = 0;
>>>          }
>>> -    } else {
>>> +    } else if (atomic_fetch_inc(&n->vhost_stopped) == 0) {
>>>          vhost_net_stop(vdev, n->nic->ncs, queues);
>>>          n->vhost_started = 0;
>>>      }
>>> diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
>>> index b96f0c643f..d03fd933d0 100644
>>> --- a/include/hw/virtio/virtio-net.h
>>> +++ b/include/hw/virtio/virtio-net.h
>>> @@ -164,6 +164,7 @@ struct VirtIONet {
>>>      uint8_t nouni;
>>>      uint8_t nobcast;
>>>      uint8_t vhost_started;
>>> +    int vhost_stopped;
>>>      struct {
>>>          uint32_t in_use;
>>>          uint32_t first_multi;
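
The guard proposed in the patch can be illustrated in isolation. The following is a minimal sketch, assuming C11 `<stdatomic.h>` as a stand-in for QEMU's own atomic helpers; `ToyNet`, `toy_net_init()` and `toy_vhost_status_stop()` are hypothetical names, not QEMU code. Because a fetch-and-add returns the previous value, at most one caller can observe 0 and take the stop path, no matter how many callers enter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for VirtIONet, reduced to the fields used here. */
typedef struct {
    atomic_int vhost_stopped;   /* 0 until the first caller claims the stop */
    int stop_calls;             /* touched only by the caller that won */
} ToyNet;

static void toy_net_init(ToyNet *n)
{
    assert(n != NULL);
    atomic_init(&n->vhost_stopped, 0);
    n->stop_calls = 0;
}

/*
 * Stand-in for the else-branch of virtio_net_vhost_status().
 * Returns 1 if this caller performed the stop, 0 if it was skipped.
 */
static int toy_vhost_status_stop(ToyNet *n)
{
    assert(n != NULL);
    /* atomic_fetch_add returns the old value, so only one caller sees 0. */
    if (atomic_fetch_add(&n->vhost_stopped, 1) == 0) {
        n->stop_calls++;        /* where vhost_net_stop() would be called */
        return 1;
    }
    return 0;
}
```

Two successive calls to `toy_vhost_status_stop()` on a freshly initialised `ToyNet` perform the stop once and skip it the second time. In this sketch, as in the quoted hunk, nothing clears the flag again, so a later start/stop cycle would need an explicit reset.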