From: Dan Streetman
Subject: Re: [Qemu-stable] [PATCH 1/2] add VirtIONet vhost_stopped flag to prevent multiple stops
Date: Tue, 23 Apr 2019 04:49:57 -0400

On Mon, Apr 22, 2019 at 10:59 PM Jason Wang <address@hidden> wrote:
>
>
> On 2019/4/23 4:14 AM, Dan Streetman wrote:
> > On Sun, Apr 21, 2019 at 10:50 PM Jason Wang <address@hidden> wrote:
> >>
> >> On 2019/4/17 2:46 AM, Dan Streetman wrote:
> >>> From: Dan Streetman <address@hidden>
> >>>
> >>> Buglink: https://launchpad.net/bugs/1823458
> >>>
> >>> There is a race condition when using the vhost-user driver, between a guest
> >>> shutdown and the vhost-user interface being closed. This is explained in
> >>> more detail at the bug link above; the short explanation is the vhost-user
> >>> device can be closed while the main thread is in the middle of stopping
> >>> the vhost_net. In this case, the main thread handling shutdown will
> >>> enter virtio_net_vhost_status() and move into the n->vhost_started (else)
> >>> block, and call vhost_net_stop(); while it is running that function,
> >>> another thread is notified that the vhost-user device has been closed,
> >>> and (indirectly) calls into virtio_net_vhost_status() also.
> >>
> >> I think we need to figure out why there are multiple vhost_net_stop() calls
> >> simultaneously. E.g., vhost-user registers fd handlers like:
> >>
> >>     qemu_chr_fe_set_handlers(&s->chr, NULL, NULL,
> >>                              net_vhost_user_event, NULL, nc0->name,
> >>                              NULL, true);
> >>
> >> which uses the default main context, so it should only be called in the
> >> main thread.
> > net_vhost_user_event() schedules chr_closed_bh() to do its bottom half
> > work; does aio_bh_schedule_oneshot() execute its events from the main
> > thread?
>
>
> I think so, if net_vhost_user_event() was called in the main thread (it calls
> qemu_get_current_aio_context()).
OK, I'll check that, thanks!

I think my other patch, to remove the vhost_user_stop() call
completely from the net_vhost_user_event() handler for
CHR_EVENT_CLOSED, is still relevant; do you have thoughts on that?
>
>
> >
> > For reference, the call chain is:
> >
> > chr_closed_bh()
> >   qmp_set_link()
> >     nc->info->link_status_changed() -> virtio_net_set_link_status()
> >       virtio_net_set_status()
> >         virtio_net_vhost_status()
>
>
> The code was added by Marc in:
>
> commit e7c83a885f865128ae3cf1946f8cb538b63cbfba
> Author: Marc-André Lureau <address@hidden>
> Date: Mon Feb 27 14:49:56 2017 +0400
>
> vhost-user: delay vhost_user_stop
>
> Cc him for more thoughts.
>
> Thanks
>
>
> >> Thanks
> >>
> >>
> >>> Since the
> >>> vhost_net status hasn't yet changed, the second thread also enters
> >>> the n->vhost_started block, and also calls vhost_net_stop(). This
> >>> causes problems for the second thread when it tries to stop the network
> >>> that's already been stopped.
> >>>
> >>> This adds a flag to the struct that's atomically set to prevent more than
> >>> one thread from calling vhost_net_stop(). The atomic_fetch_inc() is likely
> >>> overkill and probably could be done with a simple check-and-set, but
> >>> since it's a race condition there would still be a (very, very) small
> >>> window without using an atomic to set it.
> >>>
> >>> Signed-off-by: Dan Streetman <address@hidden>
> >>> ---
> >>> hw/net/virtio-net.c | 3 ++-
> >>> include/hw/virtio/virtio-net.h | 1 +
> >>> 2 files changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> >>> index ffe0872fff..d36f50d5dd 100644
> >>> --- a/hw/net/virtio-net.c
> >>> +++ b/hw/net/virtio-net.c
> >>> @@ -13,6 +13,7 @@
> >>>
> >>> #include "qemu/osdep.h"
> >>> #include "qemu/iov.h"
> >>> +#include "qemu/atomic.h"
> >>> #include "hw/virtio/virtio.h"
> >>> #include "net/net.h"
> >>> #include "net/checksum.h"
> >>> @@ -240,7 +241,7 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
> >>>                           "falling back on userspace virtio", -r);
> >>>              n->vhost_started = 0;
> >>>          }
> >>> -    } else {
> >>> +    } else if (atomic_fetch_inc(&n->vhost_stopped) == 0) {
> >>>          vhost_net_stop(vdev, n->nic->ncs, queues);
> >>>          n->vhost_started = 0;
> >>>      }
> >>> diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
> >>> index b96f0c643f..d03fd933d0 100644
> >>> --- a/include/hw/virtio/virtio-net.h
> >>> +++ b/include/hw/virtio/virtio-net.h
> >>> @@ -164,6 +164,7 @@ struct VirtIONet {
> >>>      uint8_t nouni;
> >>>      uint8_t nobcast;
> >>>      uint8_t vhost_started;
> >>> +    int vhost_stopped;
> >>>      struct {
> >>>          uint32_t in_use;
> >>>          uint32_t first_multi;
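
For anyone following along, here is a minimal standalone sketch of the
guard the patch above relies on: a fetch-and-increment where only the
caller that observes the old value 0 runs the stop path. It is not QEMU
code. It assumes C11 <stdatomic.h> and atomic_fetch_add() in place of
QEMU's atomic_fetch_inc() from qemu/atomic.h, and struct fake_net,
fake_vhost_net_stop() and fake_vhost_status_stop() are hypothetical
stand-ins for VirtIONet, vhost_net_stop() and the else-branch of
virtio_net_vhost_status().

```c
#include <stdatomic.h>

/* Hypothetical stand-in for VirtIONet; only the fields that matter here. */
struct fake_net {
    unsigned char vhost_started;
    atomic_int vhost_stopped;   /* counterpart of the new flag in the patch */
    int stop_calls;             /* how many times the stop path actually ran */
};

/* Hypothetical stand-in for vhost_net_stop(); it only counts invocations. */
static void fake_vhost_net_stop(struct fake_net *n)
{
    n->stop_calls++;
}

/*
 * Mirrors the patched else-branch: atomic_fetch_add() returns the value
 * held before the increment, so exactly one caller sees 0 and proceeds.
 * Every later caller sees a non-zero value and skips the stop.
 */
static void fake_vhost_status_stop(struct fake_net *n)
{
    if (atomic_fetch_add(&n->vhost_stopped, 1) == 0) {
        fake_vhost_net_stop(n);
        n->vhost_started = 0;
    }
}
```

Calling fake_vhost_status_stop() twice on the same struct runs the stop
path once. Nothing in this sketch resets the counter, so it only guards
a single stop.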