qemu-devel


From: Cornelia Huck
Subject: Re: [BUG] vhost-vdpa: qemu-system-s390x crashes with second virtio-net-ccw device
Date: Mon, 27 Jul 2020 10:41:48 +0200

On Mon, 27 Jul 2020 15:38:12 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On 2020/7/27 2:43 PM, Cornelia Huck wrote:
> > On Sat, 25 Jul 2020 08:40:07 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> >  
> >> On 2020/7/24 11:34 PM, Cornelia Huck wrote:
> >>> On Fri, 24 Jul 2020 11:17:57 -0400
> >>> "Michael S. Tsirkin"<mst@redhat.com>  wrote:
> >>>     
> >>>> On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote:  
> >>>>> On Fri, 24 Jul 2020 09:30:58 -0400
> >>>>> "Michael S. Tsirkin"<mst@redhat.com>  wrote:
> >>>>>         
> >>>>>> On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote:  
> >>>>>>> When I start qemu with a second virtio-net-ccw device (i.e. adding
> >>>>>>> -device virtio-net-ccw in addition to the autogenerated device), I get
> >>>>>>> a segfault. gdb points to
> >>>>>>>
> >>>>>>> #0  0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>,
> >>>>>>>       config=0x55d6ad9e3f80 "RT") at /home/cohuck/git/qemu/hw/net/virtio-net.c:146
> >>>>>>> 146       if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> >>>>>>>
> >>>>>>> (backtrace doesn't go further)  
> >>>>> The core was incomplete, but running under gdb directly shows that it
> >>>>> is just a bog-standard config space access (first for that device).
> >>>>>
> >>>>> The cause of the crash is that nc->peer is not set... no idea how that
> >>>>> can happen, not that familiar with that part of QEMU. (Should the code
> >>>>> check, or is that really something that should not happen?)
> >>>>>
> >>>>> What I don't understand is why it is set correctly for the first,
> >>>>> autogenerated virtio-net-ccw device, but not for the second one, and
> >>>>> why virtio-net-pci doesn't show these problems. The only difference
> >>>>> between -ccw and -pci that comes to my mind here is that config space
> >>>>> accesses for ccw are done via an asynchronous operation, so timing
> >>>>> might be different.  
> >>>> Hopefully Jason has an idea. Could you post a full command line
> >>>> please? Do you need a working guest to trigger this? Does this trigger
> >>>> on an x86 host?  
> >>> Yes, it does trigger with tcg-on-x86 as well. I've been using
> >>>
> >>> s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu qemu,zpci=on
> >>> -m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001
> >>> -drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0
> >>> -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1
> >>> -device virtio-net-ccw
> >>>
> >>> It seems it needs the guest actually doing something with the nics; I
> >>> cannot reproduce the crash if I use the old advent calendar moon buggy
> >>> image and just add a virtio-net-ccw device.
> >>>
> >>> (I don't think it's a problem with my local build, as I see the problem
> >>> both on my laptop and on an LPAR.)  
> >>
> >> It looks to me like we forgot to check for the existence of the peer.
> >>
> >> Please try the attached patch to see if it works.  
> > Thanks, that patch gets my guest up and running again. So, FWIW,
> >
> > Tested-by: Cornelia Huck <cohuck@redhat.com>
> >
> > Any idea why this did not hit with virtio-net-pci (or the autogenerated
> > virtio-net-ccw device)?  
> 
> 
> It can be hit with virtio-net-pci as well (just start without peer).

Hm, I had not been able to reproduce the crash with a 'naked' -device
virtio-net-pci. But checking seems to be the right idea anyway.
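
For illustration, the kind of check being discussed could look roughly
like the sketch below. This is not the attached patch (which is not
reproduced in this mail); the struct and enum names are simplified,
hypothetical stand-ins for QEMU's net client types, and the helper
peer_is_vhost_vdpa() is made up for this example. The only point is
that the peer pointer is tested before it is dereferenced.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for QEMU's net client types. */
enum net_client_driver {
    DRIVER_NIC,
    DRIVER_VHOST_VDPA,
};

struct net_client_info {
    enum net_client_driver type;
};

struct net_client_state {
    const struct net_client_info *info;
    struct net_client_state *peer;   /* NULL when no backend is attached */
};

/*
 * Return true only if a peer exists and is a vhost-vdpa backend.
 * A device started without a peer simply yields false instead of
 * dereferencing a NULL pointer.
 */
static bool peer_is_vhost_vdpa(const struct net_client_state *nc)
{
    return nc->peer != NULL &&
           nc->peer->info != NULL &&
           nc->peer->info->type == DRIVER_VHOST_VDPA;
}
```

With a helper along these lines, the config space access would skip the
vhost-vdpa specific path for a device that has no peer, rather than
crashing on the first access.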

> 
> For the autogenerated virtio-net-ccw, I think the reason is that it
> already has a peer set.

Ok, that might well be.



