qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: Question on memory commit during MR finalize()


From: Thanos Makatos
Subject: RE: Question on memory commit during MR finalize()
Date: Mon, 19 Jul 2021 19:05:29 +0000

> -----Original Message-----
> From: Qemu-devel <qemu-devel-
> bounces+thanos.makatos=nutanix.com@nongnu.org> On Behalf Of Thanos
> Makatos
> Sent: 19 July 2021 19:02
> To: Peter Xu <peterx@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>; John Levon
> <john.levon@nutanix.com>; John G Johnson <john.g.johnson@oracle.com>;
> Markus Armbruster <armbru@redhat.com>; QEMU Devel Mailing List
> <qemu-devel@nongnu.org>
> Subject: Re: Question on memory commit during MR finalize()
> 
> Omg I don't know how I missed that, of course I'll ignore SIGUSR1 and retest!
> 
> ________________________________________
> From: Peter Xu <mailto:peterx@redhat.com>
> Sent: Monday, 19 July 2021, 16:58
> To: Thanos Makatos
> Cc: Paolo Bonzini; Markus Armbruster; QEMU Devel Mailing List; John Levon;
> John G Johnson
> Subject: Re: Question on memory commit during MR finalize()
> 
> 
> Hi, Thanos,
> 
> On Mon, Jul 19, 2021 at 02:38:52PM +0000, Thanos Makatos wrote:
> > I can trivially trigger an assertion with a build where I merged the recent
> vfio-user patches (https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__patchew.org_QEMU_cover.1626675354.git.elena.ufimtseva-
> 40oracle.com_&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=XTpYsh5Ps2zJv
> tw6ogtti46atk736SI4vgsJiUKIyDE&m=LvALaULnrxZWlgXFcaxGAl95UIwq3a6LI8
> OnG_5r4XY&s=moFPVchYp27xozQcvvxG4nb4nC2QmMnqQ1Wmt4Z3dNE&e
> = ) to master and then merging the result into your xzpeter/memory-sanity
> branch, I've pushed the branch here:
> https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__github.com_tmakatos_qemu_tree_memory-
> 2Dsanity&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=XTpYsh5Ps2zJvtw6og
> tti46atk736SI4vgsJiUKIyDE&m=LvALaULnrxZWlgXFcaxGAl95UIwq3a6LI8OnG_
> 5r4XY&s=veyjdkkFkGSYNDZOuksB-kbHmdQaw9RYxyZp8Qo7nW4&e= . I
> explain the repro steps below in case you want to take a look:
> >
> > Build as follows:
> >
> > ./configure --prefix=/opt/qemu-xzpeter --target-list=x86_64-softmmu --
> enable-kvm  --enable-debug --enable-multiprocess && make -j `nproc` &&
> make install
> >
> > Then build and run the GPIO sample from libvfio-user
> (https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__github.com_nutanix_libvfio-
> 2Duser&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=XTpYsh5Ps2zJvtw6ogt
> ti46atk736SI4vgsJiUKIyDE&m=LvALaULnrxZWlgXFcaxGAl95UIwq3a6LI8OnG_5
> r4XY&s=HYP5NmDMGuS13pdyV83x3HzyhGbE-oP1T8NLtu0d1U8&e= ):
> >
> > libvfio-user/build/dbg/samples/gpio-pci-idio-16 -v /var/run/vfio-user.sock
> >
> > And then run QEMU as follows:
> >
> > gdb --args /opt/qemu-xzpeter/bin/qemu-system-x86_64 -cpu host -
> enable-kvm -smp 4 -m 2G -object memory-backend-
> file,id=mem0,size=2G,mem-path=/dev/hugepages,share=on,prealloc=yes -
> numa node,memdev=mem0 -kernel bionic-server-cloudimg-amd64-vmlinuz-
> generic -initrd bionic-server-cloudimg-amd64-initrd-generic -append
> 'console=ttyS0 root=/dev/sda1 single' -hda bionic-server-cloudimg-amd64-
> 0.raw -nic user,model=virtio-net-pci -machine pc-q35-3.1 -device vfio-user-
> pci,socket=/var/run/vfio-user.sock -nographic
> >
> > I immediately get the following stack trace:
> >
> > Thread 5 "qemu-system-x86" received signal SIGUSR1, User defined signal
> 1.
> 
> This is SIGUSR1.  QEMU uses it for general vcpu ipis.
> 
> > [Switching to Thread 0x7fffe6e82700 (LWP 151973)]
> > __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
> > 103     ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or
> directory.
> > (gdb) bt
> > #0  0x00007ffff655d29c in __lll_lock_wait () at
> ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
> > #1  0x00007ffff6558642 in __pthread_mutex_cond_lock
> (mutex=mutex@entry=0x5555568bb280 <qemu_global_mutex>) at
> ../nptl/pthread_mutex_lock.c:80
> > #2  0x00007ffff6559ef8 in __pthread_cond_wait_common (abstime=0x0,
> mutex=0x5555568bb280 <qemu_global_mutex>, cond=0x555556cecc30) at
> pthread_cond_wait.c:645
> > #3  0x00007ffff6559ef8 in __pthread_cond_wait (cond=0x555556cecc30,
> mutex=0x5555568bb280 <qemu_global_mutex>) at
> pthread_cond_wait.c:655
> > #4  0x000055555604f717 in qemu_cond_wait_impl (cond=0x555556cecc30,
> mutex=0x5555568bb280 <qemu_global_mutex>, file=0x5555561ca869
> "../softmmu/cpus.c", line=514) at ../util/qemu-thread-posix.c:194
> > #5  0x0000555555d28a4a in qemu_cond_wait_iothread
> (cond=0x555556cecc30) at ../softmmu/cpus.c:514
> > #6  0x0000555555d28781 in qemu_wait_io_event (cpu=0x555556ce02c0) at
> ../softmmu/cpus.c:425
> > #7  0x0000555555e5da75 in kvm_vcpu_thread_fn (arg=0x555556ce02c0) at
> ../accel/kvm/kvm-accel-ops.c:54
> > #8  0x000055555604feed in qemu_thread_start (args=0x555556cecc70) at
> ../util/qemu-thread-posix.c:541
> > #9  0x00007ffff6553fa3 in start_thread (arg=<optimized out>) at
> pthread_create.c:486
> > #10 0x00007ffff64824cf in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> 
> Would you please add below to your ~/.gdbinit script?
> 
>   handle SIGUSR1 nostop noprint
> 
> Or just run without gdb and wait it to crash with SIGABRT.
> 
> Thanks,
> 
> --
> Peter Xu

Sorry for the bad brain day, here's your stack trace:

qemu-system-x86_64: ../softmmu/cpus.c:72: qemu_mutex_unlock_iothread_prepare: 
Assertion `!memory_region_has_pending_transaction()' failed.

Thread 1 "qemu-system-x86" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff63c07bb in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff63ab535 in __GI_abort () at abort.c:79
#2  0x00007ffff63ab40f in __assert_fail_base
    (fmt=0x7ffff650dee0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
assertion=0x5555561ca880 "!memory_region_has_pending_transaction()", 
file=0x5555561ca869 "../softmmu/cpus.c", line=72, function=<optimized out>) at 
assert.c:92
#3  0x00007ffff63b9102 in __GI___assert_fail
    (assertion=0x5555561ca880 "!memory_region_has_pending_transaction()", 
file=0x5555561ca869 "../softmmu/cpus.c", line=72, function=0x5555561caa60 
<__PRETTY_FUNCTION__.37393> "qemu_mutex_unlock_iothread_prepare") at 
assert.c:101
#4  0x0000555555d27c20 in qemu_mutex_unlock_iothread_prepare () at 
../softmmu/cpus.c:72
#5  0x0000555555d289f6 in qemu_mutex_unlock_iothread () at ../softmmu/cpus.c:507
#6  0x0000555555dcb3d6 in vfio_user_send_recv (proxy=0x555557ac5560, 
msg=0x555557933d50, fds=0x0, rsize=40) at ../hw/vfio/user.c:88
#7  0x0000555555dcd30a in vfio_user_dma_unmap (proxy=0x555557ac5560, 
unmap=0x7fffffffd8d0, bitmap=0x0) at ../hw/vfio/user.c:796
#8  0x0000555555dabf5f in vfio_dma_unmap (container=0x555557a06fb0, 
iova=786432, size=2146697216, iotlb=0x0) at ../hw/vfio/common.c:501
#9  0x0000555555dae12c in vfio_listener_region_del (listener=0x555557a06fc0, 
section=0x7fffffffd9f0) at ../hw/vfio/common.c:1249
#10 0x0000555555d3d06d in address_space_update_topology_pass (as=0x5555568bbc80 
<address_space_memory>, old_view=0x555556d6cfb0, new_view=0x555556d6c8b0, 
adding=false) at ../softmmu/memory.c:960
#11 0x0000555555d3d62c in address_space_set_flatview (as=0x5555568bbc80 
<address_space_memory>) at ../softmmu/memory.c:1062
#12 0x0000555555d3d800 in memory_region_transaction_commit () at 
../softmmu/memory.c:1124
#13 0x0000555555b75a3e in mch_update_pam (mch=0x555556e80a40) at 
../hw/pci-host/q35.c:344
#14 0x0000555555b76068 in mch_update (mch=0x555556e80a40) at 
../hw/pci-host/q35.c:504
#15 0x0000555555b761d7 in mch_reset (qdev=0x555556e80a40) at 
../hw/pci-host/q35.c:561
#16 0x0000555555e93a95 in device_transitional_reset (obj=0x555556e80a40) at 
../hw/core/qdev.c:1028
#17 0x0000555555e956f8 in resettable_phase_hold (obj=0x555556e80a40, 
opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:182
#18 0x0000555555e8e07c in bus_reset_child_foreach (obj=0x555556ebce80, 
cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at 
../hw/core/bus.c:97
#19 0x0000555555e953ff in resettable_child_foreach (rc=0x555556a07ab0, 
obj=0x555556ebce80, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, 
type=RESET_TYPE_COLD) at ../hw/core/resettable.c:96
#20 0x0000555555e9567e in resettable_phase_hold (obj=0x555556ebce80, 
opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:173
#21 0x0000555555e920e0 in device_reset_child_foreach (obj=0x555556e802d0, 
cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at 
../hw/core/qdev.c:366
#22 0x0000555555e953ff in resettable_child_foreach (rc=0x555556ad2830, 
obj=0x555556e802d0, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, 
type=RESET_TYPE_COLD) at ../hw/core/resettable.c:96
#23 0x0000555555e9567e in resettable_phase_hold (obj=0x555556e802d0, 
opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:173
#24 0x0000555555e8e07c in bus_reset_child_foreach (obj=0x555556beaac0, 
cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at 
../hw/core/bus.c:97
#25 0x0000555555e953ff in resettable_child_foreach (rc=0x555556b1ca70, 
obj=0x555556beaac0, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, 
type=RESET_TYPE_COLD) at ../hw/core/resettable.c:96
#26 0x0000555555e9567e in resettable_phase_hold (obj=0x555556beaac0, 
opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:173
#27 0x0000555555e952b4 in resettable_assert_reset (obj=0x555556beaac0, 
type=RESET_TYPE_COLD) at ../hw/core/resettable.c:60
#28 0x0000555555e951f8 in resettable_reset (obj=0x555556beaac0, 
type=RESET_TYPE_COLD) at ../hw/core/resettable.c:45
#29 0x0000555555e95a37 in resettable_cold_reset_fn (opaque=0x555556beaac0) at 
../hw/core/resettable.c:269
#30 0x0000555555e93f40 in qemu_devices_reset () at ../hw/core/reset.c:69
#31 0x0000555555c9eb04 in pc_machine_reset (machine=0x555556a4d9e0) at 
../hw/i386/pc.c:1654
#32 0x0000555555d381fb in qemu_system_reset (reason=SHUTDOWN_CAUSE_NONE) at 
../softmmu/runstate.c:443
#33 0x0000555555a787f2 in qdev_machine_creation_done () at 
../hw/core/machine.c:1330
#34 0x0000555555d4e09c in qemu_machine_creation_done () at ../softmmu/vl.c:2650
#35 0x0000555555d4e16b in qmp_x_exit_preconfig (errp=0x5555568db1a0 
<error_fatal>) at ../softmmu/vl.c:2673
#36 0x0000555555d506be in qemu_init (argc=31, argv=0x7fffffffe268, 
envp=0x7fffffffe368) at ../softmmu/vl.c:3692
#37 0x0000555555945cad in main (argc=31, argv=0x7fffffffe268, 
envp=0x7fffffffe368) at ../softmmu/main.c:49

This is where the vfio-user client in QEMU tells the vfio-user server (the GPIO 
device) that this particular memory region is not available for DMA. There are 
3 vfio_dma_map() operations before this happens and this seems to be the very 
first vfio_dma_unmap() operation.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]