
From: Eric Blake
Subject: Re: [Qemu-block] [PATCH] nbd/server: attach client channel to the export's AioContext
Date: Mon, 16 Sep 2019 16:11:09 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 9/12/19 1:37 AM, Sergio Lopez wrote:

>> I tried to test this patch, but even with it applied, I still got an
>> aio-context crasher by attempting an nbd-server-start, nbd-server-add,
>> nbd-server-stop (intentionally skipping the nbd-server-remove step) on a
>> domain using iothreads, with a backtrace of:
>>
>> #0  0x00007ff09d070e35 in raise () from target:/lib64/libc.so.6
>> #1  0x00007ff09d05b895 in abort () from target:/lib64/libc.so.6
>> #2  0x000055dd03b9ab86 in error_exit (err=1, msg=0x55dd03d59fb0
>> <__func__.15769> "qemu_mutex_unlock_impl")
>>     at util/qemu-thread-posix.c:36
>> #3  0x000055dd03b9adcf in qemu_mutex_unlock_impl (mutex=0x55dd062d5090,
>> file=0x55dd03d59041 "util/async.c",
>>     line=523) at util/qemu-thread-posix.c:96
>> #4  0x000055dd03b93433 in aio_context_release (ctx=0x55dd062d5030) at
>> util/async.c:523

>> #14 0x000055dd03748845 in qmp_nbd_server_stop (errp=0x7ffcdf3cb4e8) at
>> blockdev-nbd.c:233
>> ...

Sorry for truncating the initial stackdump report. The rest of the trace
(it is definitely in the main loop):

#15 0x0000560be491c910 in qmp_marshal_nbd_server_stop
(args=0x560be54c4d00, ret=0x7ffdd832de38,
    errp=0x7ffdd832de30) at qapi/qapi-commands-block.c:318
#16 0x0000560be4a7a306 in do_qmp_dispatch (cmds=0x560be50dc1f0
<qmp_commands>, request=0x7fbcac009af0,
    allow_oob=false, errp=0x7ffdd832ded8) at qapi/qmp-dispatch.c:131
#17 0x0000560be4a7a507 in qmp_dispatch (cmds=0x560be50dc1f0
<qmp_commands>, request=0x7fbcac009af0,
    allow_oob=false) at qapi/qmp-dispatch.c:174
#18 0x0000560be48edd81 in monitor_qmp_dispatch (mon=0x560be55d6670,
req=0x7fbcac009af0) at monitor/qmp.c:120
#19 0x0000560be48ee116 in monitor_qmp_bh_dispatcher (data=0x0) at
monitor/qmp.c:209
#20 0x0000560be4ad16a2 in aio_bh_call (bh=0x560be53dbe90) at util/async.c:89
#21 0x0000560be4ad173a in aio_bh_poll (ctx=0x560be53daba0) at
util/async.c:117
#22 0x0000560be4ad6514 in aio_dispatch (ctx=0x560be53daba0) at
util/aio-posix.c:459
#23 0x0000560be4ad1ad3 in aio_ctx_dispatch (source=0x560be53daba0,
callback=0x0, user_data=0x0) at util/async.c:260
#24 0x00007fbcd7083ecd in g_main_context_dispatch () from
target:/lib64/libglib-2.0.so.0
#25 0x0000560be4ad4e47 in glib_pollfds_poll () at util/main-loop.c:218
#26 0x0000560be4ad4ec1 in os_host_main_loop_wait (timeout=1000000000) at
util/main-loop.c:241
#27 0x0000560be4ad4fc6 in main_loop_wait (nonblocking=0) at
util/main-loop.c:517
#28 0x0000560be4691266 in main_loop () at vl.c:1806
#29 0x0000560be46988a9 in main (argc=112, argv=0x7ffdd832e4e8,
envp=0x7ffdd832e870) at vl.c:4488


>>
>> Does that sound familiar to what you were seeing?  Does it mean we
>> missed another spot where the context is set incorrectly?
> 
> It looks like it was trying to release the AioContext while it was still
> held by some other thread. Is this stacktrace from the main thread or an
> iothread? What was the other one doing?

Kevin had some ideas on what it might be; I'm playing with obtaining the
context in the spots he pointed out.
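
For readers unfamiliar with this failure mode, here is a minimal
analogy in Python, not QEMU code. It assumes (my understanding, not
something stated in this thread) that the AioContext lock behaves like
a recursive mutex, so releasing it from a thread that does not own it
is reported as an error rather than silently succeeding. The err=1 in
frame #2 would be consistent with EPERM on Linux. The names owner,
acquired and done are hypothetical.

```python
import threading

# Recursive lock standing in for the AioContext lock (an assumption).
lock = threading.RLock()
acquired = threading.Event()
done = threading.Event()


def owner():
    """Play the role of the thread that legitimately holds the lock."""
    lock.acquire()
    acquired.set()   # tell the main thread the lock is now held
    done.wait()      # keep holding it until the main thread has tried
    lock.release()


t = threading.Thread(target=owner)
t.start()
acquired.wait()

try:
    # The main thread never acquired the lock, so this release is invalid.
    lock.release()
    outcome = "released"
except RuntimeError:
    outcome = "RuntimeError"
finally:
    done.set()
    t.join()

print(outcome)
```

Python raises an exception where QEMU's error_exit() calls abort(), but
the underlying condition (unlock by a non-owner) is the same kind of
mistake.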

> 
>> I'm happy to work with you on IRC for more real-time debugging of this
>> (I'm woefully behind on understanding how aio contexts are supposed to
>> work).
> 
> I must be missing some step, because I can't reproduce this one
> here. I've tried both with an idle NBD server and one with a client
> generating I/O. Is it reproducible 100% of the time?

Yes, with iothreads.  I took some time today to boil it down to
something that does not require libvirt:

$ file myfile
myfile: QEMU QCOW2 Image (v3), 104857600 bytes
$ qemu-img create -f qcow2 -o backing_file=myfile,backing_fmt=qcow2  \
 myfile.wrap
Formatting 'myfile.wrap', fmt=qcow2 size=104857600 backing_file=myfile
backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults \
  -name tmp,debug-threads=on -machine pc-q35-3.1,accel=kvm \
  -object iothread,id=iothread1 \
  -drive file=myfile,format=qcow2,if=none,id=drive-virtio-disk0,node-name=n \
  -device virtio-blk-pci,iothread=iothread1,drive=drive-virtio-disk0,id=virtio-disk0 \
  -qmp stdio -nographic
{'execute':'qmp_capabilities'}
{'execute':'nbd-server-start','arguments':{'addr':{'type':'inet',
  'data':{'host':'localhost','port':'10809'}}}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2',
 'node-name':'t','file':{'driver':'file',
  'filename':'myfile.wrap'},'backing':'n'}}
{'execute':'blockdev-backup','arguments':{'device':'n',
 'target':'t','sync':'none','job-id':'b'}}
{'execute':'nbd-server-add','arguments':{'device':'t','name':'t'}}
{'execute':'nbd-server-remove','arguments':{'name':'t'}}
Aborted (core dumped)
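
If you would rather not type the QMP commands by hand, the same
sequence can be generated with a short Python sketch and pasted (or
piped) into the -qmp stdio session. This is only a convenience under
the assumption that strict double-quoted JSON is acceptable to the
monitor; the helper qmp() and the list name commands are hypothetical,
and the arguments are copied from the session above.

```python
import json


def qmp(execute, arguments=None):
    """Serialize one QMP command as a single line of JSON."""
    cmd = {"execute": execute}
    if arguments is not None:
        cmd["arguments"] = arguments
    return json.dumps(cmd)


# Same order as the manual reproducer.
commands = [
    qmp("qmp_capabilities"),
    qmp("nbd-server-start", {
        "addr": {"type": "inet",
                 "data": {"host": "localhost", "port": "10809"}},
    }),
    qmp("blockdev-add", {
        "driver": "qcow2",
        "node-name": "t",
        "file": {"driver": "file", "filename": "myfile.wrap"},
        "backing": "n",
    }),
    qmp("blockdev-backup", {
        "device": "n", "target": "t", "sync": "none", "job-id": "b",
    }),
    qmp("nbd-server-add", {"device": "t", "name": "t"}),
    qmp("nbd-server-remove", {"name": "t"}),
]

for line in commands:
    print(line)
```

Letting json.dumps() produce the text guarantees each line is
well-formed, which is easy to get wrong when writing nested arguments
by hand.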

I'm now playing with Kevin's ideas of grabbing the aiocontext around nbd
unref.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
