From: Rao, Lei
Subject: RE: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits cleanly in some corner case
Date: Thu, 2 Dec 2021 02:49:23 +0000


-----Original Message-----
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> 
Sent: Wednesday, December 1, 2021 10:27 PM
To: Rao, Lei <lei.rao@intel.com>; Daniel P. Berrangé <berrange@redhat.com>
Cc: Zhang, Chen <chen.zhang@intel.com>; eblake@redhat.com; kwolf@redhat.com; 
hreitz@redhat.com; qemu-block@nongnu.org; qemu-devel@nongnu.org
Subject: Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits 
cleanly in some corner case

01.12.2021 12:48, Rao, Lei wrote:
> 
> 
> -----Original Message-----
> From: Daniel P. Berrangé <berrange@redhat.com>
> Sent: Wednesday, December 1, 2021 5:11 PM
> To: Rao, Lei <lei.rao@intel.com>
> Cc: Zhang, Chen <chen.zhang@intel.com>; eblake@redhat.com; 
> vsementsov@virtuozzo.com; kwolf@redhat.com; hreitz@redhat.com; 
> qemu-block@nongnu.org; qemu-devel@nongnu.org
> Subject: Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure 
> QIO exits cleanly in some corner case
> 
> On Wed, Dec 01, 2021 at 03:54:27PM +0800, Rao, Lei wrote:
>>      We found that a QIO channel coroutine could not be awakened in some
>> corner cases during our stress tests for COLO.
>>      This patch fixes the hang shown in the following backtrace:
>>          #0  0x00007fad72e24bf6 in __ppoll (fds=0x5563d75861f0, nfds=1, 
>> timeout=<optimized out>, sigmask=0x0) at 
>> ../sysdeps/unix/sysv/linux/ppoll.c:44
>>          #1  0x00005563d6977c68 in qemu_poll_ns (fds=0x5563d75861f0, nfds=1, 
>> timeout=599999697630) at util/qemu-timer.c:348
>>          #2  0x00005563d697ac44 in aio_poll (ctx=0x5563d755dfa0, 
>> blocking=true) at util/aio-posix.c:669
>>          #3  0x00005563d68ba24f in bdrv_do_drained_begin (bs=0x5563d75ea0c0, 
>> recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at 
>> block/io.c:432
>>          #4  0x00005563d68ba338 in bdrv_drained_begin (bs=0x5563d75ea0c0) at 
>> block/io.c:438
>>          #5  0x00005563d689c6d2 in quorum_del_child (bs=0x5563d75ea0c0, 
>> child=0x5563d7908600, errp=0x7fff3cf5b960) at block/quorum.c:1063
>>          #6  0x00005563d684328f in bdrv_del_child (parent_bs=0x5563d75ea0c0, 
>> child=0x5563d7908600, errp=0x7fff3cf5b960) at block.c:6568
>>          #7  0x00005563d658499e in qmp_x_blockdev_change 
>> (parent=0x5563d80ec4c0 "colo-disk0", has_child=true, child=0x5563d7577d30 
>> "children.1", has_node=false, node=0x0,errp=0x7fff3cf5b960) at 
>> blockdev.c:4494
>>          #8  0x00005563d67e8b4e in qmp_marshal_x_blockdev_change 
>> (args=0x7fad6400ada0, ret=0x7fff3cf5b9f8, errp=0x7fff3cf5b9f0) at 
>> qapi/qapi-commands-block-core.c:1538
>>          #9  0x00005563d691cd9e in do_qmp_dispatch (cmds=0x5563d719a4f0 
>> <qmp_commands>, request=0x7fad64009d80, allow_oob=true, errp=0x7fff3cf5ba98)
>>              at qapi/qmp-dispatch.c:132
>>          #10 0x00005563d691cfab in qmp_dispatch (cmds=0x5563d719a4f0 
>> <qmp_commands>, request=0x7fad64009d80, allow_oob=true) at 
>> qapi/qmp-dispatch.c:175
>>          #11 0x00005563d67b7787 in monitor_qmp_dispatch (mon=0x5563d7605d40, 
>> req=0x7fad64009d80) at monitor/qmp.c:145
>>          #12 0x00005563d67b7b24 in monitor_qmp_bh_dispatcher (data=0x0) at 
>> monitor/qmp.c:234
>>          #13 0x00005563d69754c2 in aio_bh_call (bh=0x5563d7473230) at 
>> util/async.c:89
>>          #14 0x00005563d697555e in aio_bh_poll (ctx=0x5563d7471da0) at 
>> util/async.c:117
>>          #15 0x00005563d697a423 in aio_dispatch (ctx=0x5563d7471da0) at 
>> util/aio-posix.c:459
>>          #16 0x00005563d6975917 in aio_ctx_dispatch (source=0x5563d7471da0, 
>> callback=0x0, user_data=0x0) at util/async.c:260
>>          #17 0x00007fad730e1fbd in g_main_context_dispatch () from 
>> /lib/x86_64-linux-gnu/libglib-2.0.so.0
>>          #18 0x00005563d6978cda in glib_pollfds_poll () at 
>> util/main-loop.c:219
>>          #19 0x00005563d6978d58 in os_host_main_loop_wait (timeout=977650) 
>> at util/main-loop.c:242
>>          #20 0x00005563d6978e69 in main_loop_wait (nonblocking=0) at 
>> util/main-loop.c:518
>>          #21 0x00005563d658f5ed in main_loop () at vl.c:1814
>>          #22 0x00005563d6596bb7 in main (argc=61, 
>> argv=0x7fff3cf5c0c8,
>> envp=0x7fff3cf5c2b8) at vl.c:450
>>
>>      From the call trace, we can see that the QEMU main thread is waiting
>> for in_flight to drop to zero, but in_flight never reaches 0.
>>      After analysis, the root cause is that an NBD request coroutine was
>> never awakened. Although this bug was found in COLO testing, I think it is
>>      a general problem in the QIO module that also affects other modules
>> built on top of QIO, such as NBD. We dumped the following data:
>>      $2 = {
>>        in_flight = 2,
>>        state = NBD_CLIENT_QUIT,
>>        connect_status = 0,
>>        connect_err = 0x0,
>>        wait_in_flight = false,
>>        requests = {{
>>            coroutine = 0x0,
>>            offset = 273077071872,
>>            receiving = false,
>>          }, {
>>            coroutine = 0x7f1164571df0,
>>            offset = 498792529920,
>>            receiving = false,
>>          }, {
>>            coroutine = 0x0,
>>            offset = 500663590912,
>>            receiving = false,
>>          }, {
>>            ...
>>          } },
>>        reply = {
>>          simple = {
>>            magic = 1732535960,
>>            error = 0,
>>            handle = 0
>>          },
>>          structured = {
>>            magic = 1732535960,
>>            flags = 0,
>>            type = 0,
>>            handle = 0,
>>            length = 0
>>          },
>>          {
>>            magic = 1732535960,
>>            _skip = 0,
>>            handle = 0
>>          }
>>        },
>>        bs = 0x7f11640e2140,
>>        reconnect_delay = 0,
>>        saddr = 0x7f11640e1f80,
>>        export = 0x7f11640dc560 "parent0",
>>      }
>>      From the data, we can see that an NBD request coroutine does not exit
>> normally when the NBD server is killed (we kill the Secondary VM in the
>> COLO stress test), so it never executes in_flight-- and the QEMU main
>> thread hangs there.
>>      On further analysis, I found that the request coroutine yields in
>> nbd_send_request() or qio_channel_write_all() called from
>> nbd_co_send_request(); the yield happens because the kernel returns EAGAIN
>> under the stress test.
>>      However, NBD does not know the coroutine yielded there, so
>> nbd_recv_coroutines_wake cannot wake it up, because req->receiving is
>> false (see the sketch below). If the NBD server terminates abnormally at
>> the same time, G_IO_OUT never becomes ready and the coroutine is never
>> awakened. In addition, s->ioc is released immediately, so forcibly waking
>> the coroutine makes it access ioc->xxx and causes a segmentation fault.
>>      Therefore, I add a force_quit state to the QIOChannel to ensure that
>> QIO exits immediately, and a new function, qio_channel_coroutines_wake, to
>> wake the coroutine up.
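>>
>>      For reference, the wake path that misses the blocked sender looks
>> roughly like this (simplified from block/nbd.c of this period; shown for
>> illustration only, it is not part of this patch):
>>
>>          static void nbd_recv_coroutine_wake_one(NBDClientRequest *req)
>>          {
>>              /* Only a coroutine parked waiting for a reply is re-entered. */
>>              if (req->receiving) {
>>                  req->receiving = false;
>>                  aio_co_wake(req->coroutine);
>>              }
>>          }
>>
>>      A coroutine that yielded on G_IO_OUT inside qio_channel_writev_all()
>> still has req->receiving == false, so this wake never reaches it.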

If qio_channel_shutdown() can't kill a request that is stuck in
qio_channel_write_all(), that's not the responsibility of the nbd driver;
the qio channel layer should take care of it.

Or you probably need to set up some keep-alive settings.
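
For a raw socket that would look roughly like this (a minimal sketch, not
QEMU code; the timeout values are arbitrary):

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    /* Enable TCP keep-alive so the kernel eventually errors out a
     * connection whose peer died silently; a poll() blocked on the socket
     * then wakes up with an error instead of waiting forever. */
    static int enable_keepalive(int fd)
    {
        int on = 1, idle = 30, interval = 10, count = 3;

        if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
            return -1;
        }
        /* Linux-specific tuning of the probe timing. */
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval));
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count));
        return 0;
    }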

>>
>> Signed-off-by: Lei Rao <lei.rao@intel.com>
>> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
>> ---
>>   block/nbd.c          |  5 +++++
>>   include/io/channel.h | 19 +++++++++++++++++++
>>   io/channel.c         | 22 ++++++++++++++++++++++
>>   3 files changed, 46 insertions(+)
>>
>> diff --git a/block/nbd.c b/block/nbd.c
>> index 5ef462db1b..5ee4eaaf57 100644
>> --- a/block/nbd.c
>> +++ b/block/nbd.c
>> @@ -208,6 +208,8 @@ static void nbd_teardown_connection(BlockDriverState *bs)
>>       assert(!s->in_flight);
>>   
>>       if (s->ioc) {
>> +        qio_channel_set_force_quit(s->ioc, true);
>> +        qio_channel_coroutines_wake(s->ioc);
>>           qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
> 
> Calling shutdown here should already cause qio_channel_readv_all to wake
> up and break out of its poll() loop. We shouldn't need to add a second
> way to break out of the poll().
> 
> Calling shutdown can wake up a coroutine that is polling, but I don't
> think that is enough. When I tried to forcibly wake up the NBD coroutine,
> it caused a segmentation fault. The root cause is that the coroutine
> continues to access ioc->xxx in qio_channel_yield(), but the ioc has
> already been released and set to NULL, e.g. in
> nbd_co_do_establish_connection(); I think calling shutdown has the same
> problem. So I added force_quit: once it is set to true, the coroutine
> exits immediately without accessing the ioc.
> 

What do you mean by "the NBD coroutine" and by "forcibly wake up"?

In recent QEMU there is no specific NBD coroutine; only request coroutines
should exist.

Forcibly waking up an arbitrary coroutine is generally unsafe in QEMU; most
of the code is not prepared for it, and a crash is the normal result of such
an attempt.

Also, there is a good mechanism for debugging coroutines in gdb:

source scripts/qemu-gdb.py
qemu coroutine <coroutine pointer>

- it will dump the stack trace of a coroutine, which may help.

Also, to find the coroutines, look at the bs->tracked_requests list; all
requests of bs are in that list, with their coroutine pointers in the .co
field.
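
For example, a quick (untested) sketch of a helper to print those pointers
from inside QEMU; each .co value can then be passed to "qemu coroutine":

    #include "qemu/osdep.h"
    #include "block/block_int.h"   /* BlockDriverState, BdrvTrackedRequest */

    /* Debug-only helper: print every in-flight request of @bs together
     * with its coroutine pointer. */
    static void dump_tracked_requests(BlockDriverState *bs)
    {
        BdrvTrackedRequest *req;

        QLIST_FOREACH(req, &bs->tracked_requests, list) {
            fprintf(stderr, "req %p: offset=%" PRId64 " co=%p\n",
                    (void *)req, req->offset, (void *)req->co);
        }
    }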

I am sorry for the unclear description. By "the NBD coroutine" I mean the
request coroutines. As for "forcibly wake up": as a test, I set receiving to
true before qio_channel_writev_all() in nbd_co_send_request() so that the
request coroutines could be awakened by nbd_recv_coroutine_wake_one().
But that was just an experiment. Only waking up the request coroutine, or
shutting down the QIO channel, is not enough, because there is a
synchronization issue. For example, when the NBD request coroutine is
awakened in qio_channel_writev_full_all(), it continues to access ioc->xxx,
but the ioc has already been set to NULL in
nbd_co_do_establish_connection(); this causes a segmentation fault, which I
have verified. So how about not yielding in QIO at all: when the request
can't be sent, QIO returns to the caller directly, and the caller decides
whether to yield or not?
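
Roughly what I mean, sketched as a diff against the
qio_channel_writev_full_all() loop (illustrative only, not a tested patch;
every caller would then have to handle the new return value):

         if (len == QIO_CHANNEL_ERR_BLOCK) {
             if (qemu_in_coroutine()) {
-                qio_channel_yield(ioc, G_IO_OUT);
+                /* Report "would block" to the caller. The caller can
+                 * record that it is waiting (e.g. set req->receiving, as
+                 * in my experiment above) before yielding, so its own
+                 * wake function can find and re-enter it. */
+                return QIO_CHANNEL_ERR_BLOCK;
             } else {
                 qio_channel_wait(ioc, G_IO_OUT);
             }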

> Thanks
> Lei
> 
>>           yank_unregister_function(BLOCKDEV_YANK_INSTANCE(s->bs->node_name),
>>                                    nbd_yank, s->bs);
>> @@ -315,6 +317,7 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
>>   
>>       /* successfully connected */
>>       s->state = NBD_CLIENT_CONNECTED;
>> +    qio_channel_set_force_quit(s->ioc, false);
>>       qemu_co_queue_restart_all(&s->free_sema);
>>   
>>       return 0;
>> @@ -345,6 +348,8 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
>>   
>>       /* Finalize previous connection if any */
>>       if (s->ioc) {
>> +        qio_channel_set_force_quit(s->ioc, true);
>> +        qio_channel_coroutines_wake(s->ioc);
> 
> Isn't this code path just missing a qio_channel_shutdown call or a
> qio_channel_close call to make the socket trigger a wakeup from poll()?
> 
> I don't think adding one would solve the bug; the reason is as above.
> Thanks,
> Lei
> 
>>           qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc));
>>           yank_unregister_function(BLOCKDEV_YANK_INSTANCE(s->bs->node_name),
>>                                    nbd_yank, s->bs);
>> diff --git a/include/io/channel.h b/include/io/channel.h
>> index 88988979f8..bc5af8cdd6 100644
>> --- a/include/io/channel.h
>> +++ b/include/io/channel.h
>> @@ -78,6 +78,7 @@ struct QIOChannel {
>>       AioContext *ctx;
>>       Coroutine *read_coroutine;
>>       Coroutine *write_coroutine;
>> +    bool force_quit;
>>   #ifdef _WIN32
>>       HANDLE event; /* For use with GSource on Win32 */
>>   #endif
>> @@ -484,6 +485,24 @@ int qio_channel_set_blocking(QIOChannel *ioc,
>>                                bool enabled,
>>                                Error **errp);
>>   
>> +/**
>> + * qio_channel_set_force_quit:
>> + * @ioc: the channel object
>> + * @quit: the new flag state
>> + *
>> + * Set the force_quit flag on the channel
>> + */
>> +void qio_channel_set_force_quit(QIOChannel *ioc, bool quit);
>> +
>> +/**
>> + * qio_channel_coroutines_wake:
>> + * @ioc: the channel object
>> + *
>> + * Wake up the coroutines to ensure that they will exit normally
>> + * when the server terminates abnormally.
>> + */
>> +void qio_channel_coroutines_wake(QIOChannel *ioc);
>> +
>>   /**
>>    * qio_channel_close:
>>    * @ioc: the channel object
>> diff --git a/io/channel.c b/io/channel.c
>> index e8b019dc36..bc1a9e055c 100644
>> --- a/io/channel.c
>> +++ b/io/channel.c
>> @@ -136,6 +136,9 @@ int qio_channel_readv_full_all_eof(QIOChannel *ioc,
>>           if (len == QIO_CHANNEL_ERR_BLOCK) {
>>               if (qemu_in_coroutine()) {
>>                   qio_channel_yield(ioc, G_IO_IN);
>> +                if (ioc->force_quit) {
>> +                    goto cleanup;
>> +                }
>>               } else {
>>                   qio_channel_wait(ioc, G_IO_IN);
>>               }
>> @@ -242,6 +245,9 @@ int qio_channel_writev_full_all(QIOChannel *ioc,
>>           if (len == QIO_CHANNEL_ERR_BLOCK) {
>>               if (qemu_in_coroutine()) {
>>                   qio_channel_yield(ioc, G_IO_OUT);
>> +                if (ioc->force_quit) {
>> +                    goto cleanup;
>> +                }
>>               } else {
>>                   qio_channel_wait(ioc, G_IO_OUT);
>>               }
>> @@ -543,6 +549,9 @@ void coroutine_fn qio_channel_yield(QIOChannel *ioc,
>>       }
>>       qio_channel_set_aio_fd_handlers(ioc);
>>       qemu_coroutine_yield();
>> +    if (ioc->force_quit) {
>> +        return;
>> +    }
>>   
>>       /* Allow interrupting the operation by reentering the coroutine other than
>>        * through the aio_fd_handlers. */
>> @@ -555,6 +564,19 @@ void coroutine_fn qio_channel_yield(QIOChannel *ioc,
>>       }
>>   }
>>   
>> +void qio_channel_set_force_quit(QIOChannel *ioc, bool quit)
>> +{
>> +    ioc->force_quit = quit;
>> +}
>> +
>> +void qio_channel_coroutines_wake(QIOChannel *ioc)
>> +{
>> +    if (ioc->read_coroutine) {
>> +        qio_channel_restart_read(ioc);
>> +    } else if (ioc->write_coroutine) {
>> +        qio_channel_restart_write(ioc);
>> +    }
>> +}
> 
> None of this should be needed AFAICT, as the poll/coroutine code should
> already break out of poll() if the socket is closed or marked shutdown.
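>
> A minimal standalone demo of that mechanism (an illustration, not QEMU
> code):
>
>     #include <poll.h>
>     #include <stdio.h>
>     #include <sys/socket.h>
>
>     int main(void)
>     {
>         int sv[2];
>         socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
>
>         /* After shutdown, poll() on the same fd returns immediately
>          * (POLLIN|POLLOUT|POLLHUP on Linux) instead of blocking, which
>          * is what lets a coroutine stuck waiting on the channel get
>          * scheduled again. */
>         shutdown(sv[0], SHUT_RDWR);
>
>         struct pollfd pfd = { .fd = sv[0], .events = POLLIN | POLLOUT };
>         int n = poll(&pfd, 1, -1);
>         printf("poll=%d revents=0x%x\n", n, pfd.revents);
>         return 0;
>     }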
> 
> 
> Regards,
> Daniel
> 


--
Best regards,
Vladimir
