[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines t
From: |
Peter Lieven |
Subject: |
Re: [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete |
Date: |
Mon, 24 Apr 2017 20:16:56 +0200 |
> Am 24.04.2017 um 18:27 schrieb Anton Nefedov <address@hidden>:
>
>> On 04/21/2017 03:37 PM, Peter Lieven wrote:
>>> Am 21.04.2017 um 14:19 schrieb Anton Nefedov:
>>>> On 04/21/2017 01:44 PM, Peter Lieven wrote:
>>>>> Am 21.04.2017 um 12:04 schrieb Anton Nefedov:
>>>>> On error path (like i/o error in one of the coroutines), it's required to
>>>>> - wait for coroutines completion before cleaning the common structures
>>>>> - reenter dependent coroutines so they ever finish
>>>>>
>>>>> Introduced in 2d9187bc65.
>>>>>
>>>>> Signed-off-by: Anton Nefedov <address@hidden>
>>>>> ---
>>>>> [..]
>>>>>
>>>>
>>>>
>>>> And what if we error out in the read path? Wouldn't be something like this
>>>> easier?
>>>>
>>>>
>>>> diff --git a/qemu-img.c b/qemu-img.c
>>>> index 22f559a..4ff1085 100644
>>>> --- a/qemu-img.c
>>>> +++ b/qemu-img.c
>>>> @@ -1903,6 +1903,16 @@ static int convert_do_copy(ImgConvertState *s)
>>>> main_loop_wait(false);
>>>> }
>>>>
>>>> + /* on error path we need to enter all coroutines that are still
>>>> + * running before cleaning up common structures */
>>>> + if (s->ret) {
>>>> + for (i = 0; i < s->num_coroutines; i++) {
>>>> + if (s->co[i]) {
>>>> + qemu_coroutine_enter(s->co[i]);
>>>> + }
>>>> + }
>>>> + }
>>>> +
>>>> if (s->compressed && !s->ret) {
>>>> /* signal EOF to align */
>>>> ret = blk_pwrite_compressed(s->target, 0, NULL, 0);
>>>>
>>>>
>>>> Peter
>>>>
>>>
>>> seemed a bit too daring to me to re-enter every coroutine potentially
>>> including the ones that yielded waiting for I/O completion.
>>> If that's ok - that is for sure easier :)
>>
>> I think we should enter every coroutine that is still running and have it
>> terminate correctly. It was my mistake that I have not
>> done this in the original patch. Can you check if the above fixes your test
>> cases that triggered the bug?
>>
>
> hi, sorry I'm late with the answer
>
> this segfaults in bdrv_close(). Looks like it tries to finish some i/o which
> coroutine we have already entered and terminated?
>
> (gdb) run
> Starting program: /vz/anefedov/qemu-build/us/./qemu-img convert -O qcow2
> ./harddisk.hdd.c ./harddisk.hdd
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [New Thread 0x7fffeac2d700 (LWP 436020)]
> [New Thread 0x7fffe4ed6700 (LWP 436021)]
> qemu-img: error while reading sector 20480: Input/output error
> qemu-img: error while writing sector 19712: Operation now in progress
>
> Program received signal SIGSEGV, Segmentation fault.
> aio_co_wake (co=0x0) at /mnt/code/us-qemu/util/async.c:454
> 454 ctx = atomic_read(&co->ctx);
> (gdb) bt
> #0 aio_co_wake (co=0x0) at /mnt/code/us-qemu/util/async.c:454
> /* [Anton]: thread_pool_co_cb () here */
> #1 0x0000555555634629 in thread_pool_completion_bh (opaque=0x555555cfe020)
> at /mnt/code/us-qemu/util/thread-pool.c:189
> #2 0x0000555555633b31 in aio_bh_call (bh=0x555555cfe0f0) at
> /mnt/code/us-qemu/util/async.c:90
> #3 aio_bh_poll (address@hidden) at /mnt/code/us-qemu/util/async.c:118
> #4 0x0000555555636f14 in aio_poll (address@hidden, blocking=<optimized out>)
> at /mnt/code/us-qemu/util/aio-posix.c:682
> #5 0x00005555555c52d4 in bdrv_drain_recurse (address@hidden) at
> /mnt/code/us-qemu/block/io.c:164
> #6 0x00005555555c5aed in bdrv_drained_begin (address@hidden) at
> /mnt/code/us-qemu/block/io.c:248
> #7 0x0000555555581443 in bdrv_close (bs=0x555555d22560) at
> /mnt/code/us-qemu/block.c:2909
> #8 bdrv_delete (bs=0x555555d22560) at /mnt/code/us-qemu/block.c:3100
> #9 bdrv_unref (bs=0x555555d22560) at /mnt/code/us-qemu/block.c:4087
> #10 0x00005555555baf44 in blk_remove_bs (address@hidden) at
> /mnt/code/us-qemu/block/block-backend.c:552
> #11 0x00005555555bb173 in blk_delete (blk=0x555555d22380) at
> /mnt/code/us-qemu/block/block-backend.c:238
> #12 blk_unref (address@hidden) at /mnt/code/us-qemu/block/block-backend.c:282
> #13 0x000055555557a22c in img_convert (argc=<optimized out>, argv=<optimized
> out>) at /mnt/code/us-qemu/qemu-img.c:2359
> #14 0x0000555555574189 in main (argc=5, argv=0x7fffffffe4a0) at
> /mnt/code/us-qemu/qemu-img.c:4464
>
>
>> Peter
>>
>
> /Anton
>
it seems that this is a bit tricky, can you share how your test case works?
thanks,
peter
- [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete, Anton Nefedov, 2017/04/21
- Re: [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete, Peter Lieven, 2017/04/21
- Re: [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete, Anton Nefedov, 2017/04/21
- Re: [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete, Peter Lieven, 2017/04/21
- Re: [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete, Anton Nefedov, 2017/04/24
- Re: [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete,
Peter Lieven <=
- Re: [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete, Anton Nefedov, 2017/04/24
- Re: [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete, Peter Lieven, 2017/04/25
- Re: [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete, Peter Lieven, 2017/04/25
- Re: [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete, Anton Nefedov, 2017/04/25