qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: acceptance-system-fedora failures


From: Philippe Mathieu-Daudé
Subject: Re: acceptance-system-fedora failures
Date: Thu, 8 Oct 2020 12:26:27 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0

On 10/7/20 3:11 PM, Pavel Dovgalyuk wrote:
> On 07.10.2020 15:49, Philippe Mathieu-Daudé wrote:
>> On 10/7/20 2:20 PM, Pavel Dovgalyuk wrote:
>>> On 07.10.2020 14:22, Alex Bennée wrote:
>>>>
>>>> Philippe Mathieu-Daudé <philmd@redhat.com> writes:
>>>>
>>>>> On 10/7/20 10:51 AM, Pavel Dovgalyuk wrote:
>>>>>> On 07.10.2020 11:23, Thomas Huth wrote:
>>>>>>> On 07/10/2020 09.13, Philippe Mathieu-Daudé wrote:
>>>>>>> Thanks, that was helpful. ... and the winner is:
>>>>>>>
>>>>>>>        commit   55adb3c45620c31f29978f209e2a44a08d34e2da
>>>>>>>        Author:  John Snow <jsnow@redhat.com>
>>>>>>>        Date:    Fri Jul 24 01:23:00 2020 -0400
>>>>>>>        Subject: ide: cancel pending callbacks on SRST
>>>>>>>
>>>>>>> ... starting with this commit, the tests starts failing. John, any
>>>>>>> idea what
>>>>>>> might be causing this?
>>>>>>
>>>>>> This patch includes the following lines:
>>>>>>
>>>>>> +        aio_bh_schedule_oneshot(qemu_get_aio_context(),
>>>>>> +                                ide_bus_perform_srst, bus);
>>>>>>
>>>>>> replay_bh_schedule_oneshot_event should be used instead of this
>>>>>> function, because it synchronizes non-deterministic BHs.
>>>>>
>>>>> Why do we have 2 different functions? BH are already complex
>>>>> enough, and we need to also think about the replay API...
>>>>>
>>>>> What about the other cases such vhost-user (blk/net), virtio-blk?
>>>>
>>>> This does seem like something that should be wrapped up inside
>>>> aio_bh_schedule_oneshot itself or maybe we need a
>>>> aio_bh_schedule_transaction_oneshot to distinguish it from the other
>>>> uses the function has.
>>>>
>>>
>>> Maybe there should be two functions:
>>> - one for the guest modification
>>
>> aio_bh_schedule_oneshot_deterministic()?
>>
>>> - one for internal qemu things
>>
>> Not sure why there is a difference, BH are used to
>> avoid delaying the guest, so there always something
>> related to "guest modification".
> 
> Not exactly. At least there is one non-related-to-guest case
> in monitor_init_qmp:
>         /*
>          * We can't call qemu_chr_fe_set_handlers() directly here
>          * since chardev might be running in the monitor I/O
>          * thread.  Schedule a bottom half.
>          */
>         aio_bh_schedule_oneshot(iothread_get_aio_context(mon_iothread),
>                                 monitor_qmp_setup_handlers_bh, mon);

I don't understand the documentation in docs/devel/replay.txt:

---
Bottom halves
=============

Bottom half callbacks, that affect the guest state, should be invoked
through
replay_bh_schedule_event or replay_bh_schedule_oneshot_event functions.
Their invocations are saved in record mode and synchronized with the
existing
log in replay mode.
---

But then it is only used in block drivers, which are not
related to guest state:

$ git grep replay_bh_schedule_oneshot_event
block/block-backend.c:1385:
replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
block/block-backend.c:1450:
replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
block/io.c:371:
replay_bh_schedule_oneshot_event(bdrv_get_aio_context(bs),
block/iscsi.c:286:
replay_bh_schedule_oneshot_event(iTask->iscsilun->aio_context,
block/nfs.c:262:
replay_bh_schedule_oneshot_event(task->client->aio_context,
block/null.c:183:
replay_bh_schedule_oneshot_event(bdrv_get_aio_context(bs),
block/nvme.c:323:        replay_bh_schedule_oneshot_event(q->aio_context,
block/nvme.c:1075:    replay_bh_schedule_oneshot_event(data->ctx,
nvme_rw_cb_bh, data);
block/rbd.c:865:
replay_bh_schedule_oneshot_event(bdrv_get_aio_context(acb->common.bs),
docs/devel/replay.txt:25:replay_bh_schedule_event or
replay_bh_schedule_oneshot_event functions.
include/sysemu/replay.h:178:void
replay_bh_schedule_oneshot_event(AioContext *ctx,
replay/replay-events.c:141:void
replay_bh_schedule_oneshot_event(AioContext *ctx,
stubs/replay-user.c:5:void replay_bh_schedule_oneshot_event(AioContext *ctx,

Is replay_bh_schedule_oneshot_event ever used by replay?
Maybe we can remove it and use aio_bh_schedule_oneshot()
in place?

Else the documentation need to be clarified please.

> 
> 
>>
>>>
>>> The first one may be implemented though the rr+second one.
>>> Maybe replay_ prefix is confusing and the whole BH interface should look
>>> like that?
>>
>> Yes, but it would be safer/clearer if we don't need to use
>> a replay_ API.
>>
>> Can we embed these functions?
>>
>> - replay_bh_schedule_event
>> - replay_bh_schedule_oneshot_event
>>
>> If compiled without rr, events_enabled=false and
>> compiler can optimize:
>>
>> -- >8 --
>> diff --git a/util/async.c b/util/async.c
>> index f758354c6a..376b6d4e27 100644
>> --- a/util/async.c
>> +++ b/util/async.c
>> @@ -109,14 +109,19 @@ static QEMUBH *aio_bh_dequeue(BHList *head,
>> unsigned *flags)
>>
>>   void aio_bh_schedule_oneshot(AioContext *ctx, QEMUBHFunc *cb, void
>> *opaque)
>>   {
>> -    QEMUBH *bh;
>> -    bh = g_new(QEMUBH, 1);
>> -    *bh = (QEMUBH){
>> -        .ctx = ctx,
>> -        .cb = cb,
>> -        .opaque = opaque,
>> -    };
>> -    aio_bh_enqueue(bh, BH_SCHEDULED | BH_ONESHOT);
>> +    if (events_enabled) {
>> +        replay_add_event(REPLAY_ASYNC_EVENT_BH_ONESHOT, cb,
>> +                         opaque, replay_get_current_icount());
>> +    } else {
>> +        QEMUBH *bh;
>> +        bh = g_new(QEMUBH, 1);
>> +        *bh = (QEMUBH){
>> +            .ctx = ctx,
>> +            .cb = cb,
>> +            .opaque = opaque,
>> +        };
>> +        aio_bh_enqueue(bh, BH_SCHEDULED | BH_ONESHOT);
>> +    }
>>   }
>>
>>   QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
>> @@ -178,7 +183,12 @@ void qemu_bh_schedule_idle(QEMUBH *bh)
>>
>>   void qemu_bh_schedule(QEMUBH *bh)
> 
> qemu_bh_schedule is even worse.
> Many modules use it (e.g., loadvm_postcopy_handle_run), and there is no
> need to use replay version there. In such cases QEMU will halt if trying
> to call replay_bh_schedule_event.
> 
>>   {
>> -    aio_bh_enqueue(bh, BH_SCHEDULED);
>> +    if (events_enabled) {
>> +        replay_add_event(REPLAY_ASYNC_EVENT_BH, bh, NULL,
>> +                         replay_get_current_icount());
>> +    } else {
>> +        aio_bh_enqueue(bh, BH_SCHEDULED);
>> +    }
>>   }
>>
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]