Re: [PULL 29/32] virtio-blk: implement BlockDevOps->drained_begin()


From: Fiona Ebner
Subject: Re: [PULL 29/32] virtio-blk: implement BlockDevOps->drained_begin()
Date: Thu, 7 Dec 2023 16:22:11 +0100
User-agent: Mozilla Thunderbird

On 03.11.23 at 14:12, Fiona Ebner wrote:
> Hi,
> 
> A while ago, I ran into a strange issue where guest IO would get
> completely stuck during certain block jobs, and I finally managed to
> find a small reproducer [0]. I'm using a VM with virtio-blk-pci (or
> virtio-scsi-pci) with an iothread and running
> 
> fio --name=file --size=100M --direct=1 --rw=randwrite --bs=4k
> --ioengine=psync --numjobs=5 --runtime=1200 --time_based
> 
> in the guest. Then I'm issuing the QMP command with the reproducer in
> a loop. Usually, the guest IO will get stuck after about 1-3 minutes;
> sometimes fio manages to continue at a lower speed for a while (but
> trying to Ctrl+C it or doing other IO in the guest is already broken),
> which I guess could be a hint that it's an issue with notifiers?
> 
> Bisecting (to declare a commit good, I waited 10 minutes) led me to this
> patch, i.e. commit 1665d9326f ("virtio-blk: implement
> BlockDevOps->drained_begin()") and for SCSI, I verified that the issue
> similarly starts happening after 766aa2de0f ("virtio-scsi: implement
> BlockDevOps->drained_begin()").
> 
> Both issues are still present on current master (i.e. 1c98a821a2
> ("tests/qtest: Introduce tests for AMD/Xilinx Versal TRNG device")).
> 
> Happy to provide more information and hints about how to debug the issue
> further.
> 

I think I was finally able to get to the bottom of this and have a
plausible-sounding pet theory now. It involves the VirtIO notifier
optimization during poll mode.

Let's step through some debug prints I added. The first number is
always the thread ID (I'm sorry that I used warn_report rather than
proper tracing):

> 247050 nodefd 29 poll_set_started 1

The iothread starts poll mode for the node with fd 29, which is the
virtio host notifier.

> 247050 0x55e515185270 poll begin for vq
> 247050 0x55e515185270 setting notification for vq 0

virtio_queue_set_notification() is called to disable guest
notifications for the queue.
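
For context, the poll-begin/end callbacks installed for the host
notifier are what toggle this; roughly (simplified from
hw/virtio/virtio.c, details may differ on current master):

static void virtio_queue_host_notifier_aio_poll_begin(EventNotifier *n)
{
    VirtQueue *vq = container_of(n, VirtQueue, host_notifier);

    /* Entering poll mode: suppress guest notifications (kicks) */
    virtio_queue_set_notification(vq, 0);
}

static void virtio_queue_host_notifier_aio_poll_end(EventNotifier *n)
{
    VirtQueue *vq = container_of(n, VirtQueue, host_notifier);

    /* Leaving poll mode: let the guest notify us again */
    virtio_queue_set_notification(vq, 1);
}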

> 247050 nodefd 29 poll_set_started 1 done
> 247050 0x55e515185270 handle vq suppress_notifications 0 num_reqs 1
> 247050 0x55e515185270 handle vq suppress_notifications 0 num_reqs 4

virtio-blk handles some requests; note that suppress_notifications is 0
because we are in poll mode.

> 247048 nodefd 29 addr 0x55e51496ed70 marking as deleted

The main thread marks the node for deletion when beginning drain, i.e.
it detaches the host notifier.

> 247048 nodefd 29 addr 0x55e513cdcd20 is_new 1 adding node

The main thread adds a new node when ending drain, i.e. it re-attaches
the host notifier.
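
My mental model of what commit 1665d9326f does is roughly the
following (simplified sketch, not the literal code; drained_end is the
mirror image using virtio_queue_aio_attach_host_notifier()):

/* Called in the main thread when drain begins */
static void virtio_blk_drained_begin(void *opaque)
{
    VirtIOBlock *s = opaque;
    VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
    AioContext *ctx = blk_get_aio_context(s->conf.conf.blk);

    for (uint16_t i = 0; i < s->conf.num_queues; i++) {
        VirtQueue *vq = virtio_get_queue(vdev, i);

        /* Detaching marks the old AioHandler node as deleted */
        virtio_queue_aio_detach_host_notifier(vq, ctx);
    }
}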

> 247050 nodefd 29 addr 0x55e51496ed70 remove deleted handler

The iothread removes the handler that was marked for removal, in
particular from the node_poll list: QLIST_SAFE_REMOVE(node, node_poll);
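
That QLIST_SAFE_REMOVE() is part of the deleted-handler cleanup in
util/aio-posix.c, which as far as I can tell looks roughly like this:

static void aio_free_deleted_handlers(AioContext *ctx)
{
    AioHandler *node;

    /* ... list_lock handling elided ... */
    while ((node = QLIST_FIRST_RCU(&ctx->deleted_aio_handlers))) {
        QLIST_REMOVE(node, node);
        QLIST_REMOVE(node, node_deleted);
        /* This also drops the node from the poll handler list */
        QLIST_SAFE_REMOVE(node, node_poll);
        g_free(node);
    }
}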

> 247050 disabling poll mode before fdmon_ops->wait

This is just before the call to
poll_set_started(ctx, &ready_list, false).
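
That spot is where aio_poll() tears down poll mode before blocking;
the surrounding code is roughly (from memory, util/aio-posix.c):

    if (timeout || ctx->fdmon_ops->need_wait(ctx)) {
        /* Disable poll mode before blocking in ->wait() */
        if (poll_set_started(ctx, &ready_list, false)) {
            timeout = 0;
            progress = true;
        }

        ctx->fdmon_ops->wait(ctx, &ready_list, timeout);
    }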

Whoops!! Nobody ends poll mode for the node with fd 29: the old node
has already been deleted from the node_poll list, and the new node is
not part of it, because nobody ever started poll mode for the new node.
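
My reading of poll_set_started() (again roughly from util/aio-posix.c;
the ready-list bookkeeping is omitted) is that it only walks the
node_poll list, so the freshly attached node never gets its
io_poll_end() callback and virtio never re-enables guest notifications:

static bool poll_set_started(AioContext *ctx, AioHandlerList *ready_list,
                             bool started)
{
    AioHandler *node;
    bool progress = false;

    if (started == ctx->poll_started) {
        return false;
    }
    ctx->poll_started = started;

    /* Only nodes on the node_poll list are visited here! */
    QLIST_FOREACH(node, &ctx->poll_aio_handlers, node_poll) {
        IOHandler *fn;

        if (QLIST_IS_INSERTED(node, node_deleted)) {
            continue;
        }

        fn = started ? node->io_poll_begin : node->io_poll_end;
        if (fn) {
            fn(node->opaque);
        }

        /* ... "poll one last time" / ready_list handling elided ... */
    }

    return progress;
}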

> 247050 0x55e515185270 handle vq suppress_notifications 0 num_reqs 0

fdmon_ops->wait() returns one last time (not sure why), but there are
no actual requests.

> 247050 disabling poll mode before fdmon_ops->wait

After this, fdmon_ops->wait() (fdmon_poll_wait in my case) will just
wait forever (or until QMP 'stop' and 'cont' is issued, which restarts
the dataplane), presumably because guest notifications for the queue
were never re-enabled, so the guest never kicks the host notifier
again.


A minimal workaround seems to be calling either
event_notifier_set(virtio_queue_get_host_notifier(vq));
or
virtio_queue_set_notification(vq, true);
in drained_end (for both VirtIO SCSI and block).
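
For illustration, in virtio-blk such a workaround could look roughly
like this in drained_end (sketch only, same assumed field names as in
the sketch above; the SCSI side would be analogous):

static void virtio_blk_drained_end(void *opaque)
{
    VirtIOBlock *s = opaque;
    VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
    AioContext *ctx = blk_get_aio_context(s->conf.conf.blk);

    for (uint16_t i = 0; i < s->conf.num_queues; i++) {
        VirtQueue *vq = virtio_get_queue(vdev, i);

        /*
         * Workaround: make sure guest notifications are enabled again,
         * since nobody ran the io_poll_end() callback for the old
         * AioHandler node. Alternatively, kicking the host notifier via
         * event_notifier_set(virtio_queue_get_host_notifier(vq)) also
         * gets things moving again.
         */
        virtio_queue_set_notification(vq, true);

        virtio_queue_aio_attach_host_notifier(vq, ctx);
    }
}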

But is this an actual issue with the AIO interface/implementation? Or
should it rather be considered a bug in the VirtIO SCSI/block drain
implementation, because of the notification optimization?

Best Regards,
Fiona

> 
> [0]:
> 
>> diff --git a/blockdev.c b/blockdev.c
>> index db2725fe74..bf2e0fc22c 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -2986,6 +2986,11 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>>      bool zero_target;
>>      int ret;
>>  
>> +    bdrv_drain_all_begin();
>> +    bdrv_drain_all_end();
>> +    return;
>> +
>> +
>>      bs = qmp_get_root_bs(arg->device, errp);
>>      if (!bs) {
>>          return;
> 
> 
> 



