[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support
From: |
Eugenio Perez Martin |
Subject: |
Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support |
Date: |
Mon, 25 Mar 2024 21:33:26 +0100 |
On Mon, Mar 25, 2024 at 5:52 PM Jonah Palmer <jonah.palmer@oracle.com> wrote:
>
>
>
> On 3/22/24 7:18 AM, Eugenio Perez Martin wrote:
> > On Thu, Mar 21, 2024 at 4:57 PM Jonah Palmer <jonah.palmer@oracle.com>
> > wrote:
> >>
> >> The goal of these patches is to add support to a variety of virtio and
> >> vhost devices for the VIRTIO_F_IN_ORDER transport feature. This feature
> >> indicates that all buffers are used by the device in the same order in
> >> which they were made available by the driver.
> >>
> >> These patches attempt to implement a generalized, non-device-specific
> >> solution to support this feature.
> >>
> >> The core feature behind this solution is a buffer mechanism in the form
> >> of GLib's GHashTable. The decision behind using a hash table was to
> >> leverage their ability for quick lookup, insertion, and removal
> >> operations. Given that our keys are simply numbers of an ordered
> >> sequence, a hash table seemed like the best choice for a buffer
> >> mechanism.
> >>
> >> ---------------------
> >>
> >> The strategy behind this implementation is as follows:
> >>
> >> We know that buffers that are popped from the available ring and enqueued
> >> for further processing will always done in the same order in which they
> >> were made available by the driver. Given this, we can note their order
> >> by assigning the resulting VirtQueueElement a key. This key is a number
> >> in a sequence that represents the order in which they were popped from
> >> the available ring, relative to the other VirtQueueElements.
> >>
> >> For example, given 3 "elements" that were popped from the available
> >> ring, we assign a key value to them which represents their order (elem0
> >> is popped first, then elem1, then lastly elem2):
> >>
> >> elem2 -- elem1 -- elem0 ---> Enqueue for processing
> >> (key: 2) (key: 1) (key: 0)
> >>
> >> Then these elements are enqueued for further processing by the host.
> >>
> >> While most devices will return these completed elements in the same
> >> order in which they were enqueued, some devices may not (e.g.
> >> virtio-blk). To guarantee that these elements are put on the used ring
> >> in the same order in which they were enqueued, we can use a buffering
> >> mechanism that keeps track of the next expected sequence number of an
> >> element.
> >>
> >> In other words, if the completed element does not have a key value that
> >> matches the next expected sequence number, then we know this element is
> >> not in-order and we must stash it away in a hash table until an order
> >> can be made. The element's key value is used as the key for placing it
> >> in the hash table.
> >>
> >> If the completed element has a key value that matches the next expected
> >> sequence number, then we know this element is in-order and we can push
> >> it on the used ring. Then we increment the next expected sequence number
> >> and check if the hash table contains an element at this key location.
> >>
> >> If so, we retrieve this element, push it to the used ring, delete the
> >> key-value pair from the hash table, increment the next expected sequence
> >> number, and check the hash table again for an element at this new key
> >> location. This process is repeated until we're unable to find an element
> >> in the hash table to continue the order.
> >>
> >> So, for example, say the 3 elements we enqueued were completed in the
> >> following order: elem1, elem2, elem0. The next expected sequence number
> >> is 0:
> >>
> >> exp-seq-num = 0:
> >>
> >> elem1 --> elem1.key == exp-seq-num ? --> No, stash it
> >> (key: 1) |
> >> |
> >> v
> >> ================
> >> |key: 1 - elem1|
> >> ================
> >> ---------------------
> >> exp-seq-num = 0:
> >>
> >> elem2 --> elem2.key == exp-seq-num ? --> No, stash it
> >> (key: 2) |
> >> |
> >> v
> >> ================
> >> |key: 1 - elem1|
> >> |--------------|
> >> |key: 2 - elem2|
> >> ================
> >> ---------------------
> >> exp-seq-num = 0:
> >>
> >> elem0 --> elem0.key == exp-seq-num ? --> Yes, push to used ring
> >> (key: 0)
> >>
> >> exp-seq-num = 1:
> >>
> >> lookup(table, exp-seq-num) != NULL ? --> Yes, push to used ring,
> >> remove elem from table
> >> |
> >> v
> >> ================
> >> |key: 2 - elem2|
> >> ================
> >>
> >> exp-seq-num = 2:
> >>
> >> lookup(table, exp-seq-num) != NULL ? --> Yes, push to used ring,
> >> remove elem from table
> >> |
> >> v
> >> ================
> >> | *empty* |
> >> ================
> >>
> >> exp-seq-num = 3:
> >>
> >> lookup(table, exp-seq-num) != NULL ? --> No, done
> >> ---------------------
> >>
> >
> > I think to use a hashtable to handle this has an important drawback:
> > it hurts performance on the devices that are using right in-order
> > because of hash calculus, to benefit devices that are using it badly
> > by using descriptors out of order. We should use data structs that are
> > as free as possible for the first, and we don't care to worse the
> > experience of the devices that enable in_order and they shouldn't.
> >
>
> Right, because if descriptors are coming in in-order, we still search
> the (empty) hash table.
>
> Hmm... what if we introduced a flag to see if we actually should bother
> searching the hash table? That way we avoid the cost of searching when
> we really don't need to.
>
> > So I suggest reusing vq->used_elems array vq. At each used descriptor
> > written in the used ring, you know the next head is elem->index +
> > elem->ndescs, so you can check if that element has been filled or not.
> > If used, it needs to be flushed too. If not used, just return.
> >
> > Of course virtqueue_flush also needs to take this into account.
> >
> > What do you think, does it make sense to you?
> >
>
> I'm having a bit of trouble understanding the suggestion here. Would you
> mind elaborating a bit more for me on this?
>
> For example, say elem0, elem1, and elem2 were enqueued in-order (elem0
> being first, elem2 last) and then elem2 finishes first, elem1 second,
> and elem0 third. Given that these elements finish out-of-order, how
> would you handle these out-of-order elements using your suggestion?
>
virtqueue_fill is called first with elem2. So vq->used_elems[2 %
vq->num] is filled with the needed information of the descriptor:
index, len and ndescs. idx function parameter is ignored.
Optionally, virtqueue_push is called. It checks if
vq->used_elems[vq->used_idx] is valid. valid can be elem->in_num +
elem->out_num > 0, and reset them on every used ring write. If it is
not valid, this is a no-op. Currently, it is not valid.
Same process for elem1.
virtqueue_fill is the same for elem0. But now virtqueue_flush gets
interesting, as it detects vq->used_elems[0] is used. It scans for the
first not-used element, and it finds it is vq->used_elems[3]. So it
needs to write an used elem with id = 2 and the corresponding length.
Maybe it is interesting to implement ways to improve the look for the
last used descriptor, but if any I'd go for a bitmap and always on top
of the basis series.
The algorithm has not been tested, so maybe I've missed something.
Thanks!
> Thanks :)
>
> > Thanks!
> >
> >
> >> Jonah Palmer (8):
> >> virtio: Define InOrderVQElement
> >> virtio: Create/destroy/reset VirtQueue In-Order hash table
> >> virtio: Define order variables
> >> virtio: Implement in-order handling for virtio devices
> >> virtio-net: in-order handling
> >> vhost-svq: in-order handling
> >> vhost/vhost-user: Add VIRTIO_F_IN_ORDER to vhost feature bits
> >> virtio: Add VIRTIO_F_IN_ORDER property definition
> >>
> >> hw/block/vhost-user-blk.c | 1 +
> >> hw/net/vhost_net.c | 2 +
> >> hw/net/virtio-net.c | 6 +-
> >> hw/scsi/vhost-scsi.c | 1 +
> >> hw/scsi/vhost-user-scsi.c | 1 +
> >> hw/virtio/vhost-shadow-virtqueue.c | 15 ++++-
> >> hw/virtio/vhost-user-fs.c | 1 +
> >> hw/virtio/vhost-user-vsock.c | 1 +
> >> hw/virtio/virtio.c | 103 ++++++++++++++++++++++++++++-
> >> include/hw/virtio/virtio.h | 20 +++++-
> >> net/vhost-vdpa.c | 1 +
> >> 11 files changed, 145 insertions(+), 7 deletions(-)
> >>
> >> --
> >> 2.39.3
> >>
> >
>
- Re: [RFC 4/8] virtio: Implement in-order handling for virtio devices, (continued)
[RFC 5/8] virtio-net: in-order handling, Jonah Palmer, 2024/03/21
[RFC 6/8] vhost-svq: in-order handling, Jonah Palmer, 2024/03/21
[RFC 8/8] virtio: Add VIRTIO_F_IN_ORDER property definition, Jonah Palmer, 2024/03/21
Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support, Dongli Zhang, 2024/03/21
Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support, Eugenio Perez Martin, 2024/03/22
- Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support, Jonah Palmer, 2024/03/25
- Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support,
Eugenio Perez Martin <=
- Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support, Jonah Palmer, 2024/03/26
- Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support, Eugenio Perez Martin, 2024/03/26
- Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support, Jonah Palmer, 2024/03/26