[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC v3 3/3] vhost: Allocate memory for packed vring
From: |
Sahil |
Subject: |
Re: [RFC v3 3/3] vhost: Allocate memory for packed vring |
Date: |
Tue, 13 Aug 2024 01:02:03 +0530 |
Hi,
On Monday, August 12, 2024 12:01:00 PM GMT+5:30 you wrote:
> On Sun, Aug 11, 2024 at 7:20 PM Sahil <icegambit91@gmail.com> wrote:
> > On Wednesday, August 7, 2024 9:52:10 PM GMT+5:30 Eugenio Perez Martin wrote:
> > > On Fri, Aug 2, 2024 at 1:22 PM Sahil Siddiq <icegambit91@gmail.com> wrote:
> > > > [...]
> > > > @@ -726,17 +738,30 @@ void vhost_svq_start(VhostShadowVirtqueue *svq,
> > > > VirtIODevice *vdev,
> > > > svq->vring.num = virtio_queue_get_num(vdev,
> > > > virtio_get_queue_index(vq));
> > > > svq->num_free = svq->vring.num;
> > > >
> > > > - svq->vring.desc = mmap(NULL, vhost_svq_driver_area_size(svq),
> > > > - PROT_READ | PROT_WRITE, MAP_SHARED |
> > > > MAP_ANONYMOUS,
> > > > - -1, 0);
> > > > - desc_size = sizeof(vring_desc_t) * svq->vring.num;
> > > > - svq->vring.avail = (void *)((char *)svq->vring.desc + desc_size);
> > > > - svq->vring.used = mmap(NULL, vhost_svq_device_area_size(svq),
> > > > - PROT_READ | PROT_WRITE, MAP_SHARED |
> > > > MAP_ANONYMOUS,
> > > > - -1, 0);
> > > > - svq->desc_state = g_new0(SVQDescState, svq->vring.num);
> > > > - svq->desc_next = g_new0(uint16_t, svq->vring.num);
> > > > - for (unsigned i = 0; i < svq->vring.num - 1; i++) {
> > > > + svq->is_packed = virtio_vdev_has_feature(svq->vdev,
> > > > VIRTIO_F_RING_PACKED);
> > > > +
> > > > + if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_RING_PACKED)) {
> > > > + svq->vring_packed.vring.desc = mmap(NULL,
> > > > vhost_svq_memory_packed(svq),
> > > > + PROT_READ | PROT_WRITE,
> > > > MAP_SHARED | MAP_ANONYMOUS,
> > > > + -1, 0);
> > > > + desc_size = sizeof(struct vring_packed_desc) * svq->vring.num;
> > > > + svq->vring_packed.vring.driver = (void *)((char
> > > > *)svq->vring_packed.vring.desc + desc_size);
> > > > + svq->vring_packed.vring.device = (void *)((char
> > > > *)svq->vring_packed.vring.driver +
> > > > + sizeof(struct
> > > > vring_packed_desc_event));
> > >
> > > This is a great start but it will be problematic when you start
> > > mapping the areas to the vdpa device. The driver area should be read
> > > only for the device, but it is placed in the same page as a RW one.
> > >
> > > More on this later.
> > >
> > > > + } else {
> > > > + svq->vring.desc = mmap(NULL, vhost_svq_driver_area_size(svq),
> > > > + PROT_READ | PROT_WRITE, MAP_SHARED
> > > > |MAP_ANONYMOUS,
> > > > + -1, 0);
> > > > + desc_size = sizeof(vring_desc_t) * svq->vring.num;
> > > > + svq->vring.avail = (void *)((char *)svq->vring.desc +
> > > > desc_size);
> > > > + svq->vring.used = mmap(NULL, vhost_svq_device_area_size(svq),
> > > > + PROT_READ | PROT_WRITE, MAP_SHARED
> > > > |MAP_ANONYMOUS,
> > > > + -1, 0);
> > > > + }
> > >
> > > I think it will be beneficial to avoid "if (packed)" conditionals on
> > > the exposed functions that give information about the memory maps.
> > > These need to be replicated at
> > > hw/virtio/vhost-vdpa.c:vhost_vdpa_svq_map_rings.
> > >
> > > However, the current one depends on the driver area to live in the
> > > same page as the descriptor area, so it is not suitable for this.
> >
> > I haven't really understood this.
> >
> > In split vqs the descriptor, driver and device areas are mapped to RW pages.
> > In vhost_vdpa.c:vhost_vdpa_svq_map_rings, the regions are mapped with
> > the appropriate "perm" field that sets the R/W permissions in the DMAMap
> > object. Is this problematic for the split vq format because the avail ring
> > is
> > anyway mapped to a RW page in "vhost_svq_start"?
> >
>
> Ok so maybe the map word is misleading here. The pages needs to be
> allocated for the QEMU process with both PROT_READ | PROT_WRITE, as
> QEMU needs to write into it.
>
> They are mapped to the device with vhost_vdpa_dma_map, and the last
> bool parameter indicates if the device needs write permissions or not.
> You can see how hw/virtio/vhost-vdpa.c:vhost_vdpa_svq_map_ring checks
> the needle permission for this, and the needle permissions are stored
> at hw/virtio/vhost-vdpa.c:vhost_vdpa_svq_map_rings. This is the
> function that needs to check for the maps permissions.
>
I think I have understood what's going on in "vhost_vdpa_svq_map_rings",
"vhost_vdpa_svq_map_ring" and "vhost_vdpa_dma_map". But based on
what I have understood it looks like the driver area is getting mapped to
an iova which is read-only for vhost_vdpa. Please let me know where I am
going wrong.
Consider the following implementation in hw/virtio/vhost_vdpa.c:
> size_t device_size = vhost_svq_device_area_size(svq);
> size_t driver_size = vhost_svq_driver_area_size(svq);
The driver size includes the descriptor area and the driver area. For
packed vq, the driver area is the "driver event suppression" structure
which should be read-only for the device according to the virtio spec
(section 2.8.10) [1].
> size_t avail_offset;
> bool ok;
>
> vhost_svq_get_vring_addr(svq, &svq_addr);
Over here "svq_addr.desc_user_addr" will point to the descriptor area
while "svq_addr.avail_user_addr" will point to the driver area/driver
event suppression structure.
> driver_region = (DMAMap) {
> .translated_addr = svq_addr.desc_user_addr,
> .size = driver_size - 1,
> .perm = IOMMU_RO,
> };
This region points to the descriptor area and its size encompasses the
driver area as well with RO permission.
> ok = vhost_vdpa_svq_map_ring(v, &driver_region, errp);
The above function checks the value of needle->perm and sees that it is RO.
It then calls "vhost_vdpa_dma_map" with the following arguments:
> r = vhost_vdpa_dma_map(v->shared, v->address_space_id, needle->iova,
> needle->size + 1,
> (void
> *)(uintptr_t)needle->translated_addr,
> needle->perm == IOMMU_RO);
Since needle->size includes the driver area as well, the driver area will be
mapped to a RO page in the device's address space, right?
> if (unlikely(!ok)) {
> error_prepend(errp, "Cannot create vq driver region: ");
> return false;
> }
> addr->desc_user_addr = driver_region.iova;
> avail_offset = svq_addr.avail_user_addr - svq_addr.desc_user_addr;
> addr->avail_user_addr = driver_region.iova + avail_offset;
I think "addr->desc_user_addr" and "addr->avail_user_addr" will both be
mapped to a RO page in the device's address space.
> device_region = (DMAMap) {
> .translated_addr = svq_addr.used_user_addr,
> .size = device_size - 1,
> .perm = IOMMU_RW,
> };
The device area/device event suppression structure on the other hand will
be mapped to a RW page.
I also think there are other issues with the current state of the patch.
According
to the virtio spec (section 2.8.10) [1], the "device event suppression"
structure
needs to be write-only for the device but is mapped to a RW page.
Another concern I have is regarding the driver area size for packed vq. In
"hw/virtio/vhost-shadow-virtqueue.c" of the current patch:
> size_t vhost_svq_driver_area_size(const VhostShadowVirtqueue *svq)
> {
> size_t desc_size = sizeof(vring_desc_t) * svq->vring.num;
> size_t avail_size = offsetof(vring_avail_t, ring[svq->vring.num]) +
>
> sizeof(uint16_t);
>
> return ROUND_UP(desc_size + avail_size, qemu_real_host_page_size());
> }
>
> [...]
>
> size_t vhost_svq_memory_packed(const VhostShadowVirtqueue *svq)
> {
> size_t desc_size = sizeof(struct vring_packed_desc) * svq->num_free;
> size_t driver_event_suppression = sizeof(struct vring_packed_desc_event);
> size_t device_event_suppression = sizeof(struct vring_packed_desc_event);
>
> return ROUND_UP(desc_size + driver_event_suppression +
> device_event_suppression,
> qemu_real_host_page_size());
> }
The size returned by "vhost_svq_driver_area_size" might not be the actual driver
size which is given by desc_size + driver_event_suppression, right? Will this
have to
be changed too?
Thanks,
Sahil
[1]
https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html#x1-720008
- [RFC v3 0/3] Add packed virtqueue to shadow virtqueue, Sahil Siddiq, 2024/08/02
- [RFC v3 1/3] vhost: Introduce packed vq and add buffer elements, Sahil Siddiq, 2024/08/02
- [RFC v3 2/3] vhost: Data structure changes to support packed vqs, Sahil Siddiq, 2024/08/02
- [RFC v3 3/3] vhost: Allocate memory for packed vring, Sahil Siddiq, 2024/08/02
- Re: [RFC v3 3/3] vhost: Allocate memory for packed vring, Eugenio Perez Martin, 2024/08/07
- Re: [RFC v3 3/3] vhost: Allocate memory for packed vring,
Sahil <=
- Re: [RFC v3 3/3] vhost: Allocate memory for packed vring, Eugenio Perez Martin, 2024/08/13
- Re: [RFC v3 3/3] vhost: Allocate memory for packed vring, Sahil, 2024/08/21
- Re: [RFC v3 3/3] vhost: Allocate memory for packed vring, Eugenio Perez Martin, 2024/08/27
- Re: [RFC v3 3/3] vhost: Allocate memory for packed vring, Sahil, 2024/08/30
- Re: [RFC v3 3/3] vhost: Allocate memory for packed vring, Eugenio Perez Martin, 2024/08/30
Re: [RFC v3 0/3] Add packed virtqueue to shadow virtqueue, Eugenio Perez Martin, 2024/08/07