[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device
From: |
Bharat Bhushan |
Subject: |
Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device |
Date: |
Tue, 11 Jul 2017 05:54:35 +0000 |
Hi Jean,
> -----Original Message-----
> From: Jean-Philippe Brucker [mailto:address@hidden
> Sent: Friday, July 07, 2017 8:50 PM
> To: Bharat Bhushan <address@hidden>; Auger Eric
> <address@hidden>; address@hidden;
> address@hidden; address@hidden; address@hidden;
> address@hidden; address@hidden
> Cc: address@hidden; address@hidden; address@hidden;
> address@hidden; address@hidden; address@hidden;
> address@hidden; address@hidden
> Subject: Re: [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device
>
> On 07/07/17 12:36, Bharat Bhushan wrote:
> >>> In this proposal, QEMU reserves a iova-range for guest (not host) and
> guest
> >> kernel will use this as msi-iova untranslated (IOMMU_RESV_MSI). While
> this
> >> does not change host interface and it will continue to use host reserved
> >> mapping for actual interrupt generation, no?
> >> But then userspace needs to provide IOMMU_RESV_MSI range to guest
> >> kernel, right? What would be the proposed manner?
> >
> > Just an opinion, we can define feature
> (VIRTIO_IOMMU_F_RES_MSI_RANGE) and provide this info via a command
> (VIRTIO_IOMMU_T_MSI_RANGE). Guest iommu-driver will make this call
> during initialization and store the value. This value will just replace
> MSI_IOVA_BASE and MSI_IOVA_LENGHT hash define. Rest will remain same
> in virtio-iommu driver.
>
> Yes I had something similar in mind, although more generic since we'll
> need to get other bits of information from the device in future extensions
> (fault handling, page table formats and dynamic reserves of memory for
> SVM), and maybe also for finding out per-address-space page granularity
> (see my reply of patch 3/8). These are per-endpoint properties that cannot
> be advertise in the virtio config space.
>
> ***
>
> So I propose to add a per-endpoint probing mechanism on the request
> queue:
What is per-endpoint? Is it "per-pci/platform-device"?
>
> * The device advertises a new command VIRTIO_IOMMU_T_PROBE with
> feature
> bit VIRTIO_IOMMU_F_PROBE.
> * When this feature is advertised, the device sets probe_size field in the
> the config space.
Probably I did not get how virtio-iommu device emulation decides value of
"probe_size", can you share more info?
> * When device offers VIRTIO_IOMMU_F_PROBE, the driver should send an
> VIRTIO_IOMMU_T_PROBE request for each new endpoint.
> * The driver allocates a device-writeable buffer of probe_size (plus
> framing) and sends it as a VIRTIO_IOMMU_T_PROBE request.
> * The device fills the buffer with various information.
>
> struct virtio_iommu_req_probe {
> /* device-readable */
> struct virtio_iommu_req_head head;
> le32 device;
> le32 flags;
>
> /* maybe also le32 content_size, but it must be equal to
> probe_size */
Can you please describe why we need to pass size of "probe_size" in probe
request?
>
> /* device-writeable */
> u8 content[];
I assume content_size above is the size of array "content[]" and max value can
be equal to probe_size advertised by device?
> struct virtio_iommu_req_tail tail;
> };
>
> I'm still struggling with the content and layout of the probe request, and
> would appreciate any feedback. To be easily extended, I think it should
> contain a list of fields of variable size:
>
> |0 15|16 31|32 N|
> | type | length | values |
>
> 'length' might be made optional if it can be deduced from type, but might
> make driver-side parsing more robust.
>
> The probe could either be done for each endpoint, or for each address
> space. I much prefer endpoint because it is the smallest granularity. The
> driver can then decide what endpoints to put together in the same address
> space based on their individual capabilities. The specification would
> described how each endpoint property is combined when endpoints are put
> in
> the same address space. For example, take the minimum of all PASID size,
> the maximum of all page granularities, combine doorbell addresses, etc.
>
> If we did the probe on address spaces instead, the driver would have to
> re-send a probe request each time a new endpoint is attached to an
> existing address space, to see if it is still capable of page table
> handover or if the driver just combined a VFIO and an emulated endpoint by
> accident.
>
> ***
>
> Using this framework, the device can declare doorbell regions by adding
> one or more RESV fields into the probe buffer:
>
> /* 'type' */
> #define VIRTIO_IOMMU_PROBE_T_RESV 0x1
>
> /* 'values'. 'length' is sizeof(struct virtio_iommu_probe_resv) */
> struct virtio_iommu_probe_resv {
> le64 gpa;
> le64 size;
>
> #define VIRTIO_IOMMU_PROBE_RESV_MSI 0x1
> u8 type;
type is 16 bit above?
> };
>
> Such a region would be subject to the following rules:
>
> * Driver should not use any IOVA declared as RESV_MSI in a mapping.
> * Device should leave any transaction matching a RESV_MSI region pass
> through untranslated.
> * If the device does not advertise any RESV region, then the driver should
> assume that MSI doorbells, like any other GPA, must be mapped with an
> arbitrary IOVA in order for the endpoint to access them.
> * Given that the driver *should* perform a probe request if available, and
> it *should* understand the VIRTIO_IOMMU_PROBE_T_RESV field, then this
> field tells the guest how it should handle MSI doorbells, and whether it
> should map the address via MAP requests or not.
>
> Does this make sense and did I overlook something?
Overall it looks good to me. Do you have plans to implements this in
virtio-iommu driver and kvmtool?
Thanks
-Bharat
>
> Thanks,
> Jean
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, (continued)
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Michael S. Tsirkin, 2017/07/06
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Tian, Kevin, 2017/07/06
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Jean-Philippe Brucker, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Kalra, Ashish, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Jean-Philippe Brucker, 2017/07/11
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Tian, Kevin, 2017/07/14
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Bharat Bhushan, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Auger Eric, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Bharat Bhushan, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Jean-Philippe Brucker, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device,
Bharat Bhushan <=
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Jean-Philippe Brucker, 2017/07/11
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Bharat Bhushan, 2017/07/11
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Jean-Philippe Brucker, 2017/07/12
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Bharat Bhushan, 2017/07/12
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Jean-Philippe Brucker, 2017/07/12
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Bharat Bhushan, 2017/07/12
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Auger Eric, 2017/07/06
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Auger Eric, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Jean-Philippe Brucker, 2017/07/07
Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Tian, Kevin, 2017/07/05