From: Liu, Yi L
Subject: Re: [Qemu-arm] [Qemu-devel] [RFC v2 00/28] vSMMUv3/pSMMUv3 2 stage VFIO integration
Date: Fri, 19 Oct 2018 08:02:04 +0000
Hi Eric,
> From: Auger Eric [mailto:address@hidden]
> Sent: Thursday, October 18, 2018 11:16 PM
>
> Hi Yi,
>
> On 10/18/18 12:30 PM, Liu, Yi L wrote:
> > Hi Eric,
> >
> >> From: Eric Auger [mailto:address@hidden]
> >> Sent: Friday, September 21, 2018 4:18 PM
> >> Subject: [RFC v2 00/28] vSMMUv3/pSMMUv3 2 stage VFIO integration
> >>
> >> Up to now vSMMUv3 has not been integrated with VFIO. VFIO integration
> >> requires programming the physical IOMMU consistently with the guest
> >> mappings. However, as opposed to VT-d, SMMUv3 has no "Caching Mode"
> >> which allows easy trapping of guest mappings.
> >> This means the vSMMUv3 cannot use the same VFIO integration as VT-d.
> >>
> >> However SMMUv3 has 2 translation stages. This was devised with the
> >> virtualization use case in mind, where stage 1 is "owned" by the guest
> >> whereas the host uses stage 2 for VM isolation.
> >>
> >> This series sets up this nested translation stage. It only works if
> >> a physical SMMUv3 is used along with the QEMU vSMMUv3 (in other
> >> words, it does not work if there is a physical SMMUv2).
> >>
> >> The series uses a new kernel user API [1], still under definition.
> >>
> >> - We force the host to use stage 2 instead of stage 1, when we
> >> detect a vSMMUv3 is behind a VFIO device. For a VFIO device
> >> without any virtual IOMMU, we still use stage 1 as many existing
> >> SMMUs expect this behavior.
> >> - We introduce new IOTLB "config" notifiers, used to notify
> >> changes in the config of a given iommu memory region. So now
> >> we have notifiers for IOTLB changes and config changes (see
> >> the sketch after this list).
> >> - vSMMUv3 calls config notifiers when STE (Stream Table Entries)
> >> are updated by the guest.
> >> - We implement a specific UNMAP notifier that conveys guest
> >> IOTLB invalidations to the host.
> >> - We implement a new MAP notifier only used for MSI IOVAs so
> >> that the host can build a nested stage translation for MSI IOVAs.
> >> - As the legacy MAP notifier is not called anymore, we must make
> >> sure stage 2 mappings are set. This is achieved through another
> >> memory listener.
> >> - Physical SMMU faults are reported to the guest via an eventfd
> >> mechanism and reinjected into the guest.
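
To make the notifier split concrete, here is a rough, illustrative sketch
of how the two flavors could be registered on the VFIO side. All the
identifiers below (IOMMU_NOTIFIER_CONFIG_CHANGE and both callbacks) are
placeholders, not necessarily the names the series uses, and
iommu_notifier_init() follows the same prototype as the snippet quoted
further down (newer QEMU trees also take an extra iommu_idx argument):

    /* Illustrative only: register a "config" notifier (guest STE updates)
     * and an UNMAP notifier (guest IOTLB invalidations) on an iommu mr.
     * Context would be hw/vfio/common.c or similar. */
    static void vfio_register_nested_notifiers(VFIOContainer *container,
                                               IOMMUMemoryRegion *iommu_mr)
    {
        IOMMUNotifier *config_n = g_new0(IOMMUNotifier, 1);
        IOMMUNotifier *unmap_n = g_new0(IOMMUNotifier, 1);

        /* Guest rewrote an STE: propagate the new stage-1 config down
         * to the physical SMMUv3. */
        iommu_notifier_init(config_n, vfio_iommu_nested_config_notify,
                            IOMMU_NOTIFIER_CONFIG_CHANGE, 0, HWADDR_MAX);
        memory_region_register_iommu_notifier(MEMORY_REGION(iommu_mr),
                                              config_n);

        /* Guest invalidated IOTLB entries: pass the invalidation down so
         * the host TLBs stay consistent with the guest stage-1 tables. */
        iommu_notifier_init(unmap_n, vfio_iommu_nested_unmap_notify,
                            IOMMU_NOTIFIER_UNMAP, 0, HWADDR_MAX);
        memory_region_register_iommu_notifier(MEMORY_REGION(iommu_mr),
                                              unmap_n);
    }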
> >>
> >> Note: some iommu memory notifier rework related patches are close to
> >> those previously published by Peter and Liu. I will be pleased to add
> >> their Signed-off-by if they agree/wish.
> >
> > Yeah, feel free to add mine if it originated from our previous work.
> OK
> >
> > BTW, out of curiosity, are you planning to implement all vIOMMU-related
> > operations through MemoryRegion notifiers? Honestly, I did it that
> > way in an early RFC for the vSVA work. However, we encountered two issues
> > at that time. First, how to check whether the notifier should be registered.
> On my side I think I resolved this by querying the iommu mr about the new
> IOMMU_ATTR_VFIO_NESTED IOMMU attribute in vfio_connect_container
So it's detected by checking the iommu mr? I think it is similar to my
early RFC:
https://patchwork.kernel.org/patch/9701003/
+    /* Check if vIOMMU exists */
+    QTAILQ_FOREACH(subregion, &as->root->subregions, subregions_link) {
+        if (memory_region_is_iommu(subregion)) {
+            IOMMUNotifier n1;
+
+            /*
+             * FIXME: the current iommu notifier is actually designed for
+             * IOMMU TLB MAP/UNMAP. However, a vIOMMU emulator may need
+             * notifiers other than MAP/UNMAP, so it would be better to
+             * split the non-IOMMUTLB notifiers from the current IOMMUTLB
+             * notifier framework.
+             */
+            iommu_notifier_init(&n1, vfio_iommu_bind_pasid_tbl_notify,
+                                IOMMU_NOTIFIER_SVM_PASIDT_BIND,
+                                0,
+                                0);
+            vfio_register_notifier(group->container,
+                                   subregion,
+                                   0,
+                                   &n1);
+        }
+    }
For VT-d, we only have 1 iommu mr. But I was told there may be multiple
iommu mrs on other platforms. So I switched to using PCISVAOps. Thoughts?
> See patches 5, 6 and 8.
>
> This tells me whether the nested mode must be set up and chooses the right
> container->iommu_type, which is then used in vfio_listener_region_add()
> to decide whether the specific notifiers must be registered.
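
For reference, the detection step described above could look roughly like
the following in vfio_connect_container(). IOMMU_ATTR_VFIO_NESTED is the
attribute from the RFC; the helper name and the simplified error handling
are made up for illustration:

    /* Illustrative helper: pick the container IOMMU type depending on
     * whether the vIOMMU behind the region requests nested translation. */
    static void vfio_pick_iommu_type(VFIOContainer *container,
                                     MemoryRegion *mr)
    {
        bool nested = false;

        if (memory_region_is_iommu(mr)) {
            /* IOMMU_ATTR_VFIO_NESTED is introduced by the RFC series */
            memory_region_iommu_get_attr(IOMMU_MEMORY_REGION(mr),
                                         IOMMU_ATTR_VFIO_NESTED, &nested);
        }

        if (nested &&
            ioctl(container->fd, VFIO_CHECK_EXTENSION,
                  VFIO_TYPE1_NESTING_IOMMU)) {
            /* stage 2 owned by the host, stage 1 owned by the guest */
            container->iommu_type = VFIO_TYPE1_NESTING_IOMMU;
        } else {
            container->iommu_type = VFIO_TYPE1v2_IOMMU;
        }
    }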
>
> > Second, there are cases in which vfio_listener_region_add() is not
> > triggered but a vIOMMU exists. Details can be found at the link below:
> > (http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg00078.html)
>
> Yes, I remember this, due to the PT=1 case. On ARM we don't have this
> specificity, hence this integration.
> >
> > Thus, we had some discussions in the community. Last time, PCIPASIDOps
> > was proposed. It adds callbacks in PCIDevice; VFIO would register
> > its implementations in vfio_realize(). Supposedly, pasid_table_bind,
> > page_table_bind, sva_tlb_invalidation_passdown and other vIOMMU-related
> > operations can be done this way. The sample sketch below shows
> > roughly what it could look like. (The full patch is in my sandbox,
> > planned to be sent out with the Scalable Mode emulation patches.)
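
Something like the following, purely illustrative shape; the member names
follow the operations listed above, but the real signatures would be
defined by the actual patch:

    /* Hypothetical shape of PCIPASIDOps; signatures here are guesses
     * made for illustration, not the actual patch. */
    typedef struct PCIPASIDOps {
        int (*pasid_table_bind)(PCIBus *bus, int32_t devfn,
                                uint64_t pasidt_addr, uint32_t size);
        int (*page_table_bind)(PCIBus *bus, int32_t devfn, uint32_t pasid,
                               uint64_t pgtbl_addr);
        int (*sva_tlb_invalidation_passdown)(PCIBus *bus, int32_t devfn,
                                             void *inv_info);
    } PCIPASIDOps;

    /* A device model attaches its implementation to a PCIDevice */
    void pci_setup_pasid_ops(PCIDevice *dev, PCIPASIDOps *ops);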
>
> To be honest, I lost track of the series and did not see this PCIPASIDOps
> proposal. I will study whether it can fit my needs.
It was proposed in the link below. The name at that time was PCISVAOps;
I renamed it to PCIPASIDOps to be more generic.
http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg00081.html
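
For completeness, the VFIO-side registration could then look roughly as
below (again hypothetical, reusing the placeholder names from the sketch
above):

    /* Hypothetical VFIO implementation table and its registration */
    static PCIPASIDOps vfio_pci_pasid_ops = {
        .pasid_table_bind              = vfio_pci_pasid_table_bind,
        .page_table_bind               = vfio_pci_page_table_bind,
        .sva_tlb_invalidation_passdown = vfio_pci_sva_tlb_inv_passdown,
    };

    static void vfio_realize(PCIDevice *pdev, Error **errp)
    {
        /* ... existing VFIO realize steps ... */
        pci_setup_pasid_ops(pdev, &vfio_pci_pasid_ops);
    }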
Regards,
Yi Liu