qemu-devel

Re: [PATCH 0/7] Enable shared device assignment


From: Jason Gunthorpe
Subject: Re: [PATCH 0/7] Enable shared device assignment
Date: Fri, 10 Jan 2025 10:14:01 -0400

On Fri, Jan 10, 2025 at 02:45:39PM +0100, David Hildenbrand wrote:
> 
> In your commit I read:
> 
> "Implement the cut operation to be hitless, changes to the page table
> during cutting must cause zero disruption to any ongoing DMA. This is the
> expectation of the VFIO type 1 uAPI. Hitless requires HW support, it is
> incompatible with HW requiring break-before-make."
> 
> So I guess that would mean that, depending on HW support, one could avoid
> disabling large pages to still allow for atomic cuts / partial unmaps that
> don't affect concurrent DMA.

Yes. Most x86 server HW will do this, though ARM support is a bit newish.

> What would be your suggestion here to avoid the "map each 4k page
> individually so we can unmap it individually" ? I didn't completely grasp
> that, sorry.

Map in large ranges in the VMM, let's say 1G of shared memory as a
single mapping (called an iommufd area).

When the guest makes a 2M chunk of it private you do an ioctl to
iommufd to split the area into three, leaving the 2M chunk as a
separate area.

The new iommufd ioctl to split areas will go down into the iommu driver
and atomically cut the 1G PTEs into smaller PTEs as necessary so that
no PTE spans the edges of the 2M area.
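
To make the cut concrete, here is a toy model in Python. It is only a
sketch: cut_ptes() and the (start, size) tuples are made-up names for
illustration, not iommufd or iommu driver code, and it assumes
x86-style 1G/2M/4k page sizes. It shows which PTEs end up in the
table after a cut. It says nothing about how the HW swaps them
atomically, which is the part that makes the real operation hitless.

```python
SZ_4K = 4 << 10
SZ_2M = 2 << 20
SZ_1G = 1 << 30
LEVELS = [SZ_1G, SZ_2M, SZ_4K]  # assumed page sizes, largest first


def spans_edge(start, size, edge):
    """True if edge falls strictly inside the PTE [start, start + size)."""
    return start < edge < start + size


def cut_ptes(ptes, lo, hi):
    """Return a sorted PTE list in which no PTE spans lo or hi.

    ptes is a list of (start, size) tuples. Any PTE that spans an edge
    is replaced by PTEs of the next smaller size, repeating until no
    PTE spans either edge.
    """
    if lo % SZ_4K or hi % SZ_4K:
        raise ValueError("edges must be 4k aligned")
    out = []
    work = list(ptes)
    while work:
        start, size = work.pop()
        if spans_edge(start, size, lo) or spans_edge(start, size, hi):
            smaller = LEVELS[LEVELS.index(size) + 1]
            work.extend((s, smaller)
                        for s in range(start, start + size, smaller))
        else:
            out.append((start, size))
    return sorted(out)


# One 1G PTE, guest converts the 2M chunk at offset 8M.
table = cut_ptes([(0, SZ_1G)], 4 * SZ_2M, 5 * SZ_2M)
```

In this model the single 1G PTE becomes 512 2M PTEs, one of which
covers exactly the chunk to be unmapped. If the edges were only 4k
aligned, the 2M PTE containing them would be cut again into 4k PTEs.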

Then userspace can unmap the 2M area and leave the remainder of the 1G
area mapped.

All of this would be fully hitless to ongoing DMA.
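
The area bookkeeping for that sequence can be sketched as below. This
is a toy Python model under stated assumptions: split_area() and
unmap_area() are hypothetical helpers, not the iommufd uAPI (the split
ioctl does not exist yet), and an area is just a half-open
(start, end) byte range.

```python
SZ_2M = 2 << 20
SZ_1G = 1 << 30


def split_area(areas, lo, hi):
    """Split the area containing [lo, hi) so that [lo, hi) is its own area.

    Produces up to three areas: the part before, the chunk itself and
    the part after. Empty pieces are dropped.
    """
    out = []
    found = False
    for start, end in areas:
        if not found and start <= lo < hi <= end:
            found = True
            pieces = ((start, lo), (lo, hi), (hi, end))
            out.extend(p for p in pieces if p[0] < p[1])
        else:
            out.append((start, end))
    if not found:
        raise ValueError("no single area contains the requested range")
    return out


def unmap_area(areas, lo, hi):
    """Unmap exactly one whole area; partial unmaps are rejected."""
    if (lo, hi) not in areas:
        raise ValueError("range is not a whole area, split it first")
    return [a for a in areas if a != (lo, hi)]


# 1G shared mapping, guest makes the 2M chunk at offset 8M private.
areas = [(0, SZ_1G)]
lo, hi = 4 * SZ_2M, 5 * SZ_2M
areas = split_area(areas, lo, hi)   # three areas
areas = unmap_area(areas, lo, hi)   # two areas remain mapped
```

The point of the model is that unmap only ever acts on whole areas,
so the split has to come first, and everything outside the 2M chunk
stays mapped the whole time.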

The iommufd code is there to do this assuming the areas are mapped at
4k. What is missing is the iommu driver side to atomically resize
large PTEs.

> From "IIRC you can only trigger split using the VFIO type 1 legacy API. We
> would need to formalize split as an IOMMUFD native ioctl.
> Nobody should use this stuff through the legacy type 1 API!!!!"
> 
> I assume you mean that we can only avoid the 4k map/unmap if we add proper
> support to IOMMUFD native ioctl, and not try making it fly somehow with the
> legacy type 1 API?

The thread was talking about the built-in support in iommufd to split
mappings. That built-in support is only accessible through legacy APIs
and should never be used in new qemu code. To use that built-in
support in new code we need to build new APIs. The advantage of the
built-in support is qemu can map in large regions (which is more
efficient) and the kernel will break it down to 4k for the iommu
driver.

Mapping 4k at a time through the uAPI would be outrageously
inefficient.
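
For a rough sense of scale, counting calls for the 1G example
(map_calls() is a made-up helper, and this counts calls only, not
their individual cost):

```python
SZ_4K = 4 << 10
SZ_1G = 1 << 30


def map_calls(total, chunk):
    """Number of map calls needed to cover total bytes in chunk-sized pieces."""
    return -(-total // chunk)  # ceiling division


calls_4k = map_calls(SZ_1G, SZ_4K)   # 262144 calls at 4k granularity
calls_1g = map_calls(SZ_1G, SZ_1G)   # 1 call for a single 1G area
```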

Jason


