From: David Gibson
Subject: Re: [Qemu-ppc] Proper usage of the SPAPR vIOMMU with VFIO?
Date: Wed, 1 May 2019 12:00:58 +1000
User-agent: Mutt/1.11.3 (2019-02-01)

On Tue, Apr 30, 2019 at 12:12:42PM +1000, Alexey Kardashevskiy wrote:
> 
> 
> On 30/04/2019 07:42, Shawn Anastasio wrote:
> > Hello David,
> > 
> > Thank you very much! Following your advice I created a separate
> > spapr-pci-host-bridge device and attached the ivshmem device to that.
> > Now I'm able to access the device through VFIO as expected!
> 
> Cool.  Hosts and guests differ quite a bit in the way they group PCI
> devices for VFIO.
> 
> > As an aside, I had to use VFIO_SPAPR_TCE_IOMMU rather than
> > VFIO_SPAPR_TCE_v2_IOMMU (I'm still not clear on the difference).
> 
> 
> VFIO_SPAPR_TCE_IOMMU allows DMA only to the first 1 or 2GB of the PCI
> address space, mapped dynamically via the IOMMU.  Linux in the guest
> does frequent map/unmap calls (roughly one per network packet, say),
> but you could instead allocate some memory up to that window size, map
> it once and reuse it.
> 
> VFIO_SPAPR_TCE_v2_IOMMU maps the entire guest RAM into the PCI address
> space (at some high offset == 1<<59) so the guest does not need to
> keep talking to the IOMMU once such a mapping is set up.

Well, the above is how qemu uses the v2_IOMMU; the situation with a
userspace driver can be different.  But the v2 IOMMU does include
facilities to allow for larger DMA windows.
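
Roughly, the "map once and reuse it" approach with the v1 backend boils
down to a sketch like this (the group path and buffer size are
placeholders, and all error handling is omitted):

#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/vfio.h>

int main(void)
{
    int container = open("/dev/vfio/vfio", O_RDWR);
    int group = open("/dev/vfio/0", O_RDWR);     /* placeholder group */

    /* Attach the group (it must be viable, see further down) and
     * select the v1 sPAPR backend */
    ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
    ioctl(container, VFIO_SET_IOMMU, VFIO_SPAPR_TCE_IOMMU);

    /* Query the default 32-bit DMA window (the 1-2GB mentioned above) */
    struct vfio_iommu_spapr_tce_info info = { .argsz = sizeof(info) };
    ioctl(container, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);

    /* v1 containers must be enabled explicitly; the locked-memory
     * accounting happens here, so RLIMIT_MEMLOCK matters */
    ioctl(container, VFIO_IOMMU_ENABLE);

    /* Allocate a buffer no bigger than the window and map it once */
    size_t len = 16 << 20;                       /* 16MB, arbitrary */
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct vfio_iommu_type1_dma_map map = {
        .argsz = sizeof(map),
        .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
        .vaddr = (unsigned long)buf,
        .iova  = info.dma32_window_start,
        .size  = len,
    };
    ioctl(container, VFIO_IOMMU_MAP_DMA, &map);

    /* ... program map.iova into the device and reuse buf for DMA ... */
    return 0;
}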

> At the moment VFIO_SPAPR_TCE_v2_IOMMU is only supported on bare metal
> (we are planning to add support for it in guests).  Since you are
> working with a guest, VFIO_SPAPR_TCE_IOMMU is your only choice.

Huh, I hadn't realized that.

> 
> 
> > After making this change it works as expected.
> > 
> > I have yet to test hotplugging of additional ivshmem devices (via QMP),
> > but as long as I specify the correct spapr-pci-host-bridge bus I see
> > no reason why that wouldn't work too.
> > 
> > Thanks again!
> > Shawn
> > 
> > On 4/16/19 10:05 PM, David Gibson wrote:
> >> On Thu, Apr 04, 2019 at 05:15:50PM -0500, Shawn Anastasio wrote:
> >>> Hello all,
> >>
> >> Sorry I've taken so long to reply.  I didn't spot this for a while (I
> >> only read the qemu-ppc list irregularly) and then was busy for a while
> >> more.
> >>
> >>> I'm attempting to write a VFIO driver for QEMU's ivshmem shared memory
> >>> device on a ppc64 guest. Unfortunately, without using VFIO's unsafe
> >>> No-IOMMU mode, I'm unable to properly interface with the device.
> >>
> >> So, you want to write a device in guest userspace, accessing the
> >> device emulated by ivshmem via vfio.  Is that right?
> >>
> >> I'm assuming your guest is under KVM/qemu rather than being an LPAR
> >> under PowerVM.
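
For what it's worth, once the group and container are set up (more on
that below), accessing the device from guest userspace is just a matter
of asking the group for a device fd and mmap()ing the right region.  A
sketch, with the guest PCI address as a placeholder (ivshmem exposes its
shared memory as BAR2):

#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/vfio.h>

/* 'group' is an already attached VFIO group fd; 'pci_addr' is the
 * device's guest-side PCI address, e.g. "0001:00:01.0" */
static void *map_ivshmem_bar2(int group, const char *pci_addr)
{
    int device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, pci_addr);

    struct vfio_region_info reg = {
        .argsz = sizeof(reg),
        .index = VFIO_PCI_BAR2_REGION_INDEX,     /* ivshmem shared memory */
    };
    ioctl(device, VFIO_DEVICE_GET_REGION_INFO, &reg);

    /* vfio-pci exposes mmap()able BARs at reg.offset of the device fd */
    return mmap(NULL, reg.size, PROT_READ | PROT_WRITE,
                MAP_SHARED, device, reg.offset);
}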
> >>
> >>> When booting the guest with the iommu=on kernel parameter,
> >>
> >> The iommu=on parameter shouldn't make a difference.  PAPR guests
> >> *always* have a guest visible IOMMU.
> >>
> >>> the ivshmem
> >>> device can be bound to the vfio_pci kernel module and a group at
> >>> /dev/vfio/0 appears. When opening the group and checking its flags
> >>> with VFIO_GROUP_GET_STATUS, though, VFIO_GROUP_FLAGS_VIABLE is not
> >>> set. Ignoring this and attempting to set the VFIO container's IOMMU
> >>> mode to VFIO_SPAPR_TCE_v2_IOMMU fails with EPERM, though I'm not
> >>> sure if that's related.
> >>
> >> Yeah, the group will need to be viable before you can attach it to a
> >> container.
> >>
> >> I'm guessing the reason it's not is that some devices in the guest
> >> side IOMMU group are still bound to kernel drivers, rather than VFIO
> >> (or simply being unbound).
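
Checking for that from your driver is cheap, by the way -- something
along these lines (just a sketch, with the group number hard-coded as a
placeholder):

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

int main(void)
{
    int container = open("/dev/vfio/vfio", O_RDWR);
    int group = open("/dev/vfio/0", O_RDWR);     /* placeholder group */

    struct vfio_group_status status = { .argsz = sizeof(status) };
    ioctl(group, VFIO_GROUP_GET_STATUS, &status);

    if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE)) {
        /* Some device in the group is still bound to a kernel driver;
         * bind it to vfio-pci (or unbind it) and try again */
        fprintf(stderr, "group not viable\n");
        return 1;
    }

    /* Only now is it safe to attach the group to a container */
    ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
    return 0;
}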
> >>
> >> Under PAPR, an IOMMU group generally consists of everything under the
> >> same virtual PCI Host Bridge (vPHB) - i.e. an entire (guest side) PCI
> >> domain.  Or at least, under the qemu implementation of PAPR.  It's not
> >> strictly required by PAPR, but it's pretty awkward to do otherwise.
> >>
> >> So, chances are you have your guest's disk and network on the same,
> >> default vPHB, meaning it's in the same (guest) IOMMU group as the
> >> ivshmem, which means it can't be safely used by userspace VFIO.
> >>
> >> However, unlike on x86, making extra vPHBs is very straightforward.
> >> Use something like:
> >>       -device spapr-pci-host-bridge,index=1,id=phb
> >>
> >> Then add bus=phb.0 to your ivshmem to put it on the secondary PHB.  It
> >> will then be in its own IOMMU group and you should be able to use it
> >> in guest userspace.
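
Putting the two together, the relevant part of the command line ends up
looking something like this (ivshmem-plain and the memory backend
name/path/size are just examples):

      -object memory-backend-file,id=shmem0,share=on,mem-path=/dev/shm/ivshmem,size=16M \
      -device spapr-pci-host-bridge,index=1,id=phb \
      -device ivshmem-plain,memdev=shmem0,bus=phb.0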
> >>
> >> Note that before you're able to map user memory into the IOMMU, you'll
> >> also need to "preregister" it with the ioctl
> >> VFIO_IOMMU_SPAPR_REGISTER_MEMORY.  [This is because, for the case of
> >> passing a device through to a guest - which always has a vIOMMU,
> >> remember - doing accounting on every VFIO_DMA_MAP can be pretty
> >> expensive in a hot path.  The preregistration step lets us register
> >> all guest memory up front, handle the accounting then, and let the
> >> actual maps and unmaps go faster.]
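
In code that step is one extra ioctl in front of the map -- a sketch,
assuming a container already set up with VFIO_SPAPR_TCE_v2_IOMMU, with
the buffer and iova supplied by the caller:

#include <sys/ioctl.h>
#include <linux/vfio.h>

/* 'buf'/'len' must be page aligned; pinning and accounting happen in
 * the preregistration step, once */
static int dma_map_preregistered(int container, void *buf,
                                 unsigned long len, unsigned long iova)
{
    struct vfio_iommu_spapr_register_memory reg = {
        .argsz = sizeof(reg),
        .vaddr = (unsigned long)buf,
        .size  = len,
    };
    if (ioctl(container, VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg))
        return -1;

    /* With the memory preregistered, maps and unmaps are cheap */
    struct vfio_iommu_type1_dma_map map = {
        .argsz = sizeof(map),
        .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
        .vaddr = (unsigned long)buf,
        .iova  = iova,
        .size  = len,
    };
    return ioctl(container, VFIO_IOMMU_MAP_DMA, &map);
}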
> >>
> > 
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson


