
Re: [Qemu-ppc] Proper usage of the SPAPR vIOMMU with VFIO?


From: Alexey Kardashevskiy
Subject: Re: [Qemu-ppc] Proper usage of the SPAPR vIOMMU with VFIO?
Date: Tue, 30 Apr 2019 12:12:42 +1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1


On 30/04/2019 07:42, Shawn Anastasio wrote:
> Hello David,
> 
> Thank you very much! Following your advice I created a separate
> spapr-pci-host-bridge device and attached the ivshmem device to that.
> Now I'm able to access the device through VFIO as expected!

Cool. Hosts and guests are quite different in the way they group PCI
devices for VFIO.

> As an aside, I had to use VFIO_SPAPR_TCE_IOMMU rather than
> VFIO_SPAPR_TCE_v2_IOMMU (I'm still not clear on the difference).


VFIO_SPAPR_TCE_IOMMU allows DMA only into the first 1 or 2GB of the PCI
address space, mapped dynamically via the IOMMU. Linux in the guest does
frequent map/unmap calls - say, one per network packet - but you could
also just allocate some memory up to that window size, map it once and
keep using it.

VFIO_SPAPR_TCE_v2_IOMMU maps the entire guest RAM into the PCI address
space (at some high offset == 1<<59) so the guest does not need to keep
talking to the IOMMU once such a mapping is set up.

At the moment VFIO_SPAPR_TCE_v2_IOMMU is only supported on bare metal
(we are planning to add support for it in guests), so since you are
working with a guest, VFIO_SPAPR_TCE_IOMMU is your only choice.
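
In userspace that flow looks roughly like the sketch below (error
handling dropped; it assumes a 'container' fd that already has the group
attached and VFIO_SPAPR_TCE_IOMMU selected, as in the setup sketch
further down the thread): query the 32-bit DMA window with
VFIO_IOMMU_SPAPR_TCE_GET_INFO, then map a buffer once with
VFIO_IOMMU_MAP_DMA and reuse it for all DMA.

      #include <stdint.h>
      #include <sys/ioctl.h>
      #include <sys/mman.h>
      #include <linux/vfio.h>

      /* Assumes: 'container' is an open /dev/vfio/vfio fd with the group
       * attached and VFIO_SPAPR_TCE_IOMMU selected via VFIO_SET_IOMMU. */
      struct vfio_iommu_spapr_tce_info info = { .argsz = sizeof(info) };
      ioctl(container, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
      /* info.dma32_window_start / info.dma32_window_size describe the
       * 1-2GB window mentioned above. */

      size_t len = 1 << 20;                 /* e.g. a 1MB DMA buffer */
      void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

      struct vfio_iommu_type1_dma_map map = {
          .argsz = sizeof(map),
          .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
          .vaddr = (uint64_t)(uintptr_t)buf,
          .iova  = info.dma32_window_start, /* map once, inside the window */
          .size  = len,
      };
      ioctl(container, VFIO_IOMMU_MAP_DMA, &map);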


> After making this change it works as expected.
> 
> I have yet to test hotplugging of additional ivshmem devices (via QMP),
> but as long as I specify the correct spapr-pci-host-bridge bus I see
> no reason why that wouldn't work too.
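
For what it's worth, a hotplug along those lines over QMP would look
something like the following - the memory backend ("hostmem1") is
assumed to exist already and the names are only illustrative, so treat
this as a sketch rather than exact syntax:

      { "execute": "device_add",
        "arguments": { "driver": "ivshmem-plain", "id": "shm1",
                       "memdev": "hostmem1", "bus": "phb.0" } }
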
> 
> Thanks again!
> Shawn
> 
> On 4/16/19 10:05 PM, David Gibson wrote:
>> On Thu, Apr 04, 2019 at 05:15:50PM -0500, Shawn Anastasio wrote:
>>> Hello all,
>>
>> Sorry I've taken so long to reply.  I didn't spot this for a while (I
>> only read the qemu-ppc list irregularly) and then was busy for a while
>> more.
>>
>>> I'm attempting to write a VFIO driver for QEMU's ivshmem shared memory
>>> device on a ppc64 guest. Unfortunately, without using VFIO's unsafe
>>> No-IOMMU mode, I'm unable to properly interface with the device.
>>
>> So, you want to write a device in guest userspace, accessing the
>> device emulated by ivshmem via vfio.  Is that right?
>>
>> I'm assuming your guest is under KVM/qemu rather than being an LPAR
>> under PowerVM.
>>
>>> When booting the guest with the iommu=on kernel parameter,
>>
>> The iommu=on parameter shouldn't make a difference.  PAPR guests
>> *always* have a guest visible IOMMU.
>>
>>> the ivshmem
>>> device can be bound to the vfio_pci kernel module and a group at
>>> /dev/vfio/0 appears. When opening the group and checking its flags
>>> with VFIO_GROUP_GET_STATUS, though, VFIO_GROUP_FLAGS_VIABLE is not
>>> set. Ignoring this and attempting to set the VFIO container's IOMMU
>>> mode to VFIO_SPAPR_TCE_v2_IOMMU fails with EPERM, though I'm not
>>> sure if that's related.
>>
>> Yeah, the group will need to be viable before you can attach it to a
>> container.
>>
>> I'm guessing the reason it's not is that some devices in the guest
>> side IOMMU group are still bound to kernel drivers, rather than VFIO
>> (or simply being unbound).
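
In userspace those checks boil down to something like this (a minimal
sketch, error handling omitted; /dev/vfio/0 is the group number from
earlier in the thread):

      #include <fcntl.h>
      #include <sys/ioctl.h>
      #include <linux/vfio.h>

      int container = open("/dev/vfio/vfio", O_RDWR);
      int group = open("/dev/vfio/0", O_RDWR);

      struct vfio_group_status status = { .argsz = sizeof(status) };
      ioctl(group, VFIO_GROUP_GET_STATUS, &status);
      if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE)) {
          /* Some other device in the group is still bound to a guest
           * kernel driver rather than vfio-pci - fix that before going on. */
      }

      ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
      ioctl(container, VFIO_SET_IOMMU, VFIO_SPAPR_TCE_IOMMU);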
>>
>> Under PAPR, an IOMMU group generally consists of everything under the
>> same virtual PCI Host Bridge (vPHB) - i.e. an entire (guest side) PCI
>> domain.  Or at least, under the qemu implementation of PAPR.  It's not
>> strictly required by PAPR, but it's pretty awkward to do otherwise.
>>
>> So, chances are you have your guest's disk and network on the same,
>> default vPHB, meaning it's in the same (guest) IOMMU group as the
>> ivshmem, which means it can't be safely used by userspace VFIO.
>>
>> However, unlike on x86, making extra vPHBs is very straightforward.
>> Use something like:
>>       -device spapr-pci-host-bridge,index=1,id=phb
>>
>> Then add bus=phb.0 to your ivshmem to put it on the secondary PHB.  It
>> will then be in its own IOMMU group and you should be able to use it
>> in guest userspace.
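
The ivshmem side of that would then look something like the following
(the memdev name and backing path here are only illustrative):

      -object memory-backend-file,id=hostmem1,share=on,mem-path=/dev/shm/ivshmem,size=4M
      -device ivshmem-plain,memdev=hostmem1,bus=phb.0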
>>
>> Note that before you're able to map user memory into the IOMMU, you'll
>> also need to "preregister" it with the ioctl
>> VFIO_IOMMU_SPAPR_REGISTER_MEMORY.  [This is because, for the case of
>> passing a device through to a guest - which always has a vIOMMU,
>> remember - doing the accounting on every VFIO_DMA_MAP can be pretty
>> expensive in a hot path.  The preregistration step lets us preregister
>> all guest memory, handle the accounting then, and allow the actual
>> maps and unmaps to go faster.]
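
With the v2 backend, that preregistration call looks roughly like this
(a sketch; 'buf' and 'len' stand for the user buffer you intend to map
and are assumed to be page-aligned):

      #include <stdint.h>
      #include <sys/ioctl.h>
      #include <linux/vfio.h>

      /* VFIO_SPAPR_TCE_v2_IOMMU only: register the range once so that
       * later VFIO_IOMMU_MAP_DMA / VFIO_IOMMU_UNMAP_DMA calls do not
       * pay the accounting cost on every map. */
      struct vfio_iommu_spapr_register_memory reg = {
          .argsz = sizeof(reg),
          .flags = 0,
          .vaddr = (uint64_t)(uintptr_t)buf,
          .size  = len,
      };
      ioctl(container, VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);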
>>
> 

-- 
Alexey


