qemu-devel

Re: [PATCH 0/4] Allow to pass pre-created VFIO container/group to QEMU


From: Andrey Ryabinin
Subject: Re: [PATCH 0/4] Allow to pass pre-created VFIO container/group to QEMU
Date: Wed, 26 Oct 2022 15:07:32 +0300


On 10/17/22 18:21, Alex Williamson wrote:
> On Mon, 17 Oct 2022 13:54:03 +0300
> Andrey Ryabinin <arbn@yandex-team.com> wrote:
> 
>> These patches add the possibility to pass a VFIO device to QEMU using file
>> descriptors of a VFIO container/group, instead of having QEMU create them.
>> This allows taking away permissions to open /dev/vfio/* from QEMU and
>> delegating that to a management layer like libvirt.
>>
>> The VFIO API doesn't allow passing just the fd of a device, since we also
>> need to have the VFIO container and group. So these patches allow passing a
>> pre-created VFIO container/group to QEMU via command line/QMP, e.g. like this:
>>             -object vfio-container,id=ct,fd=5 \
>>             -object vfio-group,id=grp,fd=6,container=ct \
>>             -device vfio-pci,host=05:00.0,group=grp
> 
> This suggests that management tools need to become intimately familiar
> with container and group association restrictions for implicit
> dependencies, such as device AddressSpace.  We had considered this
> before and intentionally chosen to allow QEMU to manage that
> relationship.  Things like PCI bus type and presence of a vIOMMU factor
> into these relationships.
> 

This is already the case. These patches don't change much.
QEMU doesn't allow adding devices from one group to several address spaces.
So the management tool already needs to know whether devices are in the same
group and whether QEMU will create separate address spaces for these devices.

E.g.
qemu-system-x86_64 -nodefaults -M q35,accel=kvm,kernel-irqchip=split \
        -device intel-iommu,intremap=on,caching-mode=on \
        -device vfio-pci,host=00:1f.3 \
        -device vfio-pci,host=00:1f.4 
qemu-system-x86_64: -device vfio-pci,host=00:1f.4: vfio 0000:00:1f.4: group 14 used in multiple address spaces
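
For illustration, group membership can be read from sysfs before QEMU is
started. A minimal sketch, assuming the usual
/sys/bus/pci/devices/<BDF>/iommu_group symlink; the function names and the
sysfs_root parameter are made up for this example and are not part of the
patches:

```python
import os
from collections import defaultdict


def iommu_group(bdf, sysfs_root="/sys"):
    """Return the IOMMU group number of a PCI device (full BDF, e.g. 0000:00:1f.3).

    Hypothetical helper: sysfs_root exists only so the lookup can be
    pointed at a fake tree for testing.
    """
    link = os.path.join(sysfs_root, "bus/pci/devices", bdf, "iommu_group")
    # iommu_group is a symlink whose last path component is the group number.
    return int(os.path.basename(os.readlink(link)))


def shared_groups(bdfs, sysfs_root="/sys"):
    """Map group number -> list of devices, keeping only groups with 2+ devices."""
    groups = defaultdict(list)
    for bdf in bdfs:
        groups[iommu_group(bdf, sysfs_root)].append(bdf)
    return {grp: devs for grp, devs in groups.items() if len(devs) > 1}
```

Any group returned by shared_groups() holds devices that cannot be placed
in different address spaces, which is the situation the error above reports.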

> In the above example, what happens in a mixed environment, for example
> if we then add '-device vfio-pci,host=06:00.0' to the command line?
> Isn't QEMU still going to try to re-use the container if it exists in
> the same address space? Potentially this device could also be a member
> of the same group.  How would the management tool know when to expect
> the provided fds be released?
> 

Valid point, the container will indeed be reused and the second device will
occupy it. But we could create a new container instead. Using several
containers in one address space won't be a problem, right?
Of course, devices from the same group won't be allowed to be added in a mixed
way, i.e. some through a passed group fd and some opened by QEMU itself.
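
A management tool could check for that mixed case itself before launching
QEMU. A rough sketch of such a check, not something the patches implement;
mixed_groups() and its arguments are hypothetical names:

```python
def mixed_groups(device_groups, fd_passed):
    """Return sorted group numbers that are used in a mixed way.

    device_groups: dict mapping a device BDF to its IOMMU group number.
    fd_passed: set of BDFs that are attached through a pre-created group fd.
    A group is "mixed" when it has both fd-passed devices and devices
    that QEMU would open on its own.
    """
    with_fd = {grp for dev, grp in device_groups.items() if dev in fd_passed}
    without_fd = {grp for dev, grp in device_groups.items() if dev not in fd_passed}
    return sorted(with_fd & without_fd)
```

An empty result means every group is handled consistently, either entirely
through passed fds or entirely by QEMU.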


> We also have an outstanding RFC for iommufd that already proposes an fd
> passing interface, where iommufd removes many of the issues of the vfio
> container by supporting multiple address spaces within a single fd
> context, avoiding the duplicate locked page accounting issues between
> containers, and proposing a direct device fd interface for vfio.  Why at
> this point in time would we choose to expand the QEMU vfio interface in
> this way?  Thanks,
> 

It sounds nice, but iommufd is a new API which doesn't exist in any kernel yet.
These patches are something that can be used on existing, already deployed
kernels.



